mattermost/server/channels/utils/unicode_test.go
135yshr 314ed3756a
Fix import failures for Japanese filenames with dakuten on macOS (#35204)
* 🐛 fix: normalize Unicode filenames in import attachment lookup

Fix import failures for files with Japanese dakuten/handakuten characters
(e.g., ガ, パ, べ) on macOS.

macOS commonly produces filenames in NFD (decomposed) form, while
filenames created on Linux and Windows are usually in NFC (composed)
form. Attachment lookups failed when the zip entry names and the JSONL
paths used different normalization forms.
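
The mismatch can be reproduced with the standard library alone. The
following is a minimal sketch, not code from this change; the
`attachments` map, the `lookup` helper, and the file name are
hypothetical. It shows that the composed and decomposed spellings of ガ
are different Go strings, so a map keyed by one form misses a lookup in
the other.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// Two spellings of the same visible name "ガ.jpg".
const (
	nfcName = "\u30AC.jpg"       // precomposed ガ
	nfdName = "\u30AB\u3099.jpg" // カ followed by the combining dakuten
)

// lookup reports whether key is present in the hypothetical attachment map.
func lookup(attachments map[string]bool, key string) bool {
	_, ok := attachments[key]
	return ok
}

func main() {
	// Map keyed by the NFD spelling, as a zip created on macOS might contain.
	attachments := map[string]bool{nfdName: true}

	fmt.Println(utf8.RuneCountInString(nfcName)) // 5
	fmt.Println(utf8.RuneCountInString(nfdName)) // 6
	fmt.Println(lookup(attachments, nfdName))    // true
	fmt.Println(lookup(attachments, nfcName))    // false
}
```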

Changes:
- Add NormalizeFilename utility function using golang.org/x/text/unicode/norm
- Normalize filenames when building attachment maps from zip files
- Normalize paths when looking up attachments in maps
- Apply fixes to both server (import.go) and mmctl (validate.go)
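
The pattern behind these changes is to normalize on both sides: once
when the map is built and once when it is queried. The sketch below
illustrates that pattern only and is not the code in import.go or
validate.go. `buildIndex`, `find`, and `standIn` are hypothetical names,
and `standIn` is a deliberately tiny placeholder for NormalizeFilename
that composes a single sequence so the example runs without
golang.org/x/text.

```go
package main

import (
	"fmt"
	"strings"
)

// standIn is a hypothetical placeholder for NormalizeFilename. It composes
// only the sequence カ + U+3099 into ガ so that this sketch needs nothing
// beyond the standard library; it is not a real NFC implementation.
var standIn = strings.NewReplacer("\u30AB\u3099", "\u30AC").Replace

// buildIndex keys every archive entry by its normalized name and keeps the
// original name as the value.
func buildIndex(names []string, normalize func(string) string) map[string]string {
	index := make(map[string]string, len(names))
	for _, name := range names {
		index[normalize(name)] = name
	}
	return index
}

// find normalizes the requested path before consulting the index.
func find(index map[string]string, path string, normalize func(string) string) (string, bool) {
	name, ok := index[normalize(path)]
	return name, ok
}

func main() {
	index := buildIndex([]string{"data/\u30AB\u3099.jpg"}, standIn) // NFD entry
	_, ok := find(index, "data/\u30AC.jpg", standIn)                 // NFC request
	fmt.Println(ok)                                                  // true
}
```

Passing the normalizer as a parameter keeps the sketch independent of
any particular normalization library.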

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* avoid duplicating normalizeFilename

* add coverage for Korean filenames
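
Hangul makes a useful second case because its composition is arithmetic
rather than table-driven. As a worked illustration (not part of this
change; `composeSyllable` is a hypothetical helper), the Hangul syllable
composition formula from the Unicode Standard maps a leading consonant,
a vowel, and an optional trailing consonant to one precomposed syllable:

```go
package main

import "fmt"

// Constants from the Unicode Standard's Hangul syllable composition algorithm.
const (
	sBase  = 0xAC00 // first precomposed syllable, 가
	lBase  = 0x1100 // first leading consonant
	vBase  = 0x1161 // first vowel
	tBase  = 0x11A7 // one before the first trailing consonant
	vCount = 21
	tCount = 28
)

// composeSyllable combines conjoining Jamo into one precomposed syllable.
// Pass t = 0 when the syllable has no trailing consonant. Inputs are assumed
// to be in the modern Jamo ranges; no validation is done.
func composeSyllable(l, v, t rune) rune {
	tIndex := rune(0)
	if t != 0 {
		tIndex = t - tBase
	}
	return sBase + ((l-lBase)*vCount+(v-vBase))*tCount + tIndex
}

func main() {
	fmt.Printf("%U\n", composeSyllable(0x1100, 0x1161, 0))      // U+AC00 (가)
	fmt.Printf("%U\n", composeSyllable(0x1112, 0x1161, 0x11AB)) // U+D55C (한)
	fmt.Printf("%U\n", composeSyllable(0x1100, 0x1173, 0x11AF)) // U+AE00 (글)
}
```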

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Jesse Hallam <jesse@mattermost.com>
Co-authored-by: Mattermost Build <build@mattermost.com>
2026-03-18 12:16:55 +00:00

// Copyright (c) 2015-present Mattermost, Inc. All Rights Reserved.
// See LICENSE.txt for license information.

package utils

import (
	"testing"

	"github.com/stretchr/testify/assert"
)

func TestNormalizeFilename(t *testing.T) {
	tests := []struct {
		name     string
		input    string
		expected string
	}{
		{
			name:     "ASCII only",
			input:    "test.jpg",
			expected: "test.jpg",
		},
		{
			name:     "Japanese katakana dakuten NFC",
			input:    "\u30AC", // ガ (NFC)
			expected: "\u30AC",
		},
		{
			name:     "Japanese katakana dakuten NFD",
			input:    "\u30AB\u3099", // カ + combining dakuten → ガ
			expected: "\u30AC",
		},
		{
			name:     "Japanese katakana handakuten NFC",
			input:    "\u30D1", // パ (NFC)
			expected: "\u30D1",
		},
		{
			name:     "Japanese katakana handakuten NFD",
			input:    "\u30CF\u309A", // ハ + combining handakuten → パ
			expected: "\u30D1",
		},
		{
			name:     "Japanese hiragana dakuten NFC",
			input:    "\u3079", // べ (NFC)
			expected: "\u3079",
		},
		{
			name:     "Japanese hiragana dakuten NFD",
			input:    "\u3078\u3099", // へ + combining dakuten → べ
			expected: "\u3079",
		},
		{
			name:     "Mixed path with NFD",
			input:    "data/\u30AB\u3099test.jpg", // data/カ゛test.jpg
			expected: "data/\u30ACtest.jpg",       // data/ガtest.jpg
		},
		{
			name:     "Complex Japanese filename NFD",
			input:    "\u304B\u3099\u304D\u3099\u304F\u3099", // か゛き゛く゛ (NFD: each base kana + combining dakuten)
			expected: "\u304C\u304E\u3050",                   // がぎぐ (NFC)
		},
		{
			name:     "Path with multiple NFD characters",
			input:    "data/\u30D5\u309A\u30ED\u30B7\u3099\u30A7\u30AF\u30C8.png", // data/プロジェクト.png (NFD)
			expected: "data/\u30D7\u30ED\u30B8\u30A7\u30AF\u30C8.png",             // data/プロジェクト.png (NFC)
		},
		{
			name:     "Empty string",
			input:    "",
			expected: "",
		},
		{
			name:     "Already NFC normalized",
			input:    "ファイル名.txt",
			expected: "ファイル名.txt",
		},
		{
			name:     "Korean Hangul NFC",
			input:    "\uAC00", // 가 (NFC precomposed)
			expected: "\uAC00",
		},
		{
			name:     "Korean Hangul NFD",
			input:    "\u1100\u1161", // ᄀ + ᅡ (NFD Jamo) → 가
			expected: "\uAC00",
		},
		{
			name:     "Korean word NFD",
			input:    "\u1112\u1161\u11AB\u1100\u1173\u11AF", // 한글 (NFD Jamo)
			expected: "\uD55C\uAE00",                         // 한글 (NFC)
		},
		{
			name:     "Korean filename with path NFD",
			input:    "data/\u1111\u1161\u110B\u1175\u11AF.txt", // 파일 (NFD) + .txt
			expected: "data/\uD30C\uC77C.txt",                   // 파일.txt (NFC)
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			result := NormalizeFilename(tt.input)
			assert.Equal(t, tt.expected, result)
		})
	}
}

func TestNormalizeFilenameIdempotent(t *testing.T) {
	// NFC normalization should be idempotent
	inputs := []string{
		"test.jpg",
		"\u30AC",       // ガ (NFC)
		"\u30AB\u3099", // カ + combining dakuten (NFD)
		"data/テスト.jpg",
		"\uD55C\uAE00",                         // 한글 (NFC)
		"\u1112\u1161\u11AB\u1100\u1173\u11AF", // 한글 (NFD Jamo)
		"",
	}

	for _, input := range inputs {
		first := NormalizeFilename(input)
		second := NormalizeFilename(first)
		assert.Equal(t, first, second, "NormalizeFilename should be idempotent for input: %q", input)
	}
}