Hmm, maybe on Mac, fd should convert the pattern to NFD before matching. I don't know how difficult that would be, though, or whether there would be any bad interactions with regex. And then what do you do if you want to match against files that aren't in NFD (say, from a disk that isn't a native Mac file system)?
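One way around the "what if the files aren't in NFD" question is to normalize both the pattern and the file name to the same form before comparing. Here is a minimal sketch of that idea in Python, using a plain substring match. The helper names `nfd` and `matches` are made up for illustration, and this is not how fd works today. Normalizing a real regex is harder than this, since a quantifier such as `é+` would apply only to the combining accent once the pattern is decomposed.

```python
import unicodedata


def nfd(text: str) -> str:
    """Return text in Unicode Normalization Form D (decomposed)."""
    return unicodedata.normalize("NFD", text)


def matches(pattern: str, file_name: str) -> bool:
    """Substring match after bringing both sides to the same form.

    Hypothetical helper: a plain substring test, not a regex match.
    """
    return nfd(pattern) in nfd(file_name)


composed_pattern = "caf\u00e9"        # "é" typed as one code point
decomposed_name = "cafe\u0301.txt"    # name as stored in NFD
composed_name = "caf\u00e9.txt"       # name from a non-normalizing file system

# A naive comparison misses the decomposed name.
print(composed_pattern in decomposed_name)         # False
# Normalizing both sides finds the file in either form.
print(matches(composed_pattern, decomposed_name))  # True
print(matches(composed_pattern, composed_name))    # True
```

Because both sides are normalized, it no longer matters which form the disk happens to use.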
-
@tmccombs Indeed, just because "fd" is running on a Mac, you can't assume that the file system always uses NFD. It "generally" does, as I wrote in my original message, but there are exceptions:
"Apple’s old HFS+ file system stores file names in a variant of NFD (the newer APFS doesn’t enforce that anymore), and the macOS Finder still decomposes file names."
It's a complex problem, which is why I opened a discussion rather than a ticket. One possible solution might be to add an optional flag to "fd" for ignoring accents. This flag would match "café" as equivalent to "cafe", and "français" as equivalent to "francais".
@tavianator "find" exhibits the same behaviour as "fd". I use "fd" exclusively nowadays, but yeah, I could've opened the discussion in "find" instead. A StackExchange answer I found recommends using "iconv" inside "find":
It's a mess!
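To make the "ignore accents" idea concrete, here is a rough Python sketch: decompose to NFD, drop the combining marks, then compare. The function names are hypothetical, and no such flag exists in fd as far as this discussion goes. It only shows what the matching rule could look like.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Decompose to NFD, then drop every combining mark."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches_ignoring_accents(pattern: str, file_name: str) -> bool:
    """Hypothetical accent-insensitive substring match."""
    return strip_accents(pattern) in strip_accents(file_name)


print(strip_accents("caf\u00e9"))        # cafe
print(strip_accents("fran\u00e7ais"))    # francais
# Works whether the file name is stored composed or decomposed.
print(matches_ignoring_accents("cafe", "cafe\u0301.txt"))  # True
print(matches_ignoring_accents("cafe", "caf\u00e9.txt"))   # True
```

The trade-off is that this also makes "cafe" match files the user may consider different, so it only makes sense as an opt-in.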
-
Mac computers generally store file names using a variant of "Normalization Form D (NFD)". This normalization form decomposes accented letters: the letter "é" (\U00E9) becomes "e" followed by a combining acute accent (e\U0301). So, when the user names a file "café" in Finder by typing "é" as a single character, the letter "é" is stored internally as 2 code points, although the user generally never sees this. Because "fd" matches against the internal representation rather than what the user sees, typing:
fd café
or
fd caf$'\U00E9'
doesn't get any matches. The only way for the user to get matches is to decompose the accented letters using "Normalization Form D (NFD)" logic, and type:
fd cafe$'\U0301'
which is non-trivial even for developers.
This issue affects Mac users of "fd" whose file names contain accented letters, as is common in languages like French. So, I guess the pool of users is limited. Still, it's worth knowing.
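For anyone who wants to see the two representations side by side, here is a small Python sketch (not related to how fd is implemented) that decomposes "café" and prints its code points:

```python
import unicodedata

composed = "caf\u00e9"                                 # "é" as one code point
decomposed = unicodedata.normalize("NFD", composed)    # "e" + combining accent

# Both render as "café", but they are different strings.
print(composed == decomposed)          # False
print(len(composed), len(decomposed))  # 4 5
print([f"U+{ord(ch):04X}" for ch in decomposed])
# ['U+0063', 'U+0061', 'U+0066', 'U+0065', 'U+0301']
```

The last two code points are the plain "e" and the combining acute accent, which is exactly what the third command above spells out by hand.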