Use the encoding for the current OEM code page when reading zips #68

Xanfre · 2021-02-21T09:58:49Z

Currently, when zipped FM archives are extracted, some non-ASCII characters in filenames (notably German ones) are decoded incorrectly, so the extracted files end up with the wrong names, which can cause problems with resource loading. For example, see the recently released Sinister Night, which has the non-ASCII 'ü' (a lowercase 'u' with an umlaut) in the filename of one of its included object models, "Rüstung.bin".

This pull request changes the zip extraction procedures to read zip entries using the encoding for the current OEM code page instead of the current ANSI code page, which is the default for a ZipArchive if no other encoding is specified. This change matches the behavior of 7-Zip and, consequently, other modern FM loaders. However, this should only be done when reading zip archives, not when writing them: when reading, a ZipArchive uses the specified encoding only for entries whose language encoding flag (in the general purpose bit flag of the file header) is not set, whereas when writing it always uses the specified encoding.
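
To illustrate the reading rule described above, here is a minimal Python sketch. It is not the code from this pull request; it only models the decision in isolation. The helper name `decode_entry_name` is hypothetical, and code page 850 is assumed as the OEM code page of the machine that created the archive (with Windows-1252 as the corresponding ANSI page).

```python
# Bit 11 of the general purpose bit flag is the "language encoding flag":
# when it is set, the entry name is stored as UTF-8.
LANGUAGE_ENCODING_FLAG = 0x0800


def decode_entry_name(raw_name: bytes, flag_bits: int, fallback_encoding: str) -> str:
    """Hypothetical helper: decode a zip entry name from its raw bytes.

    The fallback encoding is consulted only when the language encoding
    flag is NOT set, mirroring the read behavior described above.
    """
    if flag_bits & LANGUAGE_ENCODING_FLAG:
        return raw_name.decode("utf-8")
    # errors="replace" keeps the sketch from raising on unmapped bytes.
    return raw_name.decode(fallback_encoding, errors="replace")


# Assumption: the archive was written with OEM code page 850,
# where 'ü' is stored as the single byte 0x81.
raw = "Rüstung.bin".encode("cp850")

oem_name = decode_entry_name(raw, 0, "cp850")    # OEM page: name round-trips
ansi_name = decode_entry_name(raw, 0, "cp1252")  # ANSI page: 'ü' is lost

# An entry with the flag set is decoded as UTF-8 regardless of the fallback.
utf8_name = decode_entry_name("Rüstung.bin".encode("utf-8"),
                              LANGUAGE_ENCODING_FLAG, "cp1252")
```

Under these assumptions, only the OEM decoding and the flagged UTF-8 entry recover the original filename; decoding the same bytes with the ANSI page produces a different name, which is the kind of mismatch reported for Sinister Night.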

FenPhoenix · 2021-02-21T20:12:13Z

You're right, the current code gets it wrong. Your solution works for Sinister Night on my end. Although my initial thought was that using the current culture's encoding is a really bad idea (results will be different on different users' computers if their culture is different), some quick research shows this seems to be what all zip programs do, presumably because it's the best that can be done what with zip-creating programs often writing "code page 437" as "code page whatever we got lying around". So I'll merge it. Thanks for the catch.

Use the encoding for the OEM CP when reading zips

fa83716

FenPhoenix closed this Feb 21, 2021

FenPhoenix reopened this Feb 21, 2021

FenPhoenix merged commit cf8dd2f into FenPhoenix:master Feb 21, 2021
