Use the encoding for the current OEM code page when reading zips #68

Merged 1 commit on Feb 21, 2021

Xanfre
Contributor

@Xanfre Xanfre commented Feb 21, 2021

Currently, when zipped FM archives are extracted, some non-ASCII characters in filenames (notably German ones) are decoded incorrectly. The extracted files end up with wrong names, which can cause problems with resource loading. For example, the recently released Sinister Night includes an object model named "Rüstung.bin", whose filename contains the non-ASCII 'ü' (a lowercase 'u' with an umlaut).

This pull request changes the zip extraction procedures to read zip entries using the encoding for the current OEM code page instead of the current ANSI code page, which is what a ZipArchive uses by default if no other encoding is specified. This matches the behavior of 7-Zip and, consequently, of other modern FM loaders. The OEM encoding should only be used when reading zip archives, not when writing them: when reading, a ZipArchive applies the specified encoding only to entries whose language encoding flag in the general purpose bit flag of the file header is not set, whereas when writing it applies the specified encoding to every entry.
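As a rough, language-neutral illustration of the reading rule described above (not the code in this pull request, which uses .NET's ZipArchive), here is a minimal Python sketch. The helper `decode_entry_name` is hypothetical, and `cp850` and `cp1252` are assumed stand-ins for the OEM and ANSI code pages of a Western European Windows system.

```python
# Bit 11 of the general purpose bit flag: the "language encoding flag".
# When it is set, the entry name is stored as UTF-8.
UTF8_FLAG = 0x0800


def decode_entry_name(raw: bytes, flags: int, fallback: str) -> str:
    """Decode a raw zip entry name (hypothetical helper).

    The fallback encoding is consulted only when the language
    encoding flag is NOT set, mirroring the reading rule above.
    """
    if flags & UTF8_FLAG:
        return raw.decode("utf-8")
    return raw.decode(fallback, errors="replace")


# Assumption: the archive was created on a system whose OEM code page
# is 850, so 'ü' was stored as the single byte 0x81.
raw_name = "Rüstung.bin".encode("cp850")

# Decoding with the OEM code page recovers the intended name.
oem_name = decode_entry_name(raw_name, 0, "cp850")

# Decoding with the ANSI code page does not: Python's cp1252 codec
# treats 0x81 as undefined, so a replacement character appears here.
ansi_name = decode_entry_name(raw_name, 0, "cp1252")

# An entry with the flag set is decoded as UTF-8 regardless of fallback.
utf8_name = decode_entry_name("Rüstung.bin".encode("utf-8"), UTF8_FLAG, "cp1252")

print(oem_name, ansi_name, utf8_name)
```

The exact wrong character produced by an ANSI decode may differ between platforms and libraries; the point is only that byte 0x81 does not map to 'ü' outside the OEM code page.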

@FenPhoenix
Owner

You're right, the current code gets it wrong. Your solution works for Sinister Night on my end. Although my initial thought was that using the current culture's encoding is a really bad idea (results will be different on different users' computers if their culture is different), some quick research shows this seems to be what all zip programs do, presumably because it's the best that can be done what with zip-creating programs often writing "code page 437" as "code page whatever we got lying around". So I'll merge it. Thanks for the catch.

@FenPhoenix FenPhoenix closed this Feb 21, 2021
@FenPhoenix FenPhoenix reopened this Feb 21, 2021
@FenPhoenix FenPhoenix merged commit cf8dd2f into FenPhoenix:master Feb 21, 2021