How to fix: Corrupt special characters in ZIPs on Linux


When you receive a ZIP file from a Windows user, umlauts and other non-ASCII characters in filenames may look corrupt in the archive listing, and will probably still be corrupt after you extract the ZIP file.

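The reason is encoding: Filenames in such archives are probably stored in code page 850 (CP850), the legacy DOS code page that Western European Windows installations still use as their OEM code page. I am serious, 1987 is calling.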
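To see what this means at the byte level, here is a small sketch. It assumes iconv is installed; the helper name cp850_to_utf8 and the sample filename are made up for illustration. In CP850, "ü" is the single byte 0x81 and "ß" is 0xE1, which is not valid UTF-8, so a UTF-8 system cannot display the name as stored.

```shell
# Hypothetical helper: read CP850 bytes on stdin, write UTF-8 to stdout.
cp850_to_utf8() {
  iconv -f CP850 -t UTF-8
}

# "Grüße.txt" as a CP850 archiver would store it:
# \201 is octal for 0x81 (ü), \341 is octal for 0xE1 (ß).
printf 'Gr\201\341e.txt\n' | cp850_to_utf8
# prints: Grüße.txt
```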

Fortunately, the unzip command can handle such files like so:

unzip -O CP850 file.zip
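If you are unsure whether CP850 is the right guess, it may help to list the archive first and only then extract it into its own directory. The following is a sketch, not a definitive recipe: the function names are hypothetical, and it assumes your unzip build supports the -O option (it comes from a patch that distributions such as Debian and Ubuntu ship; other builds may lack it).

```shell
# Hypothetical helpers; both assume an unzip build that supports -O.

# List the archive so the decoded filenames can be checked before extracting.
list_cp850() {
  unzip -O CP850 -l "$1"
}

# Extract into a separate directory ($2) instead of the current one.
extract_cp850() {
  unzip -O CP850 "$1" -d "$2"
}
```

For example, run list_cp850 file.zip and, if the names look right, extract_cp850 file.zip extracted. If the names still look wrong, the archive may use a different code page.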

Interestingly enough, Rubyzip also encodes filenames that way when it creates archives. Probably so the files look alright to Windows users.

Arne Hartherz
License
Source code in this card is licensed under the MIT License.
Posted by Arne Hartherz to makandra dev (2018-07-18 17:04)