How to fix: Corrupt special characters in ZIPs on Linux

Updated . Posted . Visible to the public.

When you receive a ZIP file from a Windows user, umlauts and other non-latin1 characters in filenames may look corrupt, and probably will be corrupt when extracting the ZIP file.

The reason is encoding: Such archives are probably using Codepage 850. I am serious, 1987 is calling.

Fortunately, the unzip command can handle such files like so:

unzip -O CP850 file.zip

Interestingly enough, Rubyzip also compresses files that way. Probably so files look alright to Windows users.

Arne Hartherz
Last edit
Arne Hartherz
License
Source code in this card is licensed under the MIT License.
Posted by Arne Hartherz to makandra dev (2018-07-18 17:04)