Read more

How to fix: Corrupt special characters in ZIPs on Linux

Arne Hartherz
July 18, 2018Software engineer at makandra GmbH

When you receive a ZIP file from a Windows user, umlauts and other non-latin1 characters in filenames may look corrupt, and probably will be corrupt when extracting the ZIP file.

Illustration web development

Do you need DevOps-experts?

Your development team has a full backlog? No time for infrastructure architecture? Our DevOps team is ready to support you!

  • We build reliable cloud solutions with Infrastructure as code
  • We are experts in security, Linux and databases
  • We support your dev team to perform
Read more Show archive.org snapshot

The reason is encoding: Such archives are probably using Codepage 850. I am serious, 1987 is calling.

Fortunately, the unzip command can handle such files like so:

unzip -O CP850 file.zip

Interestingly enough, Rubyzip also compresses files that way. Probably so files look alright to Windows users.

Posted by Arne Hartherz to makandra dev (2018-07-18 19:04)