Read more

How to fix: Corrupt special characters in ZIPs on Linux

Arne Hartherz
July 18, 2018Software engineer at makandra GmbH

When you receive a ZIP file from a Windows user, umlauts and other non-latin1 characters in filenames may look corrupt, and probably will be corrupt when extracting the ZIP file.

Illustration UI/UX Design

UI/UX Design by makandra brand

We make sure that your target audience has the best possible experience with your digital product. You get:

  • Design tailored to your audience
  • Proven processes customized to your needs
  • An expert team of experienced designers
Read more Show archive.org snapshot

The reason is encoding: Such archives are probably using Codepage 850. I am serious, 1987 is calling.

Fortunately, the unzip command can handle such files like so:

unzip -O CP850 file.zip

Interestingly enough, Rubyzip also compresses files that way. Probably so files look alright to Windows users.

Posted by Arne Hartherz to makandra dev (2018-07-18 19:04)