Parsing ZIP File Uploads

Jul 11, 2021

I recently had to implement ZIP file uploads: the frontend reads the file and sends it to the backend, which writes the bytes to a .zip file and then extracts it. During integration testing, extraction failed with errors saying the ZIP was invalid.

From the error, the written byte stream was clearly wrong.

Current code

I checked the current ZIP read code below.

Frontend

    const reader = new FileReader();
    reader.readAsText(file, 'UTF-8');
    reader.onload = (evt) => {
      setUploadContent(Base64.encode(evt.target.result));
    };

Backend

The backend Go writing code was roughly:

	decContractsSource, err := base64.StdEncoding.DecodeString(h.Req.ContractsSourceBase64)
	if err != nil {
		msg := fmt.Sprintf("base64 DecodeString error %v", err)
		seelog.Errorf(msg)
		h.SetBaseResponse(apiCommon.ErrCodeInternalError, msg)
		return
	}
	// DecodeString already returns []byte, so no conversion is needed.
	err = ioutil.WriteFile("hellooworld.zip", decContractsSource, 0644)
	if err != nil {
		msg := fmt.Sprintf("WriteFile error, AppId{%v} %v", h.Req.AppId, err)
		seelog.Errorf(msg)
		h.SetBaseResponse(apiCommon.ErrCodeInternalError, msg)
		return
	}

The backend just decodes base64 and writes bytes to a ZIP.

So the likely issue is on the frontend read side.

About ZIP

Quoted from Wikipedia:

ZIP is an archive file format that supports lossless data compression.

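In JS, reader.readAsText is meant for text files: it decodes the file's bytes into a string using the given encoding. A ZIP is binary, and most of its bytes are not valid UTF-8, so the decoder substitutes replacement characters (U+FFFD) and the original bytes cannot be recovered. That's the bug.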
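The corruption can be reproduced outside the browser. The sketch below assumes that readAsText(file, 'UTF-8') behaves like a standard non-fatal UTF-8 decode, and that the resulting string is later serialized back to UTF-8 bytes. The sample bytes are made up for illustration; only the first four are the real ZIP local file header signature.

```javascript
// Hypothetical sample: the ZIP signature "PK\x03\x04" followed by three
// bytes that are not valid UTF-8 on their own.
const original = new Uint8Array([0x50, 0x4b, 0x03, 0x04, 0xff, 0x80, 0x9c]);

// Decode as UTF-8 text (assumed to mirror readAsText(file, 'UTF-8')).
const asText = new TextDecoder('utf-8').decode(original);

// Turn the string back into bytes, as a text-based encoder would.
const roundTripped = new TextEncoder().encode(asText);

console.log(original.length);            // 7
console.log(roundTripped.length);        // 13
console.log(asText.includes('\uFFFD'));  // true
```

Each invalid byte becomes U+FFFD, which takes three bytes in UTF-8, so the data both changes and grows. The ASCII signature at the start survives, which is why the file can still look like a ZIP at first glance.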

FileReader APIs

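  • FileReader.readAsDataURL()
    • Returns a data: URL whose payload is the file content, base64-encoded
  • FileReader.readAsText()
    • Decodes the file as text; for text files only
  • FileReader.readAsArrayBuffer()
    • Returns the raw bytes as an ArrayBuffer
  • FileReader.readAsBinaryString()
    • Deprecated; readAsArrayBuffer() is preferred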

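For ZIPs, do not use readAsText. Use readAsDataURL instead, and strip the data URL prefix (for example data:application/zip;base64,) so that only the base64 payload is sent to the backend. The result is already base64, so it should not be passed through Base64.encode again. The MIME type in the prefix can differ between platforms, so cutting everything up to and including the first comma is safer than matching one fixed string.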
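A minimal sketch of the prefix stripping, written so it can be tried in Node without a browser. The helper name stripDataUrlPrefix is hypothetical, and the data URLs are hand-made stand-ins for what reader.result might contain (UEsDBA== is the base64 form of the four signature bytes PK\x03\x04).

```javascript
// Hypothetical helper: return only the base64 payload of a data: URL.
function stripDataUrlPrefix(dataUrl) {
  const comma = dataUrl.indexOf(',');
  if (!dataUrl.startsWith('data:') || comma === -1) {
    throw new Error('not a data: URL');
  }
  // Everything after the first comma is the payload.
  return dataUrl.slice(comma + 1);
}

// Simulated FileReader results with two different MIME types.
const a = stripDataUrlPrefix('data:application/zip;base64,UEsDBA==');
const b = stripDataUrlPrefix('data:application/x-zip-compressed;base64,UEsDBA==');

console.log(a);        // UEsDBA==
console.log(a === b);  // true

// Decoding the payload gives back the original bytes.
const bytes = Buffer.from(a, 'base64');
console.log(bytes.length);  // 4
```

Splitting at the first comma is safe here because the base64 alphabet contains no commas.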

After rewriting, the test passed.

Demo

  • To verify, I made a small demo.

  • FileReader.readAsDataURL() works for both text and binary (e.g., ZIP) and can fully replace readAsText here.

Key code blocks:

const reader = new FileReader();
reader.readAsDataURL(file);
···
// Strip the "data:<mime>;base64," prefix, whatever the MIME type is
fileContent = evt.target.result.replace(/^data:[^,]*;base64,/, '')
···

···
// backend
// Buffer.from replaces the deprecated new Buffer() constructor
const buff = Buffer.from(req.body.file, 'base64');
fs.writeFileSync(`./test.${req.body.fileType === 'zip' ? 'zip' : 'txt'}`, buff);
···

Other gotchas

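I limited uploads to .zip, but on macOS Chrome an .xlsx file was still accepted, because .xlsx is itself a ZIP container. If only real .zip archives should get through, the client-side check needs to be stricter than that filter, for example by also checking the file name.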
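One possible stricter check is on the file name itself. This is only a sketch: the helper name hasZipExtension is hypothetical, and how the original upload filter was configured is not shown here.

```javascript
// Hypothetical helper: accept a file only if its name ends with ".zip".
function hasZipExtension(fileName) {
  return fileName.toLowerCase().endsWith('.zip');
}

console.log(hasZipExtension('contracts.zip'));  // true
console.log(hasZipExtension('ARCHIVE.ZIP'));    // true
console.log(hasZipExtension('report.xlsx'));    // false
```

In the browser this would be called with file.name before reading the file. Sniffing the leading bytes would not help with this particular case, since an .xlsx file starts with the same ZIP signature.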

Final Thoughts

  • A ZIP archive is binary: compression turns the original bytes into a different byte sequence, most of which is not valid text. Reading that as text alters the content, so the file written on the backend no longer matches the upload, hence the invalid ZIP.