jimhark commented Jul 21, 2025

Summary

Compress data before encryption. Compression reduces size by about 30%, at a minimal additional compute cost compared to encryption and signing. This can result in an encrypted file that's smaller than the original.

Resolves

Resolves #230

Details

  • Compress before encryption / decompress after decryption
  • Use gzip compression
  • About 30% reduction observed
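
The ordering matters because well-encrypted output looks random and does not compress, so gzip has to run on the plaintext. A minimal Node sketch of that ordering follows. The helper names, the AES-256-CBC cipher, and the random key are assumptions for illustration only; the actual implementation is in lib/codec.js and may differ.

```javascript
const zlib = require('node:zlib');
const crypto = require('node:crypto');

// Hypothetical helper (not the PR's API): gzip first, then encrypt.
// AES-256-CBC is an assumed cipher chosen only to keep the sketch runnable.
function compressThenEncrypt(plain, key) {
  const compressed = zlib.gzipSync(plain);
  const iv = crypto.randomBytes(16);
  const cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
  const encrypted = Buffer.concat([cipher.update(compressed), cipher.final()]);
  return { iv, encrypted };
}

// Hypothetical inverse: decrypt first, then gunzip.
function decryptThenDecompress({ iv, encrypted }, key) {
  const decipher = crypto.createDecipheriv('aes-256-cbc', key, iv);
  const compressed = Buffer.concat([decipher.update(encrypted), decipher.final()]);
  return zlib.gunzipSync(compressed);
}

const key = crypto.randomBytes(32); // stand-in for a derived key
const plain = Buffer.from('hello, compressible world! '.repeat(200));
const msg = compressThenEncrypt(plain, key);
const roundTrip = decryptThenDecompress(msg, key);
```

The sample input here is highly repetitive, so it shrinks far more than the roughly 30% observed on real files.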

Testing

I manually tested node encrypt, node decrypt, and HTML wrapper decrypt.

Notes for Reviewers

In an attempt to avoid, or at least limit, having 3 representations of data simultaneously in memory (file data buffer, decrypted data buffer, decompressed data buffer), we sometimes pass a reader instead of a buffer. Reader here means a callable that can read the data. See 'msgReader' in codec.js, which provides this comment:

We take a message reader function instead of a message buffer so we can release its storage when it is no longer needed (caller isn't stuck holding a reference).
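
To illustrate the reader idea, here is a minimal sketch. It is not the code in codec.js: `encodeSketch`, `decodeSketch`, and the AES-256-CBC cipher are hypothetical stand-ins. The point is that the caller hands over a function rather than a buffer, so the only reference to the encrypted data lives inside the decoder and can be dropped before decompression starts.

```javascript
const zlib = require('node:zlib');
const crypto = require('node:crypto');

// Hypothetical encoder used only to produce test input for the sketch.
function encodeSketch(plain, key) {
  const iv = crypto.randomBytes(16);
  const cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
  const encrypted = Buffer.concat([cipher.update(zlib.gzipSync(plain)), cipher.final()]);
  return { iv, encrypted };
}

// Hypothetical decoder that takes a reader (a callable returning the message)
// instead of the message itself.
function decodeSketch(msgReader, key) {
  let msg = msgReader(); // the message is only referenced from this scope
  const decipher = crypto.createDecipheriv('aes-256-cbc', key, msg.iv);
  const compressed = Buffer.concat([decipher.update(msg.encrypted), decipher.final()]);
  msg = null; // encrypted data is now unreachable and can be collected
  return zlib.gunzipSync(compressed); // only two representations alive here
}

const readerKey = crypto.randomBytes(32);
const original = Buffer.from('example payload '.repeat(100));
// In real use the reader would load the file; here it builds the message on demand.
const decoded = decodeSketch(() => encodeSketch(original, readerKey), readerKey);
```

Had the caller passed `msg` directly, its own variable would keep the encrypted buffer alive for the whole call, no matter what the decoder did.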

jimhark and others added 9 commits July 21, 2025 00:48
This is part of a push to support larger files. The focus is the switch
to using Uint8Array to store binary data. But also includes:

- When running on Node, use Buffer.from() for hex string conversions.

- To avoid a large buffer copy, signedMsg has been replaced by an object
    containing iv, encrypted, and hmac.

- hmac calculation has changed so it avoids copying (possibly very
    large) encrypted data. See signDigest() in lib/codec.js.

- Minor cleanup
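
A sketch of the copy-free hmac idea and the hex boundary conversion described
above, assuming Node's crypto module and HMAC-SHA-256. signDigestSketch is a
hypothetical stand-in; see signDigest() in lib/codec.js for the real code.

```javascript
const crypto = require('node:crypto');

// Hypothetical stand-in for signDigest(): feed iv and encrypted to the HMAC
// one after the other instead of concatenating them into a new buffer first.
function signDigestSketch(key, iv, encrypted) {
  const hmac = crypto.createHmac('sha256', key);
  hmac.update(iv);
  hmac.update(encrypted); // no copy of the (possibly very large) data
  return hmac.digest();   // a Buffer, which is a Uint8Array
}

const hmacKey = crypto.randomBytes(32);
const iv = crypto.randomBytes(16);
const encrypted = crypto.randomBytes(1024);

const streamed = signDigestSketch(hmacKey, iv, encrypted);

// Reference result computed the copying way, for comparison only.
const copied = crypto
  .createHmac('sha256', hmacKey)
  .update(Buffer.concat([iv, encrypted]))
  .digest();

// Hex is used only at the input/output boundary.
const hex = Buffer.from(streamed).toString('hex');
const backToBytes = Buffer.from(hex, 'hex');
```

Both digests are identical because an HMAC over successive update() calls is
defined over the concatenation of the inputs.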

Handling hex encode/decode at the input/output boundaries and using
Uint8Array internally for representing binary data has these benefits:

- More memory efficient, allows processing of 2x larger files.

- Aligns with cryptographic best practices: hashing is now performed
  on raw binary data (Uint8Array) instead of hex strings.

- Behavior is (mostly) unchanged
  - scripts/index_template.html textContent is not implemented and
    needs to be redesigned.
makes unnamed function more self documenting
Also removed use of recursion to improve readability (and debuggability).
As a bonus, function is actually shorter (LoC AND lines of text)
cuts size by 1/3 and noticeably improves performance
Refactored encrypted buffer handling to reduce memory usage
Minor cleanup

jimhark commented Jul 23, 2025

Committed a396bf0 to fix line of code that got messed up between testing and commit/push.


jimhark commented Jul 23, 2025

Committed a396bf0 to remove variable definitions for vars that are no longer used. This was discovered while writing a previous pull request comment.
