Skip to content

Conversation

@jimhark
Copy link

@jimhark jimhark commented Jul 21, 2025

Summary

codec.js/decode() may calculate encryptedDataHash multiple times. calculatedHMAC is based on encryptedDataHash which can be arbitrarily large. Recalculating this hash when the data hasn't changed is a performance bug. The fix is to calculate the hash once. While reviewing this change the use of recursion was replaced with explicit retries.

Fixes

Fixes #226

Details

  • Calculate and reuse encryptedDataHash to avoid rehashing
  • Unwind the use of recursion to simplify the code and improve readability
  • function is actually shorter (LoC AND lines of text)

Testing

I manually tested node encrypt, node decrypt, and HTML wrapper decrypt.
It's not clear to me exactly how I could test the backwards compatible password hashes. Additional testing is recommended.

Notes for Reviewers

Multiple hashing has been around a while, but it was not easy to address before PR #219. I'd say at that point the performance issue became a bug because it was fixable.

Probably variable names 'hashedPassword2' and 'hashedPassword3' could be improved. I assigned them in the order they were used, but:

const hashedPassword2 = await cryptoEngine.hashThirdRound(hashedPassword, salt);
let hashedPassword3 = await cryptoEngine.hashSecondRound(hashedPassword, salt);

May be confusing.

jimhark and others added 6 commits July 20, 2025 20:14
This is part of a push to support larger files. The focus is the switch
to using Uint8Array to store binary data. But also includes:

- When running on Node, use Buffer.from() for hex string conversions.

- To avoid large buffer copy, signedMsg as been replaced by an object
    containing  iv, encrypted, and hmac.

- hmac calculation has changed so it avoids copying (possibly very
    large) encrypted data. See signDigest() in lib/codec.js.

- Minor cleanup

Handling hex encode/decode at the input/output boundaries and using
Uint8Array internally for representing binary data has these benefits:

- More memory efficient, allows processing of 2x larger files.

- Aligns with cryptographic best practices: hashing is now performed
  on raw binary data (Uint8Array) instead of hex strings.

- Behavior is (mostly) unchanged
  - scripts/index_template.html textContent is not implemented and
    needs to be redesigned.
makes unnamed function more self documenting
Also removed use of recursion to improve readability (and debugability).
As a bonus, function is actually shorter (LoC AND lines of text)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

1 participant