Skip to content

Avoid data loss in dedup_public.py when a link cannot be created#1020

Merged
Xpirix merged 1 commit into
qgis:mainfrom
Valyrian-Code:fix/dedup-public-dataloss
Jul 1, 2026
Merged

Avoid data loss in dedup_public.py when a link cannot be created#1020
Xpirix merged 1 commit into
qgis:mainfrom
Valyrian-Code:fix/dedup-public-dataloss

Conversation

@Valyrian-Code

Copy link
Copy Markdown
Contributor

Problem

scripts/dedup_public.py replaces a file that is byte-identical to a base-language file with a link to that canonical copy. It did so by unlinking the original before creating the replacement:

p.unlink()
if args.mode == 'symlink':
    os.symlink(rel_target, p)
else:
    os.link(canonical, p)

If os.symlink/os.link then fails (disk, permissions, cross-device for hardlinks, etc.), the file is left deleted with no replacement and is lost permanently. The except only prints; it cannot restore the file. This script runs during the production deploy (build_prod_incremental.sh / build_prod_per_lang.sh), so a single failure mid-run drops a published asset.

Fix

Create the link at a temporary path first, then atomically os.replace it over the original. A failed link now leaves the original file untouched.

Test

Adds test/test_dedup_public.py (builds on the pytest harness from #1015):

  • injects an os.symlink failure and asserts the original file survives intact;
  • a success case asserting the duplicate becomes a symlink resolving to the canonical bytes, with no temp file left behind.

Reverting the fix makes the first test fail (the file is lost), confirming it has teeth.

When replacing a duplicate file with a link to the canonical copy, the
script unlinked the original before creating the symlink/hardlink. If
os.symlink or os.link then failed, the file was left deleted with no
replacement and was lost permanently. Create the link at a temporary path
and atomically os.replace it over the original instead, so a failed link
leaves the original file intact. This script runs during the production
deploy (build_prod_incremental.sh / build_prod_per_lang.sh).

Add a regression test that injects an os.symlink failure and asserts the
original file survives, plus a success case.
Copilot AI review requested due to automatic review settings June 15, 2026 06:43

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.

@Xpirix Xpirix left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great fix @Valyrian-Code , thanks! LGTM

@Xpirix Xpirix merged commit f0d491e into qgis:main Jul 1, 2026
4 checks passed
@Valyrian-Code Valyrian-Code deleted the fix/dedup-public-dataloss branch July 1, 2026 17:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

3 participants