Skip to content

Remove old temporary checkpoints #2726

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 23, 2020
Merged

Conversation

csmarchbanks
Copy link
Contributor

If an ingester crashes while doing a checkpoint, which is likely as
checkpoints are almost always happening, then a checkpoint like
checkpoint.123456.tmp will be left on disk. When the next checkpoint
succeeds, the old .tmp checkpoints are not deleted even though it will
never be used. After enough crashes it is possible to fill the entire
disk with .tmp checkpoints, bringing down an ingester.

Checklist

  • Tests updated
  • Documentation added (Not needed)
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
@csmarchbanks csmarchbanks force-pushed the remove-old-tmp-checkpoints branch from 772af9e to 1ebd865 Compare June 15, 2020 16:10
@pstibrany
Copy link
Contributor

/cc @codesome

@csmarchbanks csmarchbanks force-pushed the remove-old-tmp-checkpoints branch 2 times, most recently from 6db4264 to 7fd5907 Compare June 17, 2020 14:15
Copy link
Contributor

@codesome codesome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a non-blocker comment, as I don't expect users to tinker with the WAL directory and change the names.

Copy link
Contributor

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pre-approving (Assuming comment from Ganesh is fixed, eg. by using regexp.QuoteMeta on checkpointPrefix).

If an ingester crashes while doing a checkpoint, which is likely as
checkpoints are almost always happening, then a checkpoint like
checkpoint.123456.tmp will be left on disk. When the next checkpoint
succeeds, the old .tmp checkpoints are not deleted even though it will
never be used. After enough crashes it is possible to fill the entire
disk with .tmp checkpoints, bringing down an ingester.

Signed-off-by: Chris Marchbanks <csmarchbanks@gmail.com>
@csmarchbanks csmarchbanks force-pushed the remove-old-tmp-checkpoints branch from 7fd5907 to 44d2040 Compare June 22, 2020 18:18
@csmarchbanks
Copy link
Contributor Author

@pstibrany Thanks for pointing out regexp.QuoteMeta, I didn't know that function existed! Addressed the comment, added a test case, and will merge soon unless someone opposes.

@csmarchbanks csmarchbanks merged commit 876a8f5 into master Jun 23, 2020
@csmarchbanks csmarchbanks deleted the remove-old-tmp-checkpoints branch June 23, 2020 00:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 participants