-
Notifications
You must be signed in to change notification settings - Fork 987
Don't split ranges for refresh after the threshold #8560
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Don't split ranges for refresh after the threshold #8560
Conversation
|
@melihmutlu, @mkindahl: please review this pull request.
|
061587a to
445141b
Compare
55dd0f2 to
09e04a2
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #8560 +/- ##
==========================================
+ Coverage 82.41% 82.49% +0.08%
==========================================
Files 248 248
Lines 47676 47650 -26
Branches 12117 12113 -4
==========================================
+ Hits 39290 39309 +19
- Misses 3497 3508 +11
+ Partials 4889 4833 -56 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
99f5313 to
039cfb7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
039cfb7 to
a7b4dc8
Compare
a7b4dc8 to
0fb5144
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if you don't want to cap the beginning as well, and have tests for that.
| int64 invalidation_threshold = | ||
| invalidation_threshold_get(cagg->data.raw_hypertable_id, refresh_window.type); | ||
| refresh_window.end = MIN(invalidation_threshold, refresh_window.end); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't you need to cap the refresh window at the beginning as well? In theory, the refresh window could be entirely after the threshold, in which case you will cap the end, but not the beginning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, I need to think more about it... probably what is necessary here is cap at the MAX(time_dimension) of the hypertable and not the threshold. The idea here is to optimize the batches creation to don't create unnecessary batches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I followed that, but my point is that if you have a refresh window 10:00-11:00 and you have a threshold at 09:00, if you just cap the refresh window at the end you will get the range 10:00-09:00, which is not a valid window at all and could lead to strange behavior later in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're correct but what I meant is that the invalidation threshold should not be the best information for this optmization. This is more or less what I want to optimize:
[-------------------) - refresh window
[---------] - original hypertable data
|XXXX|....|....|XXXX| - batches produced without align refresh window with ht data
we can see the first and last batch will do nothing
[---------) - align new refresh window within hypertable data
|....|....| - produce batches (1..N buckets) in the aligned refresh window
This PR requires more work because the original hypertable can be another CAgg in case of hierarchical and it can be realtime :-/
Cap the refresh window at the hypertable invalidation threshold to don't produce unnecessary batches for refresh.
0fb5144 to
7d78190
Compare
|
This pull request has been automatically marked as stale due to lack of activity. This pull request will be closed in 30 days. |
Cap the refresh window at the hypertable invalidation threshold to don't produce unnecessary batches for refresh.
Disable-check: force-changelog-file