Reduce locking when persisting ML job statistics.#141519
Reduce locking when persisting ML job statistics.#141519jan-elastic merged 8 commits intoelastic:mainfrom
Conversation
|
Pinging @elastic/ml-core (Team:ML) |
|
Hi @jan-elastic, I've created a changelog YAML for you. |
|
Hi @jan-elastic, I've updated the changelog YAML for you. |
|
|
||
| public synchronized DataCounts runningTotalStats() { | ||
| public DataCounts runningTotalStats() { | ||
| totalRecordStats.calcProcessedFieldCount(getAnalysedFieldsPerRecord()); |
There was a problem hiding this comment.
Note: this "update the object upon read" is a bit strange.
I've considered to:
- move
analysedFieldsPerRecordtoDataCounts - make the
calcProcessedFieldCountprivate - call it when you call
setXyz()(Xyzis something that influencesprocessFieldCount)
What do you think about this?
I didn't do it, because I wanted the changes to be as minimal as possible.
There was a problem hiding this comment.
We don't make this change because this would increase the overhead of the method reportRecordWritten() by about 20%. This method lies on the hot path and is executed thousands of times per minute. It should be kept as fast as possible, even at the cost of a more cumbersome design.
| synchronized (this) { | ||
| unreportedStats = statsToReport; | ||
| } | ||
| unreportedStats.set(statsToReport); |
There was a problem hiding this comment.
The following race scenario between finishReporting and writeUnreportedCounts is possible:
Thread A (finishReporting):
1. unreportedStats.set(null) // Clear unreported stats
2. statsToReport = runningTotalStats()
3. persistDataCounts(..., false) → returns false (previous persist in progress)
Thread B (writeUnreportedCounts):
4. getAndSet(null) → returns null // Sees null, does nothing
Thread A:
5. unreportedStats.set(statsToReport) // Sets stats that should have been persisted
Result: Stats are in unreportedStats but won't be persisted until the next writeUnreportedCounts() call
This is not very bad, since the next flushJob() will persist them (datafeeds flush regularly, typically every few minutes). But it's probably worth documenting at this point that we guarantee only eventual consistency.
|
|
||
| public synchronized DataCounts runningTotalStats() { | ||
| public DataCounts runningTotalStats() { | ||
| totalRecordStats.calcProcessedFieldCount(getAnalysedFieldsPerRecord()); |
There was a problem hiding this comment.
We don't make this change because this would increase the overhead of the method reportRecordWritten() by about 20%. This method lies on the hot path and is executed thousands of times per minute. It should be kept as fast as possible, even at the cost of a more cumbersome design.
bcf196d to
8579849
Compare
jan-elastic
left a comment
There was a problem hiding this comment.
Unit test LGTM.
valeriy42
left a comment
There was a problem hiding this comment.
LGTM. Thank you for fixing this!
1f2078f to
1c74a01
Compare
* Reduce locking when persisting ML job statistics. * Update docs/changelog/141519.yaml * Update docs/changelog/141519.yaml * add unit test * improved timeout exception handling. * count down latch on exception * fix javadoc in unittest * [CI] Auto commit changes from spotless --------- Co-authored-by: Valeriy Khakhutskyy <1292899+valeriy42@users.noreply.github.com> Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
* Reduce locking when persisting ML job statistics. * Update docs/changelog/141519.yaml * Update docs/changelog/141519.yaml * add unit test * improved timeout exception handling. * count down latch on exception * fix javadoc in unittest * [CI] Auto commit changes from spotless --------- Co-authored-by: Valeriy Khakhutskyy <1292899+valeriy42@users.noreply.github.com> Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
* Reduce locking when persisting ML job statistics. * Update docs/changelog/141519.yaml * Update docs/changelog/141519.yaml * add unit test * improved timeout exception handling. * count down latch on exception * fix javadoc in unittest * [CI] Auto commit changes from spotless --------- Co-authored-by: Valeriy Khakhutskyy <1292899+valeriy42@users.noreply.github.com> Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
* Reduce locking when persisting ML job statistics. * Update docs/changelog/141519.yaml * Update docs/changelog/141519.yaml * add unit test * improved timeout exception handling. * count down latch on exception * fix javadoc in unittest * [CI] Auto commit changes from spotless --------- Co-authored-by: Valeriy Khakhutskyy <1292899+valeriy42@users.noreply.github.com> Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
* Reduce locking when persisting ML job statistics. * Update docs/changelog/141519.yaml * Update docs/changelog/141519.yaml * add unit test * improved timeout exception handling. * count down latch on exception * fix javadoc in unittest * [CI] Auto commit changes from spotless --------- Co-authored-by: Valeriy Khakhutskyy <1292899+valeriy42@users.noreply.github.com> Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
* Reduce locking when persisting ML job statistics. * Update docs/changelog/141519.yaml * Update docs/changelog/141519.yaml * add unit test * improved timeout exception handling. * count down latch on exception * fix javadoc in unittest * [CI] Auto commit changes from spotless --------- Co-authored-by: Valeriy Khakhutskyy <1292899+valeriy42@users.noreply.github.com> Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
* Reduce locking when persisting ML job statistics. * Update docs/changelog/141519.yaml * Update docs/changelog/141519.yaml * add unit test * improved timeout exception handling. * count down latch on exception * fix javadoc in unittest * [CI] Auto commit changes from spotless --------- Co-authored-by: Valeriy Khakhutskyy <1292899+valeriy42@users.noreply.github.com> Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
* Reduce locking when persisting ML job statistics. * Update docs/changelog/141519.yaml * Update docs/changelog/141519.yaml * add unit test * improved timeout exception handling. * count down latch on exception * fix javadoc in unittest * [CI] Auto commit changes from spotless --------- Co-authored-by: Valeriy Khakhutskyy <1292899+valeriy42@users.noreply.github.com> Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
fixes #140511