Skip to content

Fix TDigestState.read CB leaks#114303

Merged
ivancea merged 8 commits intoelastic:mainfrom
ivancea:unmute-esqlactionbreakerit
Oct 11, 2024
Merged

Fix TDigestState.read CB leaks#114303
ivancea merged 8 commits intoelastic:mainfrom
ivancea:unmute-esqlactionbreakerit

Conversation

@ivancea
Copy link
Contributor

@ivancea ivancea commented Oct 8, 2024

Closes #114194

Fixes TDigestState.read() CB leak on error on .reserve().

@ivancea ivancea added >test Issues or PRs that are addressing/adding tests Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) auto-backport Automatically create backport pull requests when merged :Analytics/ES|QL AKA ESQL v8.16.0 v9.0.0 labels Oct 8, 2024
@ivancea ivancea marked this pull request as ready for review October 9, 2024 15:30
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@ivancea ivancea changed the title Unmute EsqlActionBreakerIT test suite Oct 10, 2024
@ivancea ivancea requested a review from bpintea October 10, 2024 12:47
@ivancea
Copy link
Contributor Author

ivancea commented Oct 10, 2024

Just fixed the issue. Or an issue at least, but it makes sense that it was this because it's on read(), and it was only happening in cluster tests apparently. And in a very specific case (CB failing exactly in the digest.reserve() call inside read())

@ivancea ivancea requested review from kkrik-es and nik9000 October 10, 2024 12:55
@ivancea ivancea changed the title Unmute EsqlActionBreakerIT test suite and fix TDigestState.read CB leaks Oct 10, 2024
@ivancea ivancea added >bug and removed >test Issues or PRs that are addressing/adding tests labels Oct 10, 2024
@elasticsearchmachine
Copy link
Collaborator

Hi @ivancea, I've created a changelog YAML for you.

}
for (int i = 0; i < n; i++) {
state.add(in.readDouble(), in.readVLong());
try {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer a try / finally block and use a boolean variable to control the outcome of the execution as we do in other places. See

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. Also, changed the internal catch to a finally too. merging both try-finally in one made the code a bit more complex, so left them separated

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks you, that's looks good.

try (TDigestState example = TDigestState.createOfType(newLimitedBreaker(ByteSizeValue.ofMb(100)), digestType, 100)) {
return TDigestState.createUsingParamsFrom(example);
}
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you use the cranky breaker service it'll blow up randomly and you can assert that you always freed the memory, no matter the random explosion. I rather like it for this sort of thing.

Which one of your tests hits a the breaker exception you are guarding?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which one of your tests hits a the breaker exception you are guarding?

The testReadWithData(). The reserve is only called when it has data in it. So I added that one explicitly.

About the cranky breaker, I'm not sure we want it to break randomly. This has multiple potential breaking points I wanted to test (Add memory and release on error, on every constructor).

The cranky breaks 1/20 times. If we're fine with this one (The full suite takes around 1 second on my machine), I would leave it as-is. It tests every potential breaking point (Divided in 16 random seeds).
Is it too much? Not sure

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About the cranky breaker, I'm not sure we want it to break randomly. This has multiple potential breaking points I wanted to test (Add memory and release on error, on every constructor).

That's where the cranky is really good at.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just updated it to use the cranky breaker. It will fail ~1/5 times, so I guess it's ok. I suppose even 1/20 times would be ok for something like this.
I guess I was reinventing the wheel here XD

@ivancea ivancea requested review from iverase and nik9000 October 10, 2024 15:13
Copy link
Contributor

@iverase iverase left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ivancea ivancea merged commit 5fcfa48 into elastic:main Oct 11, 2024
@ivancea ivancea deleted the unmute-esqlactionbreakerit branch October 11, 2024 07:41
davidkyle pushed a commit to davidkyle/elasticsearch that referenced this pull request Oct 13, 2024
Closes elastic#114194

Fixes `TDigestState.read()` CB leak on error on `.reserve()`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.16.0 v9.0.0

5 participants