
Deep copy BytesRef when creating a constant vector block #141242

Merged
dnhatn merged 3 commits into elastic:main from dnhatn:eval-constant-block
Jan 29, 2026

@dnhatn
Member

@dnhatn dnhatn commented Jan 26, 2026

Currently, when creating a constant BytesRef block or vector from a constant input, we do not deep copy the BytesRef in the evaluator. As a result, if the evaluator generates additional blocks afterwards, these blocks may share the underlying bytes with the constant block, leading to incorrect data. This issue was seen in tests where two rows had the same data when they should not.

For example:

return driverContext.blockFactory().newConstantBytesRefBlockWith(evalValue(vector, 0, scratchPad), positionCount);

and

return driverContext.blockFactory().newConstantBytesRefBlockWith(evalValue(vector, 0, scratchPad), positionCount);

In both cases, the constant BytesRef is a view over the same bytesView of the BreakingBytesRefBuilder, so it shares its bytes with whatever the evaluator writes to that builder afterwards.

For other vector/block types, we already copy the BytesRef to a new buffer, but this is not done for constant blocks. This change ensures that we always deep copy the input BytesRef when creating a constant vector or block. While in some cases the copy may not be necessary, the overhead is minimal and this approach is safer than requiring callers to handle the copy selectively.
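The aliasing problem and the effect of the deep copy can be illustrated with a minimal, self-contained sketch. It does not use the Elasticsearch or Lucene classes: `Ref` and `Scratch` are hypothetical stand-ins for `BytesRef` and the evaluator's reusable builder, and the method names are made up for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ConstantBytesAliasingSketch {

    /** Hypothetical stand-in for a BytesRef: a (offset, length) view over a byte array. */
    static final class Ref {
        final byte[] bytes;
        final int offset;
        final int length;

        Ref(byte[] bytes, int offset, int length) {
            this.bytes = bytes;
            this.offset = offset;
            this.length = length;
        }

        /** Copies the viewed bytes into a fresh array so later writes to the source cannot leak in. */
        static Ref deepCopyOf(Ref other) {
            byte[] copy = Arrays.copyOfRange(other.bytes, other.offset, other.offset + other.length);
            return new Ref(copy, 0, copy.length);
        }

        String utf8() {
            return new String(bytes, offset, length, StandardCharsets.UTF_8);
        }
    }

    /** Hypothetical reusable scratch buffer, standing in for the evaluator's builder. */
    static final class Scratch {
        private final byte[] buffer = new byte[16];
        private int length;

        /** Overwrites the buffer in place; values here are assumed to fit in 16 bytes. */
        void set(String value) {
            byte[] encoded = value.getBytes(StandardCharsets.UTF_8);
            System.arraycopy(encoded, 0, buffer, 0, encoded.length);
            length = encoded.length;
        }

        /** Returns a view over the shared buffer, not a copy. */
        Ref view() {
            return new Ref(buffer, 0, length);
        }
    }

    /** The "constant" keeps a view over the scratch buffer, so the next write changes it. */
    static String constantWithoutCopy() {
        Scratch scratch = new Scratch();
        scratch.set("foo");
        Ref constant = scratch.view();
        scratch.set("bar"); // the evaluator reuses the scratch for the next value
        return constant.utf8();
    }

    /** The "constant" owns its own bytes, so the next write does not affect it. */
    static String constantWithDeepCopy() {
        Scratch scratch = new Scratch();
        scratch.set("foo");
        Ref constant = Ref.deepCopyOf(scratch.view());
        scratch.set("bar");
        return constant.utf8();
    }

    public static void main(String[] args) {
        System.out.println(constantWithoutCopy());  // prints "bar"
        System.out.println(constantWithDeepCopy()); // prints "foo"
    }
}
```

In this sketch the copy costs one array allocation per constant, which is the kind of small overhead the description trades for not having to reason about buffer ownership at each call site.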

Closes #140809
Closes #140621
Closes #140615

@elasticsearchmachine
Collaborator

Hi @dnhatn, I've created a changelog YAML for you.

@dnhatn dnhatn added the v8.19.11, v9.3.1, v9.2.5, and auto-backport (Automatically create backport pull requests when merged) labels Jan 26, 2026
@dnhatn dnhatn requested review from mouhc1ne and nik9000 January 26, 2026 01:45
@elasticsearchmachine
Collaborator

Hi @dnhatn, I've updated the changelog YAML for you.

@dnhatn dnhatn marked this pull request as ready for review January 26, 2026 01:45
@elasticsearchmachine elasticsearchmachine added the Team:Analytics (Meta label for analytical engine team (ESQL/Aggs/Geo)) label Jan 26, 2026
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@dnhatn
Member Author

dnhatn commented Jan 29, 2026

Thanks Nik!

@dnhatn dnhatn merged commit 82c221b into elastic:main Jan 29, 2026
35 checks passed
@dnhatn dnhatn deleted the eval-constant-block branch January 29, 2026 16:26
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Jan 29, 2026

@elasticsearchmachine
Collaborator

💔 Backport failed

Status  Branch  Result
        9.3
        8.19    Commit could not be cherry-picked due to conflicts
        9.2

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 141242

dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Jan 29, 2026

elasticsearchmachine pushed a commit that referenced this pull request Jan 29, 2026
…141531)

elasticsearchmachine pushed a commit that referenced this pull request Jan 29, 2026
…141530)


Labels

:Analytics/ES|QL (AKA ESQL), auto-backport (Automatically create backport pull requests when merged), backport pending, >bug, Team:Analytics (Meta label for analytical engine team (ESQL/Aggs/Geo)), v8.19.12, v9.2.6, v9.3.1, v9.4.0

3 participants