Skip to content

Quantize ST_X, ST_Y and related functions#140963

Merged
craigtaverner merged 14 commits intoelastic:mainfrom
craigtaverner:quantize_st_xy
Jan 23, 2026
Merged

Quantize ST_X, ST_Y and related functions#140963
craigtaverner merged 14 commits intoelastic:mainfrom
craigtaverner:quantize_st_xy

Conversation

@craigtaverner
Copy link
Contributor

@craigtaverner craigtaverner commented Jan 20, 2026

Recent optimizations in geo-grid functions exposed some inconsistencies in other spatial functions that can return results either quantized (doc-values or lucene index) or not (source).

ES|QL typically returns fields from doc-values, for performance reasons. However, for geospatial point data, this means a slight loss of precision, because geo_point and cartesian_point are quantized from two doubles (128bits) down to one long (64bits), in both doc-values and the lucene index (but not stored fields, or source). For all real-world use cases the remaining precision is fine, and something most users are willing to trade for the performance advantages. However, that willingness usually only extends to analytics, and if the user simply returns the original field, they usually want to see the exact original values. For this reason geospatial data is always returned from source in ES|QL, at a huge performance hit. However, we have implemented a number of optimizations that try to make use of doc-values whenever possible, and whenever the user does not return the original points so they will not see the precision loss. As we've expanded the scope of these optimizations, we've encountered a BWC issue with two particular functions, ST_X and ST_Y (both GA), and a less concerning issue with a group of related functions (tech-preview): ST_ENVELOPE, ST_XMAX, ST_XMIN, ST_YMAX and ST_YMIN.

This PR fixes the inconsistency, making ST_X, ST_Y and the envelope functions all produce quantized results, so optimizations do not result in different values.

Users can still get the original values with full prevision simply by not dropping the original geometry field, which will still be read from source. This gives users control over the precision-vs-performance lever. Just drop the original field to maximize performance, or keep it to see the original precision.

Fixes #139943

@craigtaverner craigtaverner added >bug :Analytics/Geo Indexing, search aggregations of geo points and shapes Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) :Analytics/ES|QL AKA ESQL branch:9.2 v9.4.0 branch:9.3 labels Jan 20, 2026
@elasticsearchmachine
Copy link
Collaborator

Hi @craigtaverner, I've created a changelog YAML for you.

@github-actions
Copy link
Contributor

ℹ️ Important: Docs version tagging

👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version.

We use applies_to tags to mark version-specific features and changes.

Expand for a quick overview

When to use applies_to tags:

✅ At the page level to indicate which products/deployments the content applies to (mandatory)
✅ When features change state (e.g. preview, ga) in a specific version
✅ When availability differs across deployments and environments

What NOT to do:

❌ Don't remove or replace information that applies to an older version
❌ Don't add new information that applies to a specific version without an applies_to tag
❌ Don't forget that applies_to tags can be used at the page, section, and inline level

🤔 Need help?

@craigtaverner craigtaverner marked this pull request as ready for review January 21, 2026 13:21
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@github-actions
Copy link
Contributor

github-actions bot commented Jan 21, 2026

@@ -137,6 +137,24 @@ protected NodeInfo<? extends Expression> info() {
}

static void buildEnvelopeResults(BytesRefBlock.Builder results, Rectangle rectangle) {
Copy link
Contributor

@iverase iverase Jan 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we call this method buildCartesinanEnvelopeResults? it is confusing as it is now. And probably we can add a method that takes a encoder to avoid duplication.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. The missing Cartesian problem was in many places (notably evaluators), so I fixed them all by doing a general cleanup making sure all evaluators have the correct Geo/Cartesian naming, as well as putting it always before the WKB/DocValues (some were after). I'll push this, and then work on reducing code duplication, which is also in more places than just this one method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I've entirely removed duplicated code, by also using the pattern we used for sharing state in StGeotile. We create the visitor once per thread, instead of once per page, making fewer objects (the same optimization we used in the geo-grid functions). And I made sure we used the embedded SpatialCoordinateTypes inside the resultsBuilder, so we did not need multiple methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I was therefor also able to reduce the total number of generated evaluator classes. I imagine I could use this trick elsewhere to cut the generated code down a bit. But for now I think this is enough for this PR!

@craigtaverner craigtaverner added the auto-backport Automatically create backport pull requests when merged label Jan 21, 2026
craigtaverner and others added 7 commits January 22, 2026 12:50
…cValues

This increases similarity and consistency and reduces confusion
Also removed incorrect multiple quantization for doc-values results
We were using a supplier for the results builder, but mistakenly passing in a shared points visitor, instead of making a new one when the builder was initiated.
Copy link
Contributor

@ncordon ncordon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM just left a couple of questions

The spatial types `geo_point`, `geo_shape`, `cartesian_point` and `cartesian_shape` are maintained at source precision in the original documents,
but indexed at reduced precision by Lucene, for performance reasons.
To ensure this optimization is available in the widest context, all [spatial functions](/reference/query-languages/esql/functions-operators/spatial-functions.md) will produce results
at this reduced precision, aligned with the underlying Lucene index grid.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we explaining what's the Lucene grid in the docs somewhere before this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I understood we described this in the Query DSL and ingest docs, but should actually confirm this. And make links. I've discussed briefly with the docs team, and we want to re-review these docs next week and do a followup PR with any touch-ups.

greaterThan(0)
);
for (int column = 1; column < 8; column++) {
if (index > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't understand why this condition is here so I commented it out and tests pass. Why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test is mostly comparing all indexes to index 0 (which is the fully indexed case). So uncommenting that line will compare index 0 to itself, which is fine, but redundant. See line 186 for where we get index 0 from as the baseline to compare against.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I see your point. This test was based on logic in other tests that have only one column and compare all indexes to the first index. But this test also compares other columns to column 0, and that comparison is valid for index 0 also. So that line should be removed, as you suggest. Pity this PR is already merged! A followup would make sense.

Since ST_X and ST_Y return slightly different precision in older versions, mixed clusters will return no results due to precision differences in the WHERE clause.
We could consider relaxing equality for spatial types, but that is expensive (needs to convert WKB to Points, quantize and then back to WKB).
@craigtaverner craigtaverner merged commit a380117 into elastic:main Jan 23, 2026
34 of 35 checks passed
craigtaverner added a commit to craigtaverner/elasticsearch that referenced this pull request Jan 23, 2026
Recent [optimizations in geo-grid functions](elastic#138917) exposed some inconsistencies in other spatial functions that can return results either quantized (doc-values or lucene index) or not (source).

ES|QL typically returns fields from doc-values, for performance reasons. However, for geospatial point data, this means a slight loss of precision, because geo_point and cartesian_point are quantized from two doubles (128bits) down to one long (64bits), in both doc-values and the lucene index (but not stored fields, or source). For all real-world use cases the remaining precision is fine, and something most users are willing to trade for the performance advantages. However, that willingness usually only extends to analytics, and if the user simply returns the original field, they usually want to see the exact original values. For this reason geospatial data is always returned from source in ES|QL, at a huge performance hit. However, we have implemented a number of optimizations that try to make use of doc-values whenever possible, and whenever the user does not return the original points so they will not see the precision loss. As we've expanded the scope of these optimizations, we've encountered a BWC issue with two particular functions, ST_X and ST_Y (both GA), and a less concerning issue with a group of related functions (tech-preview): ST_ENVELOPE, ST_XMAX, ST_XMIN, ST_YMAX and ST_YMIN.

This PR fixes the inconsistency, making ST_X, ST_Y and the envelope functions all produce quantized results, so optimizations do not result in different values.

Users can still get the original values with full prevision simply by not dropping the original geometry field, which will still be read from source. This gives users control over the precision-vs-performance lever. Just drop the original field to maximize performance, or keep it to see the original precision.
elasticsearchmachine pushed a commit that referenced this pull request Jan 24, 2026
Recent [optimizations in geo-grid functions](#138917) exposed some inconsistencies in other spatial functions that can return results either quantized (doc-values or lucene index) or not (source).

ES|QL typically returns fields from doc-values, for performance reasons. However, for geospatial point data, this means a slight loss of precision, because geo_point and cartesian_point are quantized from two doubles (128bits) down to one long (64bits), in both doc-values and the lucene index (but not stored fields, or source). For all real-world use cases the remaining precision is fine, and something most users are willing to trade for the performance advantages. However, that willingness usually only extends to analytics, and if the user simply returns the original field, they usually want to see the exact original values. For this reason geospatial data is always returned from source in ES|QL, at a huge performance hit. However, we have implemented a number of optimizations that try to make use of doc-values whenever possible, and whenever the user does not return the original points so they will not see the precision loss. As we've expanded the scope of these optimizations, we've encountered a BWC issue with two particular functions, ST_X and ST_Y (both GA), and a less concerning issue with a group of related functions (tech-preview): ST_ENVELOPE, ST_XMAX, ST_XMIN, ST_YMAX and ST_YMIN.

This PR fixes the inconsistency, making ST_X, ST_Y and the envelope functions all produce quantized results, so optimizations do not result in different values.

Users can still get the original values with full prevision simply by not dropping the original geometry field, which will still be read from source. This gives users control over the precision-vs-performance lever. Just drop the original field to maximize performance, or keep it to see the original precision.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL :Analytics/Geo Indexing, search aggregations of geo points and shapes auto-backport Automatically create backport pull requests when merged backport pending >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.2.5 v9.3.1 v9.4.0

4 participants