[Transform] Fix transform producing empty dest index when source query references runtime fields #142450
Hi @valeriy42, I've created a changelog YAML for you.
Pull request overview
This PR fixes a bug where transforms produce an empty destination index when the source query references runtime fields. The issue occurred because OpenPointInTimeRequest does not support runtime_mappings, causing queries on runtime fields to be rewritten as match_none during shard filtering, resulting in an empty PIT context.
Changes:
- Skip PIT index filter optimization when runtime mappings are present in the source config
- Propagate runtime mappings in the initial progress search query
- Propagate runtime mappings in the `sourceHasChanged` checkpoint check
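
The first change in the list above can be illustrated with a minimal sketch. It is not the code from this PR: `PitFilterSketch` and `indexFilterFor` are hypothetical names, and plain strings and maps stand in for the Elasticsearch query and source config types. It only shows the decision being made: when runtime mappings are present, no index filter is passed when opening the PIT.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for illustration; not an Elasticsearch class.
final class PitFilterSketch {

    /**
     * Returns the query to pass as the PIT index filter, or empty when the
     * filter should be skipped because runtime mappings are present.
     */
    static Optional<String> indexFilterFor(String sourceQuery, Map<String, Object> runtimeMappings) {
        boolean hasRuntimeMappings = runtimeMappings != null && !runtimeMappings.isEmpty();
        // The filter cannot resolve runtime fields, so skip it rather than
        // risk every shard being filtered out.
        return hasRuntimeMappings ? Optional.empty() : Optional.ofNullable(sourceQuery);
    }

    public static void main(String[] args) {
        String query = "{\"range\":{\"total\":{\"gte\":10}}}";
        Map<String, Object> runtime = Map.of("total", Map.of("type", "long"));
        // Without runtime mappings the query is kept as the filter.
        System.out.println(indexFilterFor(query, Map.of()).isPresent());
        // With runtime mappings the filter is dropped.
        System.out.println(indexFilterFor(query, runtime).isPresent());
    }
}
```

In this sketch the PIT would still be opened in both cases; only the shard pre-filtering is skipped when the second case applies.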
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| ClientTransformIndexer.java | Conditionally omits PIT index filter when runtime mappings exist to prevent empty search results |
| TransformIndexer.java | Adds runtime mappings to the initial progress search request |
| TimeBasedCheckpointProvider.java | Adds runtime mappings to the source change detection search |
| ClientTransformIndexerTests.java | Adds tests verifying PIT index filter behavior with and without runtime mappings |
| TimeBasedCheckpointProviderTests.java | Adds test verifying runtime mappings are included in source change detection |
| 142450.yaml | Documents the bug fix in the changelog |
…er creation and add a new test in TransformIndexerTests to verify initial progress search includes runtime mappings.
Pinging @elastic/ml-core (Team:ML)
💔 Backport failed
You can use sqren/backport to manually backport by running |
…y references runtime fields (elastic#142450) When a transform's source query references a runtime field (e.g., a range filter on a runtime-mapped field), the destination index is produced empty even though _transform/_preview returns correct results. The root cause is that ClientTransformIndexer.injectPointInTimeIfNeeded() passes the source query as the PIT indexFilter, but OpenPointInTimeRequest does not support runtime_mappings. During the can_match phase, the query on the unknown runtime field is rewritten to match_none, all shards are filtered out, and the PIT opens with zero search contexts — causing every subsequent search to return empty results. Preview is unaffected because it uses a normal SearchSourceBuilder with both the query and runtime_mappings and does not use PIT. This change skips the PIT index filter optimization when the source config has non-empty runtime_mappings, since the filter cannot resolve runtime fields. The PIT is still opened for snapshot consistency, just without shard pre-filtering. Additionally, two other code paths that use the source query without propagating runtime_mappings are fixed: the initial progress search in TransformIndexer and the sourceHasChanged check in TimeBasedCheckpointProvider. All three fixes follow the same principle — wherever the source query is used, runtime_mappings from the source config must be propagated alongside it. Fixes elastic#113156 (cherry picked from commit cf28432) # Conflicts: # x-pack/plugin/transform/src/test/java/org/elasticsearch/xpack/transform/transforms/ClientTransformIndexerTests.java
💚 All backports created successfully
Questions? Please refer to the Backport tool documentation.
…e query references runtime fields (#142450) (#142830) * [Transform] Fix transform producing empty dest index when source query references runtime fields (#142450) When a transform's source query references a runtime field (e.g., a range filter on a runtime-mapped field), the destination index is produced empty even though _transform/_preview returns correct results. The root cause is that ClientTransformIndexer.injectPointInTimeIfNeeded() passes the source query as the PIT indexFilter, but OpenPointInTimeRequest does not support runtime_mappings. During the can_match phase, the query on the unknown runtime field is rewritten to match_none, all shards are filtered out, and the PIT opens with zero search contexts — causing every subsequent search to return empty results. Preview is unaffected because it uses a normal SearchSourceBuilder with both the query and runtime_mappings and does not use PIT. This change skips the PIT index filter optimization when the source config has non-empty runtime_mappings, since the filter cannot resolve runtime fields. The PIT is still opened for snapshot consistency, just without shard pre-filtering. Additionally, two other code paths that use the source query without propagating runtime_mappings are fixed: the initial progress search in TransformIndexer and the sourceHasChanged check in TimeBasedCheckpointProvider. All three fixes follow the same principle — wherever the source query is used, runtime_mappings from the source config must be propagated alongside it. Fixes #113156 (cherry picked from commit cf28432) # Conflicts: # x-pack/plugin/transform/src/test/java/org/elasticsearch/xpack/transform/transforms/ClientTransformIndexerTests.java * fix compilation issues
…ce query references runtime fields (#142450) (#142831) * [Transform] Fix transform producing empty dest index when source query references runtime fields (#142450) When a transform's source query references a runtime field (e.g., a range filter on a runtime-mapped field), the destination index is produced empty even though _transform/_preview returns correct results. The root cause is that ClientTransformIndexer.injectPointInTimeIfNeeded() passes the source query as the PIT indexFilter, but OpenPointInTimeRequest does not support runtime_mappings. During the can_match phase, the query on the unknown runtime field is rewritten to match_none, all shards are filtered out, and the PIT opens with zero search contexts — causing every subsequent search to return empty results. Preview is unaffected because it uses a normal SearchSourceBuilder with both the query and runtime_mappings and does not use PIT. This change skips the PIT index filter optimization when the source config has non-empty runtime_mappings, since the filter cannot resolve runtime fields. The PIT is still opened for snapshot consistency, just without shard pre-filtering. Additionally, two other code paths that use the source query without propagating runtime_mappings are fixed: the initial progress search in TransformIndexer and the sourceHasChanged check in TimeBasedCheckpointProvider. All three fixes follow the same principle — wherever the source query is used, runtime_mappings from the source config must be propagated alongside it. Fixes #113156 (cherry picked from commit cf28432) # Conflicts: # x-pack/plugin/transform/src/test/java/org/elasticsearch/xpack/transform/transforms/ClientTransformIndexerTests.java * Fix compilation errors in transform tests Update SourceConfig constructor calls to match new signature (removed 4th parameter) and remove CrossProjectModeDecider from TransformServices constructor. 
Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com>
…e query references runtime fields (#142450) (#142829) * [Transform] Fix transform producing empty dest index when source query references runtime fields (#142450) When a transform's source query references a runtime field (e.g., a range filter on a runtime-mapped field), the destination index is produced empty even though _transform/_preview returns correct results. The root cause is that ClientTransformIndexer.injectPointInTimeIfNeeded() passes the source query as the PIT indexFilter, but OpenPointInTimeRequest does not support runtime_mappings. During the can_match phase, the query on the unknown runtime field is rewritten to match_none, all shards are filtered out, and the PIT opens with zero search contexts — causing every subsequent search to return empty results. Preview is unaffected because it uses a normal SearchSourceBuilder with both the query and runtime_mappings and does not use PIT. This change skips the PIT index filter optimization when the source config has non-empty runtime_mappings, since the filter cannot resolve runtime fields. The PIT is still opened for snapshot consistency, just without shard pre-filtering. Additionally, two other code paths that use the source query without propagating runtime_mappings are fixed: the initial progress search in TransformIndexer and the sourceHasChanged check in TimeBasedCheckpointProvider. All three fixes follow the same principle — wherever the source query is used, runtime_mappings from the source config must be propagated alongside it. Fixes #113156 (cherry picked from commit cf28432) # Conflicts: # x-pack/plugin/transform/src/test/java/org/elasticsearch/xpack/transform/transforms/ClientTransformIndexerTests.java * Fix compilation errors in transform tests Update SourceConfig constructor calls to match new signature (removed 4th parameter) and remove CrossProjectModeDecider from TransformServices constructor. 
Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com>
…esent (#142452) ## Problem When a transform's source query references runtime fields, `DefaultCheckpointProvider.getIndexCheckpoints` passes the query to `GetCheckpointAction.Request`, which builds a `SearchShardsRequest` for shard-level `can_match` filtering. Neither `GetCheckpointAction.Request` nor `SearchShardsRequest` supports `runtime_mappings`, so the `search_shards` API fails every time. The graceful fallback in `TransportGetCheckpointAction` (lines 137-145) catches this failure, logs a **warning**, and falls back to unfiltered shard resolution. This means: - A warning-level log message fires on **every checkpoint** when runtime fields + query filter are used together - The shard-skipping optimization is lost anyway (all shards queried on failure) - No data correctness bug, but noisy logs in production ## Approach Apply the same pattern used for the [PIT `indexFilter` fix](#142450): when `runtime_mappings` are present, pass `null` as the query to `GetCheckpointAction.Request` (skipping the shard-filtering optimization cleanly rather than failing and falling back). This is the simplest fix because: - Propagating `runtime_mappings` through `GetCheckpointAction.Request` -> `SearchShardsRequest` -> `TransportSearchShardsAction` would require changes to core server classes and wire serialization — a much larger, riskier change for a minor optimization - The end result is identical (all shards are queried), just without the error/warning noise
…esent (elastic#142452) ## Problem When a transform's source query references runtime fields, `DefaultCheckpointProvider.getIndexCheckpoints` passes the query to `GetCheckpointAction.Request`, which builds a `SearchShardsRequest` for shard-level `can_match` filtering. Neither `GetCheckpointAction.Request` nor `SearchShardsRequest` supports `runtime_mappings`, so the `search_shards` API fails every time. The graceful fallback in `TransportGetCheckpointAction` (lines 137-145) catches this failure, logs a **warning**, and falls back to unfiltered shard resolution. This means: - A warning-level log message fires on **every checkpoint** when runtime fields + query filter are used together - The shard-skipping optimization is lost anyway (all shards queried on failure) - No data correctness bug, but noisy logs in production ## Approach Apply the same pattern used for the [PIT `indexFilter` fix](elastic#142450): when `runtime_mappings` are present, pass `null` as the query to `GetCheckpointAction.Request` (skipping the shard-filtering optimization cleanly rather than failing and falling back). This is the simplest fix because: - Propagating `runtime_mappings` through `GetCheckpointAction.Request` -> `SearchShardsRequest` -> `TransportSearchShardsAction` would require changes to core server classes and wire serialization — a much larger, riskier change for a minor optimization - The end result is identical (all shards are queried), just without the error/warning noise
… are present (#142452) (#143170) * [Transform] Skip checkpoint query filter when runtime_mappings are present (#142452) ## Problem When a transform's source query references runtime fields, `DefaultCheckpointProvider.getIndexCheckpoints` passes the query to `GetCheckpointAction.Request`, which builds a `SearchShardsRequest` for shard-level `can_match` filtering. Neither `GetCheckpointAction.Request` nor `SearchShardsRequest` supports `runtime_mappings`, so the `search_shards` API fails every time. The graceful fallback in `TransportGetCheckpointAction` (lines 137-145) catches this failure, logs a **warning**, and falls back to unfiltered shard resolution. This means: - A warning-level log message fires on **every checkpoint** when runtime fields + query filter are used together - The shard-skipping optimization is lost anyway (all shards queried on failure) - No data correctness bug, but noisy logs in production ## Approach Apply the same pattern used for the [PIT `indexFilter` fix](#142450): when `runtime_mappings` are present, pass `null` as the query to `GetCheckpointAction.Request` (skipping the shard-filtering optimization cleanly rather than failing and falling back). This is the simplest fix because: - Propagating `runtime_mappings` through `GetCheckpointAction.Request` -> `SearchShardsRequest` -> `TransportSearchShardsAction` would require changes to core server classes and wire serialization — a much larger, riskier change for a minor optimization - The end result is identical (all shards are queried), just without the error/warning noise * fix build error
…are present (#142452) (#143171) * [Transform] Skip checkpoint query filter when runtime_mappings are present (#142452) ## Problem When a transform's source query references runtime fields, `DefaultCheckpointProvider.getIndexCheckpoints` passes the query to `GetCheckpointAction.Request`, which builds a `SearchShardsRequest` for shard-level `can_match` filtering. Neither `GetCheckpointAction.Request` nor `SearchShardsRequest` supports `runtime_mappings`, so the `search_shards` API fails every time. The graceful fallback in `TransportGetCheckpointAction` (lines 137-145) catches this failure, logs a **warning**, and falls back to unfiltered shard resolution. This means: - A warning-level log message fires on **every checkpoint** when runtime fields + query filter are used together - The shard-skipping optimization is lost anyway (all shards queried on failure) - No data correctness bug, but noisy logs in production ## Approach Apply the same pattern used for the [PIT `indexFilter` fix](#142450): when `runtime_mappings` are present, pass `null` as the query to `GetCheckpointAction.Request` (skipping the shard-filtering optimization cleanly rather than failing and falling back). This is the simplest fix because: - Propagating `runtime_mappings` through `GetCheckpointAction.Request` -> `SearchShardsRequest` -> `TransportSearchShardsAction` would require changes to core server classes and wire serialization — a much larger, riskier change for a minor optimization - The end result is identical (all shards are queried), just without the error/warning noise * fix build error
…are present (#142452) (#143169) * [Transform] Skip checkpoint query filter when runtime_mappings are present (#142452) ## Problem When a transform's source query references runtime fields, `DefaultCheckpointProvider.getIndexCheckpoints` passes the query to `GetCheckpointAction.Request`, which builds a `SearchShardsRequest` for shard-level `can_match` filtering. Neither `GetCheckpointAction.Request` nor `SearchShardsRequest` supports `runtime_mappings`, so the `search_shards` API fails every time. The graceful fallback in `TransportGetCheckpointAction` (lines 137-145) catches this failure, logs a **warning**, and falls back to unfiltered shard resolution. This means: - A warning-level log message fires on **every checkpoint** when runtime fields + query filter are used together - The shard-skipping optimization is lost anyway (all shards queried on failure) - No data correctness bug, but noisy logs in production ## Approach Apply the same pattern used for the [PIT `indexFilter` fix](#142450): when `runtime_mappings` are present, pass `null` as the query to `GetCheckpointAction.Request` (skipping the shard-filtering optimization cleanly rather than failing and falling back). This is the simplest fix because: - Propagating `runtime_mappings` through `GetCheckpointAction.Request` -> `SearchShardsRequest` -> `TransportSearchShardsAction` would require changes to core server classes and wire serialization — a much larger, riskier change for a minor optimization - The end result is identical (all shards are queried), just without the error/warning noise * fix build error
When a transform's source query references a runtime field (e.g., a `range` filter on a runtime-mapped field), the destination index is produced empty even though `_transform/_preview` returns correct results.

The root cause is that `ClientTransformIndexer.injectPointInTimeIfNeeded()` passes the source query as the PIT `indexFilter`, but `OpenPointInTimeRequest` does not support `runtime_mappings`. During the `can_match` phase, the query on the unknown runtime field is rewritten to `match_none`, all shards are filtered out, and the PIT opens with zero search contexts, causing every subsequent search to return empty results. Preview is unaffected because it uses a normal `SearchSourceBuilder` with both the query and `runtime_mappings` and does not use PIT.

This change skips the PIT index filter optimization when the source config has non-empty `runtime_mappings`, since the filter cannot resolve runtime fields. The PIT is still opened for snapshot consistency, just without shard pre-filtering. Additionally, two other code paths that use the source query without propagating `runtime_mappings` are fixed: the initial progress search in `TransformIndexer` and the `sourceHasChanged` check in `TimeBasedCheckpointProvider`. All three fixes follow the same principle: wherever the source query is used, `runtime_mappings` from the source config must be propagated alongside it.

Fixes #113156
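
The principle stated in the description, that `runtime_mappings` must travel with the source query, can be sketched as follows. This is an illustration under stated assumptions, not the code from this PR: `SourceSearchSketch` and `searchBody` are hypothetical names, and plain maps stand in for `SearchSourceBuilder`. The `query` and `runtime_mappings` keys are the search request body fields of the same names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; not an Elasticsearch class.
final class SourceSearchSketch {

    /**
     * Builds a search body that carries the source query together with any
     * runtime mappings from the source config.
     */
    static Map<String, Object> searchBody(Map<String, Object> sourceQuery, Map<String, Object> runtimeMappings) {
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("query", sourceQuery);
        // Attach the mappings whenever they exist so that a query on a
        // runtime field can be resolved by the search.
        if (runtimeMappings != null && !runtimeMappings.isEmpty()) {
            body.put("runtime_mappings", runtimeMappings);
        }
        return body;
    }

    public static void main(String[] args) {
        Map<String, Object> query = Map.of("range", Map.of("total", Map.of("gte", 10)));
        Map<String, Object> runtime = Map.of("total", Map.of("type", "long"));
        System.out.println(searchBody(query, runtime).keySet());
    }
}
```

Routing every use of the source query through a single helper like this is one way to keep the two pieces from drifting apart; whether the actual fix is structured that way is not stated in the description.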