Fix pipeline resolution cache for bulk requests #144648

Merged
chrisberkhout merged 3 commits into elastic:main from chrisberkhout:fix-bulk-upsert-use-of-wrong-pipeline
Apr 17, 2026

Conversation

@chrisberkhout
Contributor

Corrects the key for resolvedPipelineCache to avoid getting the wrong pipeline, which was happening for bulk upserts when two actions have null index values in their index requests but different indexes in their containing action requests.
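
To make the failure mode concrete, here is a minimal Java sketch. It is not the actual Elasticsearch code: the class `PipelineCacheSketch`, the methods `lookupPipeline`, `resolveBuggy`, and `resolveFixed`, and the `<index>-pipeline` naming are all hypothetical. It only assumes what the description states, namely a per-batch cache whose key was derived from the inner index request's index (null for these upserts) rather than from the index of the containing action.

```java
import java.util.HashMap;
import java.util.Map;

final class PipelineCacheSketch {

    // Hypothetical stand-in for the real resolution step: index name -> pipeline name.
    static String lookupPipeline(String index) {
        return index + "-pipeline";
    }

    // Buggy variant (assumed shape): the cache key is the inner index request's
    // index, which is null for these upserts, so all such actions share one entry.
    static String resolveBuggy(Map<String, String> cache, String actionIndex, String innerIndex) {
        if (cache.containsKey(innerIndex) == false) {
            String effective = innerIndex != null ? innerIndex : actionIndex;
            cache.put(innerIndex, lookupPipeline(effective));
        }
        return cache.get(innerIndex);
    }

    // Fixed variant (assumed shape): the cache key is the index that is
    // actually used for resolution, so different indexes never collide.
    static String resolveFixed(Map<String, String> cache, String actionIndex, String innerIndex) {
        String effective = innerIndex != null ? innerIndex : actionIndex;
        if (cache.containsKey(effective) == false) {
            cache.put(effective, lookupPipeline(effective));
        }
        return cache.get(effective);
    }

    public static void main(String[] args) {
        Map<String, String> buggy = new HashMap<>();
        System.out.println(resolveBuggy(buggy, "logs-a", null)); // logs-a-pipeline
        System.out.println(resolveBuggy(buggy, "logs-b", null)); // logs-a-pipeline (wrong entry reused)

        Map<String, String> fixed = new HashMap<>();
        System.out.println(resolveFixed(fixed, "logs-a", null)); // logs-a-pipeline
        System.out.println(resolveFixed(fixed, "logs-b", null)); // logs-b-pipeline
    }
}
```

In the buggy variant the second upsert silently inherits the pipeline resolved for the first one, because both actions map to the same null key.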

The change in time handling is mostly for style. Without it, the first time used to compute the cached pipeline resolution for a given date math index name expression would effectively apply for later cases of that expression in the batch. Fixing the timestamp makes that behavior explicit.
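
The timestamp point can be sketched the same way. This is a simplified illustration under stated assumptions, not the real implementation: `BatchTimestampSketch`, `resolveDateMath`, `resolveImplicit`, and `resolveExplicit` are hypothetical names, the resolver only understands a bare `{now/d}` token, and the `uuuu.MM.dd` output format is chosen for the example. It shows that a cache keyed on the expression already pins the first clock reading, so passing one fixed batch timestamp gives the same results while stating the behavior openly.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

final class BatchTimestampSketch {

    private static final DateTimeFormatter DAY =
        DateTimeFormatter.ofPattern("uuuu.MM.dd").withZone(ZoneOffset.UTC);

    // Hypothetical, heavily simplified date math resolution: only "{now/d}" is handled.
    static String resolveDateMath(String expression, long epochMillis) {
        return expression.replace("{now/d}", DAY.format(Instant.ofEpochMilli(epochMillis)));
    }

    // Implicit variant: the clock is read on a cache miss, and the cache then
    // pins that first reading for every later use of the same expression.
    static String resolveImplicit(Map<String, String> cache, String expression, LongSupplier clock) {
        String cached = cache.get(expression);
        if (cached == null) {
            cached = resolveDateMath(expression, clock.getAsLong());
            cache.put(expression, cached);
        }
        return cached;
    }

    // Explicit variant: one timestamp is captured for the whole batch and passed in.
    static String resolveExplicit(Map<String, String> cache, String expression, long batchStartMillis) {
        String cached = cache.get(expression);
        if (cached == null) {
            cached = resolveDateMath(expression, batchStartMillis);
            cache.put(expression, cached);
        }
        return cached;
    }

    public static void main(String[] args) {
        // A fake clock that jumps forward one day on every read.
        long[] now = { 0L };
        LongSupplier clock = () -> {
            long t = now[0];
            now[0] += 86_400_000L;
            return t;
        };

        Map<String, String> implicit = new HashMap<>();
        System.out.println(resolveImplicit(implicit, "logs-{now/d}", clock)); // logs-1970.01.01
        System.out.println(resolveImplicit(implicit, "logs-{now/d}", clock)); // logs-1970.01.01

        Map<String, String> explicit = new HashMap<>();
        System.out.println(resolveExplicit(explicit, "logs-{now/d}", 0L)); // logs-1970.01.01
        System.out.println(resolveExplicit(explicit, "logs-{now/d}", 0L)); // logs-1970.01.01
    }
}
```

Both variants return the same names for a repeated expression; the explicit one simply removes the dependence on when the first cache miss happened to occur.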

The bug was introduced with the new caching logic in #116031.

@chrisberkhout chrisberkhout self-assigned this Mar 20, 2026
@chrisberkhout chrisberkhout added the >bug, :StorageEngine/Data streams (Data streams and their lifecycles), and auto-backport (Automatically create backport pull requests when merged) labels Mar 20, 2026
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@elasticsearchmachine elasticsearchmachine added the Team:StorageEngine, v9.4.0, and external-contributor (Pull request authored by a developer outside the Elasticsearch team) labels Mar 20, 2026
@elasticsearchmachine
Collaborator

Hi @chrisberkhout, I've created a changelog YAML for you.

@chrisberkhout chrisberkhout force-pushed the fix-bulk-upsert-use-of-wrong-pipeline branch from b04a666 to 97f51dc Compare March 20, 2026 13:45
@dakrone dakrone added the :Distributed/Ingest Node label (Execution or management of Ingest Pipelines) and removed the :StorageEngine/Data streams label (Data streams and their lifecycles) Mar 20, 2026
@elasticsearchmachine elasticsearchmachine added the Team:Distributed label (Meta label for distributed team) and removed the Team:StorageEngine label Mar 20, 2026
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@adrianchen-es

@chrisberkhout Thank you very much for this.
When this is merged, could we please have it backported? The bug has been present since 8.15.4, and not every customer will be on 9.x, given the breaking changes involved in moving from 8.18/8.19 to 9.x.

Thank you.

@chrisberkhout chrisberkhout force-pushed the fix-bulk-upsert-use-of-wrong-pipeline branch from 97f51dc to 3fc2d24 Compare April 10, 2026 13:37
@chrisberkhout chrisberkhout requested review from a team and removed request for parkertimmins April 10, 2026 13:37
@chrisberkhout chrisberkhout force-pushed the fix-bulk-upsert-use-of-wrong-pipeline branch from 3fc2d24 to 0e99dd5 Compare April 17, 2026 10:18
@github-actions
Contributor

github-actions Bot commented Apr 17, 2026

🔍 Preview links for changed docs

⏳ Building and deploying preview... View progress

This comment will be updated with preview links when the build is complete.

@github-actions
Contributor

ℹ️ Important: Docs version tagging

👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version.

We use applies_to tags to mark version-specific features and changes.

When to use applies_to tags:

✅ At the page level to indicate which products/deployments the content applies to (mandatory)
✅ When features change state (e.g. preview, ga) in a specific version
✅ When availability differs across deployments and environments

What NOT to do:

❌ Don't remove or replace information that applies to an older version
❌ Don't add new information that applies to a specific version without an applies_to tag
❌ Don't forget that applies_to tags can be used at the page, section, and inline level

🤔 Need help?

Contributor

@szybia szybia left a comment

can't fault it

Comment thread docs/changelog/144648.yaml Outdated
@chrisberkhout chrisberkhout force-pushed the fix-bulk-upsert-use-of-wrong-pipeline branch from 6a7516a to 90ea0a1 Compare April 17, 2026 13:07
@chrisberkhout chrisberkhout merged commit 8160933 into elastic:main Apr 17, 2026
35 checks passed
chrisberkhout added a commit to chrisberkhout/elasticsearch that referenced this pull request Apr 17, 2026
Corrects the key for resolvedPipelineCache to avoid getting the wrong
pipeline, which was happening for bulk upserts when two actions have
null index values in their index requests but different indexes in their
containing action requests.

The change in time handling is mostly for style. Without it, the first
time used to compute the cached pipeline resolution for a given date
math index name expression would effectively apply for later cases of
that expression in the batch. Fixing the timestamp makes that behavior
explicit.

The bug was introduced with the new caching logic in elastic#116031.
@elasticsearchmachine
Collaborator

💔 Backport failed

Status Branch Result
9.2
8.19 Commit could not be cherrypicked due to conflicts
9.3
9.4

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 144648

@chrisberkhout
Contributor Author

💚 All backports created successfully

Status Branch Result
8.19

Questions ?

Please refer to the Backport tool documentation

elasticsearchmachine pushed a commit that referenced this pull request Apr 17, 2026
Corrects the key for resolvedPipelineCache to avoid getting the wrong
pipeline, which was happening for bulk upserts when two actions have
null index values in their index requests but different indexes in their
containing action requests.

The change in time handling is mostly for style. Without it, the first
time used to compute the cached pipeline resolution for a given date
math index name expression would effectively apply for later cases of
that expression in the batch. Fixing the timestamp makes that behavior
explicit.

The bug was introduced with the new caching logic in #116031.
szybia added a commit to szybia/elasticsearch that referenced this pull request Apr 17, 2026
* upstream/main: (38 commits)
  Determine inference timeout based on task type (elastic#146089)
  Adds NQT functions and linearCombination to ESVectorUtil (elastic#146435)
  ESQL: Add PagedBytesBuilder (elastic#146094)
  ESQL: Remove  match_only_text from "store" setting candidates for generative random mappings tests (elastic#146633)
  YAML test: support all_nodes for cluster_features (elastic#146505)
  [ESQL] Filtered aggregate pushdown for external sources (elastic#146597)
  SecurityMigrationExecutor & SystemIndexMigrationExecutor opt-out of automaticReassignmentOnShutdown (elastic#145753)
  [ES|QL] Lookup join and Inline stats support for query approximation (elastic#145980)
  Fix NPE in GPU resource pool when CuVSResources creation fails (elastic#146632)
  Fix pipeline resolution cache for bulk requests (elastic#144648)
  Mute org.elasticsearch.index.reindex.ReindexRelocationOnShutdownIT testReindexFailsWhenPitRelocationFails elastic#146650
  Unmute 18 tests after fixing SearchHits leak in MergeResultWireCompatibilityTopHitsTests (elastic#146574)
  Allow the zstd tests to be run independently (elastic#146637)
  [Docs] Add cross-link to Query activity UI from ES|QL task management page (elastic#145397)
  ESQL: Clean up unmapped_fields validation tests (elastic#146629)
  Implement value fetching and block loading for semantic field (elastic#146552)
  ESQL: Re-mute testApproximationDisabled (elastic#146638)
  WriteLoadConstraintDecider: Include the max write-load proportion in the explain string (elastic#146452)
  Reindexing stateful relocation integration test (elastic#145976)
  Clarify inactive shard grace period naming (elastic#146522)
  ...
chrisberkhout added a commit that referenced this pull request Apr 21, 2026
…6661)

* Fix pipeline resolution cache for bulk requests (#144648)

Corrects the key for resolvedPipelineCache to avoid getting the wrong
pipeline, which was happening for bulk upserts when two actions have
null index values in their index requests but different indexes in their
containing action requests.

The change in time handling is mostly for style. Without it, the first
time used to compute the cached pipeline resolution for a given date
math index name expression would effectively apply for later cases of
that expression in the batch. Fixing the timestamp makes that behavior
explicit.

The bug was introduced with the new caching logic in #116031.

(cherry picked from commit 8160933)

Labels

auto-backport (Automatically create backport pull requests when merged)
backport pending
>bug
:Distributed/Ingest Node (Execution or management of Ingest Pipelines)
external-contributor (Pull request authored by a developer outside the Elasticsearch team)
Team:Distributed (Meta label for distributed team)
v8.19.15
v9.2.9
v9.3.4
v9.4.1
v9.5.0

6 participants