
Improve no-op check in PUT _mapping API#138367

Merged
nielsbauman merged 9 commits into elastic:main from nielsbauman:optimize-put-mapping
Dec 16, 2025

Conversation

@nielsbauman
Contributor

@nielsbauman nielsbauman commented Nov 20, 2025

The existing no-op check in the PUT _mapping API is often ineffective as it only compares the mapping sources directly (with CompressedXContent.equals()). While that comparison is cheap, it doesn't account for the default values that the mapping service adds or removes when merging mappings. For example, "_data_stream_timestamp":{"enabled":true} is automatically added for data stream indices, and "ignore_malformed":false is removed if it is supplied in a mapping update request.
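To illustrate why raw source equality misses semantic no-ops, here is a minimal standalone sketch (plain Java, not Elasticsearch code; the JSON strings and the string-based "normalization" are illustrative only, since the real normalization happens inside the mapping merge):

```java
// Standalone sketch: why comparing raw mapping sources misses semantic no-ops.
public class SourceEqualityDemo {
    public static void main(String[] args) {
        // Mapping currently stored for a data stream index (defaults normalized).
        String stored = "{\"_data_stream_timestamp\":{\"enabled\":true},"
            + "\"properties\":{\"msg\":{\"type\":\"long\"}}}";
        // Incoming PUT _mapping request: semantically identical, but it spells
        // out a default (ignore_malformed:false) and omits the auto-added
        // _data_stream_timestamp section.
        String request = "{\"properties\":{\"msg\":{\"type\":\"long\",\"ignore_malformed\":false}}}";

        // A cheap source comparison (analogous to CompressedXContent.equals())
        // concludes the request is NOT a no-op...
        System.out.println("sources equal: " + stored.equals(request)); // false

        // ...whereas after normalizing (simulated here by stripping the default
        // and re-adding the auto-generated section), the mappings are identical.
        String normalized = request.replace(",\"ignore_malformed\":false", "");
        normalized = "{\"_data_stream_timestamp\":{\"enabled\":true}," + normalized.substring(1);
        System.out.println("normalized equal: " + stored.equals(normalized)); // true
    }
}
```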

With this PR, we perform the merge up front in the transport action to determine whether the request would be a no-op. By making the no-op check more accurate, we can avoid creating redundant cluster state update tasks. Since merging large mappings can be expensive, avoiding these tasks helps prevent instability in other cluster operations. We've seen cases of integrations retrying this API repeatedly when it doesn't appear to have the desired effect. While that is a bug on the integration/client side, the improved no-op check significantly reduces the impact of such mapping updates on the stability of the cluster (i.e. fewer cluster state update tasks).

Parsing and merging mappings can be expensive, so we need to perform these checks on the management thread pool instead of the transport thread.
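The hand-off pattern can be sketched generically with plain java.util.concurrent (this is not Elasticsearch's actual ThreadPool API; the names and the placeholder check are illustrative):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Generic sketch of offloading expensive request processing to a dedicated
// pool, analogous to moving the mapping parse/merge off the transport thread
// onto the MANAGEMENT thread pool.
public class OffloadDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService management = Executors.newFixedThreadPool(2);
        CountDownLatch done = new CountDownLatch(1);

        // The "transport thread" only schedules the work and returns.
        management.submit(() -> {
            // Placeholder for the expensive parse + merge + no-op check.
            boolean isNoop = expensiveNoopCheck();
            System.out.println("no-op: " + isNoop);
            done.countDown();
        });

        done.await();
        management.shutdown();
    }

    static boolean expensiveNoopCheck() {
        return true; // placeholder for the real merge-based comparison
    }
}
```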

The existing code merges the mappings twice, where the first merge is purely for mapping validation. This doesn't seem to add any value (perhaps it did at some point in the past). By removing this "dry run" merge, we can save quite a bit of processing on the precious cluster state update thread.

@nielsbauman nielsbauman added >enhancement :Search Foundations/Mapping Index mappings, including merging and defining field types :Data Management/Indices APIs DO NOT USE. Use ":Distributed/Indices APIs" or ":StorageEngine/Templates" instead. labels Nov 20, 2025
@elasticsearchmachine
Collaborator

Hi @nielsbauman, I've created a changelog YAML for you.

@nielsbauman
Contributor Author

I did some local benchmarking on a single-node ES cluster (with assertions disabled) where I created 1000 single-index data streams, all with the same (considerably sized) mappings (sourced here). I then ran two different experiments on both main and this branch.

1) No-op updates

I called PUT my-ds-*/_mapping repeatedly with the exact same mappings that I used in the data stream's index template.

1) No-op updates - main

This call takes roughly 12 seconds on average and makes my ES node log:

[2025-11-18T16:15:17,388][WARN ][o.e.c.s.MasterService ] [runTask-0] took [23.3s/23355ms] to compute cluster state update for [put-mapping

The flamegraph (collected over 60 seconds):
flamegraph-mappings-main-noop-pr.html
The highlighted flames are MapperService#parseMapping calls. Note that all this work is being done on the cluster state update thread. You might be curious why we parse mappings three times. The reason is that we're parsing the new mappings twice, as I explained in this PR description, and that we're parsing the existing mappings of every index when we create the temporary MapperService instances. I think I have a way to avoid that last part, but I'll open a separate PR for that.

1) No-op updates - this PR

This call takes roughly 8 seconds on average and does not result in slow cluster state update logs.
The flamegraph (collected over 60 seconds):
flamegraph-mappings-pr-noop-pr.html
We now only parse twice, and we do everything in the transport action, on the management thread pool.

2) Actual updates

I called PUT my-ds-*/_mapping repeatedly with slightly different mappings (I incremented ignore_above of one field on every call) to analyze how real (not no-op) updates would perform.

2) Actual updates - main

This call takes roughly 14 seconds on average and also results in slow cluster state update logs.
The flamegraph (collected over 60 seconds):
flamegraph-mappings-main-actual-changes-pr.html
The extra highlighted flame on the right is from the IndicesClusterStateService that is processing the actual mapping updates. We can ignore this path in this PR.

2) Actual updates - this PR

This call takes roughly 11 seconds on average and also results in slow cluster state update logs.
The flamegraph (collected over 60 seconds):
flamegraph-mappings-pr-actual-changes-pr.html
We can again ignore the rightmost highlighted flame from the IndicesClusterStateService. For some reason, the parsing and merging in the transport action barely show up in this flamegraph. I moved my cursor to that flame/path in the bottom left. I'm not really sure why only the cluster state update path shows up in the graph. Perhaps the fact that the cluster state update thread pool is single-threaded while the management pool is multi-threaded affects how likely a sample is to appear in the graph?

Anyway, that is pretty much the downside of this PR: we still need to parse and merge all the mappings twice (once in the transport action and once in the cluster state update task). I think it's worth investigating whether we can "cache" the result of the first merge somehow and reuse it in the cluster state update task to avoid redoing the merge. We'd have to come up with a way to verify that all "dependencies" (e.g. existing index mappings, index settings, etc.) have not changed, so that we can actually reuse the merge from the transport action. In my opinion, this PR is valuable on its own, so I decided to leave that as future work.

@nielsbauman nielsbauman marked this pull request as ready for review November 20, 2025 14:59
@elasticsearchmachine elasticsearchmachine added Team:Data Management (obsolete) DO NOT USE. This team no longer exists. Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch labels Nov 20, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@lukewhiting lukewhiting requested a review from Copilot November 20, 2025 15:09
Contributor

Copilot AI left a comment


Pull Request Overview

This PR enhances the no-op detection in the PUT _mapping API by performing a more accurate comparison that accounts for default values added or removed during mapping merges. The improved check prevents unnecessary cluster state update tasks when mapping updates would not result in actual changes.

Key changes:

  • Implement a pre-merge no-op check in MetadataMappingService.isWholeRequestNoop() that performs actual mapping merges to detect effective no-ops
  • Move mapping request processing to the MANAGEMENT thread pool to handle expensive parsing/merging operations
  • Remove redundant "dry run" mapping merge that was previously used solely for validation

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File descriptions:

  • MetadataMappingService.java: Adds the isWholeRequestNoop() method with a merge-based comparison; refactors applyRequest() to remove the duplicate merge and improve efficiency
  • TransportPutMappingAction.java: Changes the executor from DIRECT_EXECUTOR_SERVICE to the MANAGEMENT thread pool
  • TransportAutoPutMappingAction.java: Changes the executor from DIRECT_EXECUTOR_SERVICE to the MANAGEMENT thread pool
  • docs/changelog/138367.yaml: Adds a changelog entry for the enhancement


@DaveCTurner
Contributor

We'd have to come up with a way to ensure that all "dependencies" have not changed

There's a mappings version and a settings version which you can use to tell whether the mappings or settings have changed. If there's more to it, you could also reasonably keep hold of the whole IndexMetadata and check it for instance equality in the cluster state update.

does not result in slow cluster state update logs

The logging is pretty coarse, but the relevant stats on cluster state update performance are at GET _nodes/_master/stats/discovery?filter_path=nodes.*.discovery.cluster_state_update.success&pretty. Does that help?
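The version-based cache validity check suggested above could be sketched as follows (a hypothetical standalone sketch; the record and method names are illustrative, not actual Elasticsearch code, although IndexMetadata does track mapping and settings versions):

```java
// Hypothetical sketch of validating a cached merge result against the mapping
// and settings versions it was computed from.
public class CachedMergeDemo {
    // A cached merge result, stamped with the versions it was computed against.
    record CachedMerge(long mappingVersion, long settingsVersion, String mergedSource) {}

    // The cached result is reusable only if neither version has moved on.
    static boolean isReusable(CachedMerge cached, long mappingVersion, long settingsVersion) {
        return cached.mappingVersion() == mappingVersion
            && cached.settingsVersion() == settingsVersion;
    }

    public static void main(String[] args) {
        CachedMerge cached = new CachedMerge(7, 3, "{...merged mapping...}");
        System.out.println(isReusable(cached, 7, 3)); // same versions: reuse
        System.out.println(isReusable(cached, 8, 3)); // mapping changed: recompute
    }
}
```

Instance-equality on the whole IndexMetadata, as suggested above, would be an even more conservative variant of the same check.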

@nielsbauman
Contributor Author

Thanks, @DaveCTurner, that's useful. My main worry is that I'm not familiar enough with mappings to know exactly what could influence them. Are there other things in the cluster state that could affect mappings, such as cluster settings? Obviously there are things like plugins that could change, but those won't change between the transport action (on the master node) and the cluster state update task. I'll ask the Search Foundations team whether they have an answer to those questions.

@DaveCTurner
Contributor

Ugh, I thought it'd be simple, but of course it isn't. Looking at the code, it needs a MapperService, which depends on most of IndicesService, including ClusterService, so it'll be pretty hard to answer without going through the code in great detail. Future-proofing will also be a concern (for that we could at least add some assertions to verify that the cached mapping matches a freshly-generated mapping when assertions are enabled). Hopefully the Search Foundations folks can give a more confident answer.

FWIW I'm considering alternative ideas around mapping updates, including imposing some kind of throttle on the tasks submitted to the master to avoid cases where thousands of updates take tens of seconds to process and hold everything else up.

Contributor

@szybia szybia left a comment


I don't have enough knowledge in this area to approve, but I can't spot anything wrong at the code level.

Member

@dakrone dakrone left a comment


This generally looks good to me and I don't have any concerns, but I'd still feel better if someone from @elastic/es-search-management took a look, since they own the mappings merging.

noop = false;
break;
// If the mapping sources are already equal, then we already know this index would be a no-op and can skip further checks.
if (request.source().equals(mappingMetadata.source())) {
Contributor


Is this continue here for readability?

Otherwise this could be merged with the next part:

if (request.source().equals(mappingMetadata.source()) == false) {
    try (MapperService mapperService = indicesService.createIndexMapperServiceForValidation(indexMetadata)) {
    }
}
Contributor Author


Yep, I usually prefer using continue rather than creating deeper and deeper nesting - for readability indeed. If you have a strong preference for your suggestion, I can incorporate it.

Contributor


Fine with me either way, no strong preferences :-) My way of reading this was "if... break, if... break, if... break, if... continue", which stopped my flow, but your point is valid as well!

// We could consider caching the result on the request object to avoid doing the same work twice. This would require some
// checks to ensure the cached result is only used if the circumstances are the same (e.g., no changes to the index settings).
try (MapperService mapperService = indicesService.createIndexMapperServiceForValidation(indexMetadata)) {
mapperService.merge(indexMetadata, MergeReason.MAPPING_RECOVERY);
Contributor


Is the merge reason here on purpose? I saw a code snippet in MapperService.newDocumentMapper() that uses this merge reason to enable/disable limit checks, thus the question.

Contributor Author


That matches what we do here:

MapperService mapperService = indicesService.createIndexMapperServiceForValidation(indexMetadata);
indexMapperServices.put(index, mapperService);
// add mappings for all types, we need them for cross-type validation
mapperService.merge(indexMetadata, MergeReason.MAPPING_RECOVERY);

We seem to do that for most (if not all) usages of createIndexMapperServiceForValidation. I'm afraid I know nothing of MergeReasons, so I simply copied that approach from existing sources. Let me know if there's a reason we shouldn't use it!

Contributor

@spinscale spinscale left a comment


I tried to come up with a test case in MetadataMappingServiceTests for the putMapping method, but I am not sure we can check there that the isWholeRequestNoop() immediate acknowledgement does not trigger a cluster state update. The idea would have been to have a keyword field ("field": { "type": "keyword" }) in an index and then, for example, use the default value of ignore_above as an update ("field": { "type": "keyword", "ignore_above": "2147483647" }). Or should we test the isWholeRequestNoop method within MetadataMappingServiceTests? Do you think that makes sense?

@nielsbauman
Contributor Author

We could always create a fresh MetadataMappingService instance with a mocked ClusterService and a mocked/custom task queue. MetadataMappingService doesn't depend on many things, so it's straightforward to create one. Then we verify that nothing is added to the queue.

But testing the isWholeRequestNoop method directly is definitely also an option. I'll play around with both and see which one seems best.

@nielsbauman
Contributor Author

@spinscale what do you think about aa6a984?

/**
* Test that putting a mapping that is semantically identical but syntactically different results in a no-op.
*/
public void test() throws IOException {
Contributor


nit: naming.

Does it also make sense to add a test that verifies an interaction? Or is that covered by other tests anyway?

Contributor Author


🤦 missed that one, good catch!

I indeed figured that the non-noop flow is sufficiently covered by other integration tests already (i.e. they verify that non-noop updates actually modify the cluster state).

@nielsbauman nielsbauman enabled auto-merge (squash) December 11, 2025 13:47
@nielsbauman nielsbauman disabled auto-merge December 11, 2025 14:53
@nielsbauman nielsbauman enabled auto-merge (squash) December 16, 2025 10:10
@nielsbauman nielsbauman merged commit 59ef0c5 into elastic:main Dec 16, 2025
35 checks passed
@nielsbauman nielsbauman deleted the optimize-put-mapping branch December 16, 2025 11:19

Labels

  • :Data Management/Indices APIs (DO NOT USE. Use ":Distributed/Indices APIs" or ":StorageEngine/Templates" instead.)
  • >enhancement
  • :Search Foundations/Mapping (Index mappings, including merging and defining field types)
  • Team:Data Management (obsolete; DO NOT USE. This team no longer exists.)
  • Team:Search Foundations (Meta label for the Search Foundations team in Elasticsearch)
  • v9.3.0

8 participants