Skip to content

Use existing DocumentMapper when creating new MapperService#138489

Merged
nielsbauman merged 7 commits intoelastic:mainfrom
nielsbauman:mapper-service-copy
Dec 10, 2025
Merged

Use existing DocumentMapper when creating new MapperService#138489
nielsbauman merged 7 commits intoelastic:mainfrom
nielsbauman:mapper-service-copy

Conversation

@nielsbauman
Copy link
Contributor

When we create a new MapperService from an existing index for validation purposes, we often merge the mappings of the existing index into the temporary mapper service right after. This requires parsing the mappings and "merging" them (there's not much merging to do with an empty service, but we still need to traverse the mappings object). This can take up considerable time for larger mappings.

By passing the existing DocumentMapper instance of the index (if available) to the new MapperService, the aforementioned mapping merge becomes a no-op and saves all that redundant processing.

When we create a new `MapperService` from an existing index for
validation purposes, we often merge the mappings of the existing index
into the temporary mapper service right after. This requires parsing the
mappings and "merging" them (there's not much merging to do with an
empty service, but we still need to traverse the mappings object). This
can take up considerable time for larger mappings.

By passing the existing `DocumentMapper` instance of the index (if
available) to the new `MapperService`, the aforementioned mapping merge
becomes a no-op and saves all that redundant processing.
@nielsbauman nielsbauman added >enhancement :Search Foundations/Mapping Index mappings, including merging and defining field types labels Nov 24, 2025
@elasticsearchmachine
Copy link
Collaborator

Hi @nielsbauman, I've created a changelog YAML for you.

@nielsbauman
Copy link
Contributor Author

These changes are related to #138367, but can be merged/viewed separately. I performed similar benchmarks for this PR as well. See #138367 (comment) for a brief explanation of what experiments I performed.

Here is the same flamegraph on main (the first flamegraph in the comment I linked):
image
flamegraph-mappings-main-noop-pr.html

And here is the flamegraph on this PR:
image
flamegraph-mappings-only-mapper-copy-noop.html

The rightmost merge/parse path completely disappeared, saving a lot of processing.

@nielsbauman nielsbauman marked this pull request as ready for review November 24, 2025 11:15
@elasticsearchmachine elasticsearchmachine added the Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch label Nov 24, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

.map(IndexService::mapperService)
.map(MapperService::documentMapper)
.orElse(null);
return indexModule.newIndexMapperService(clusterService, parserConfig, mapperRegistry, scriptService, documentMapper);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking how this could be tested, I played around a little but my IDE seems not to like me too much today. Would a test like this (in IndicesServiceTests make sense to you?

    public void testCheckIndicesServiceMapper() throws IOException {
        IndicesService indicesService = getIndicesService();

        IndexMetadata indexMetadata = new IndexMetadata.Builder("test").settings(
            indexSettings(1, 1).put("index.version.created", IndexVersions.V_8_10_0)
                .put(SETTING_INDEX_UUID, randomUUID())
        ).build();
        IndexService indexService = indicesService.createIndex(indexMetadata, List.of(), true);

        IndexMetadata newIndexMetadata = IndexMetadata.builder(indexMetadata)
            .numberOfReplicas(2)
            .mappingVersion(indexMetadata.getMappingVersion()+1)
            .putMapping("""
            {
              "_doc":{
                "properties": {
                  "@timestamp": {
                    "type": "date"
                  }
                }
              }
            }""")
            .build();
        indexService.updateMapping(indexMetadata, newIndexMetadata);

        DocumentMapper documentMapper = indicesService.createIndexMapperServiceForValidation(indexMetadata).documentMapper();
        DocumentMapper cachedDocumentMapper = indicesService.createIndexMapperServiceForValidation(indexMetadata).documentMapper();

        assertNotNull(documentMapper);
        assertSame(documentMapper, cachedDocumentMapper);
    }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion! I pushed b91c6a6 with some minor changes to your suggestion. Let me know if that works for you.

Copy link
Contributor

@spinscale spinscale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left one minor nit

@nielsbauman nielsbauman enabled auto-merge (squash) December 10, 2025 10:11
@nielsbauman
Copy link
Contributor Author

Thanks a lot for the review, @spinscale! This will help a great deal in reducing the performance impact of mapping updates.

@nielsbauman nielsbauman disabled auto-merge December 10, 2025 14:36
@nielsbauman nielsbauman merged commit 160b021 into elastic:main Dec 10, 2025
34 of 35 checks passed
@nielsbauman nielsbauman deleted the mapper-service-copy branch December 10, 2025 14:37
nielsbauman added a commit to nielsbauman/elasticsearch that referenced this pull request Dec 15, 2025
We observed a race condition - introduced in elastic#138489 - for usages of
`IndicesService#createIndexMapperServiceForValidation` between
`ClusterService#state()` and the `DocumentMapper` in an index'
`MapperService`, as both are updated non-atomically. This race condition
manifested in the `PUT _mapping` API in the form of mapping conflicts
because we merge the mapping of the `IndexMetadata` (retrieved from the
`ClusterService`) into the temporary `MapperService` for validation
(which uses the mappings from the existing `MapperService`).

Since reusing the `DocumentMapper` is merely an optimization, we can
simply skip it in these rare situations where the race condition
manifests.
nielsbauman added a commit that referenced this pull request Dec 16, 2025
We observed a race condition - introduced in #138489 - for usages of
`IndicesService#createIndexMapperServiceForValidation` between
`ClusterService#state()` and the `DocumentMapper` in an index'
`MapperService`, as both are updated non-atomically. This race condition
manifested in the `PUT _mapping` API in the form of mapping conflicts
because we merge the mapping of the `IndexMetadata` (retrieved from the
`ClusterService`) into the temporary `MapperService` for validation
(which uses the mappings from the existing `MapperService`).

Since reusing the `DocumentMapper` is merely an optimization, we can
simply skip it in these rare situations where the race condition
manifests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>enhancement :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch v9.3.0

3 participants