Prevent auto-sharding for data streams in LOOKUP index mode by lukewhiting · Pull Request #131429 · elastic/elasticsearch

lukewhiting · 2025-07-17T10:41:40Z

This PR disables data stream auto-sharding for LOOKUP indices to prevent them from scaling above 1 replica, which is unsupported by the lookup mappers.

Fixes ES-12330

elasticsearchmachine · 2025-07-17T10:42:10Z

Hi @lukewhiting, I've created a changelog YAML for you.

Copilot

Pull Request Overview

This PR prevents auto-sharding functionality for data streams that use LOOKUP index mode, as lookup mappers don't support scaling beyond 1 replica. The implementation adds an early return in the auto-sharding calculation logic when the index mode is LOOKUP.

Adds a check in the auto-sharding service to return NOT_APPLICABLE_RESULT for LOOKUP index mode data streams
Includes comprehensive test coverage for both scenarios with and without index statistics
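
A minimal sketch of the early-return guard described above. The types here (`IndexMode`, `Result`, `DataStream`), the `ASSUMED_LOAD_PER_SHARD` heuristic, and the `calculate` signature are simplified stand-ins invented for illustration; they are not the real Elasticsearch classes, and the real `DataStreamAutoShardingService` takes more inputs and has more result types.

```java
/** Illustrative stand-in only; not the actual Elasticsearch implementation. */
final class LookupAutoShardingSketch {

    // Hypothetical subset of index modes, for illustration.
    enum IndexMode { STANDARD, TIME_SERIES, LOGSDB, LOOKUP }

    // Hypothetical subset of result types.
    enum ResultType { NOT_APPLICABLE, NO_CHANGE_REQUIRED, INCREASE_SHARDS }

    record Result(ResultType type, int currentShards, int targetShards) {}

    record DataStream(String name, IndexMode indexMode, int currentShards) {}

    static final Result NOT_APPLICABLE_RESULT = new Result(ResultType.NOT_APPLICABLE, 1, 1);

    // Assumption: one shard per 2.0 units of write load. Purely illustrative.
    static final double ASSUMED_LOAD_PER_SHARD = 2.0;

    static Result calculate(DataStream dataStream, Double writeLoad, boolean autoShardingEnabled) {
        if (autoShardingEnabled == false) {
            return NOT_APPLICABLE_RESULT;
        }
        // The guard this PR describes: LOOKUP data streams never get a recommendation.
        if (dataStream.indexMode() == IndexMode.LOOKUP) {
            return NOT_APPLICABLE_RESULT;
        }
        int current = dataStream.currentShards();
        if (writeLoad == null) {
            // No statistics available, so nothing to recommend in this sketch.
            return new Result(ResultType.NO_CHANGE_REQUIRED, current, current);
        }
        int target = Math.max(1, (int) Math.ceil(writeLoad / ASSUMED_LOAD_PER_SHARD));
        if (target > current) {
            return new Result(ResultType.INCREASE_SHARDS, current, target);
        }
        return new Result(ResultType.NO_CHANGE_REQUIRED, current, current);
    }

    public static void main(String[] args) {
        DataStream lookup = new DataStream("lookup-ds", IndexMode.LOOKUP, 1);
        DataStream standard = new DataStream("logs-ds", IndexMode.STANDARD, 1);
        System.out.println(calculate(lookup, 8.0, true).type());   // NOT_APPLICABLE
        System.out.println(calculate(standard, 8.0, true).type()); // INCREASE_SHARDS
    }
}
```

Placing the guard before any statistics are read means the LOOKUP case returns the same result whether or not write-load data exists.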

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
DataStreamAutoShardingService.java	Adds LOOKUP index mode check to prevent auto-sharding calculation
DataStreamAutoShardingServiceTests.java	Adds test cases to verify auto-sharding is disabled for LOOKUP index mode
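
One way to cover both the with-stats and without-stats cases without duplicating test bodies is to loop over the stats variants in a single check. The sketch below assumes a hypothetical, heavily simplified `calculate` stand-in and made-up names (`LookupGuardChecks`, `Outcome`); it does not reproduce the tests in `DataStreamAutoShardingServiceTests.java`.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative stand-in only; not the actual Elasticsearch test code. */
final class LookupGuardChecks {

    enum IndexMode { STANDARD, LOOKUP }

    enum Outcome { NOT_APPLICABLE, EVALUATED }

    // Hypothetical stand-in for the service call: LOOKUP short-circuits
    // before the (possibly missing) write-load statistics are consulted.
    static Outcome calculate(IndexMode mode, Double writeLoadStats) {
        if (mode == IndexMode.LOOKUP) {
            return Outcome.NOT_APPLICABLE;
        }
        return Outcome.EVALUATED;
    }

    // Runs the same assertion for "stats present" and "stats absent".
    static boolean lookupIsNeverEvaluated() {
        List<Double> statsVariants = Arrays.asList(4.5, null);
        for (Double stats : statsVariants) {
            if (calculate(IndexMode.LOOKUP, stats) != Outcome.NOT_APPLICABLE) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(lookupIsNeverEvaluated());
    }
}
```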

Comments suppressed due to low confidence (2)

server/src/test/java/org/elasticsearch/action/datastreams/autosharding/DataStreamAutoShardingServiceTests.java:1347

[nitpick] The test method name could be more descriptive. Consider renaming to 'testCalculateReturnsNotApplicableForLookupIndexModeWithStats' to better distinguish it from the null stats test case.

    public void testCalculateReturnsNotApplicableForLookupIndexMode() {

...va/org/elasticsearch/action/datastreams/autosharding/DataStreamAutoShardingServiceTests.java

elasticsearchmachine · 2025-07-25T09:38:18Z

Pinging @elastic/es-data-management (Team:Data Management)

szybia · 2025-07-28T16:38:05Z

...in/java/org/elasticsearch/action/datastreams/autosharding/DataStreamAutoShardingService.java

            return NOT_APPLICABLE_RESULT;
        }

+        if (dataStream.getIndexMode() == IndexMode.LOOKUP) {


change lgtm!

if you wouldn't mind me asking a few (hopefully not dumb) Qs just to learn, just if you know them off the top of your head, i can investigate anything you might not know

terminology: in the code and the slack discussion regarding the ticket, it seems we use the term shard to only mean write shard, but we also have replica/read shards right? just wondering whether, whenever i see the term shard, i should basically assume this means write shard/primary

so LOOKUP is a type of index that only ever has one primary shard, but can it have more than 1 replica/read shard? assuming it can, do we have any replica auto-sharding in place in data streams if the read load gets too heavy?

why does LOOKUP only ever allow one primary shard? it's always possible it has heavy writes (hence the error for scaleup i'm assuming), is it basically just a mode where you're assuming up-front write volume is low and it's -- as in the name -- just a lookup

for this auto-sharding result logic, if we scale up, do we roll-over to a new index with more primaries or split? and if rollover the new index has more primaries right

for the ticket, did the lookup index just completely fail to rollover and kept accumulating data, or it just threw the validation exceptions as this kept being called to scale-up but eventually was rolled over due to size/time?

PeteGillinElastic

@szybia We can chat about this in our 1:1 tomorrow if you like.

szybia

yes please! might take the load off luke...

dakrone

LGTM also

elasticsearchmachine · 2025-07-29T12:26:07Z

💔 Backport failed

Status	Branch	Result
❌	9.0	Commit could not be cherrypicked due to conflicts
✅	9.1
❌	8.19	Commit could not be cherrypicked due to conflicts
❌	8.18	Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 131429

…131429)
* Prevent auto-sharding for data streams in LOOKUP index mode
* Update docs/changelog/131429.yaml
* Reduce test duplication

…131429)
* Prevent auto-sharding for data streams in LOOKUP index mode
* Update docs/changelog/131429.yaml
* Reduce test duplication
(cherry picked from commit ea22dff)
# Conflicts:
# server/src/main/java/org/elasticsearch/action/datastreams/autosharding/DataStreamAutoShardingService.java
# server/src/test/java/org/elasticsearch/action/datastreams/autosharding/DataStreamAutoShardingServiceTests.java

lukewhiting · 2025-07-29T13:05:48Z

💚 All backports created successfully

Status	Branch	Result
✅	9.0
✅	8.19
✅	8.18

Questions ?

Please refer to the Backport tool documentation

…#132073)
* Prevent auto-sharding for data streams in LOOKUP index mode
* Update docs/changelog/131429.yaml
* Reduce test duplication

…#132079)
* Prevent auto-sharding for data streams in LOOKUP index mode
* Update docs/changelog/131429.yaml
* Reduce test duplication
(cherry picked from commit ea22dff)
# Conflicts:
# server/src/main/java/org/elasticsearch/action/datastreams/autosharding/DataStreamAutoShardingService.java
# server/src/test/java/org/elasticsearch/action/datastreams/autosharding/DataStreamAutoShardingServiceTests.java

…#132082)
* Prevent auto-sharding for data streams in LOOKUP index mode
* Update docs/changelog/131429.yaml
* Reduce test duplication
(cherry picked from commit ea22dff)
# Conflicts:
# server/src/main/java/org/elasticsearch/action/datastreams/autosharding/DataStreamAutoShardingService.java
# server/src/test/java/org/elasticsearch/action/datastreams/autosharding/DataStreamAutoShardingServiceTests.java

…#132080)
* Prevent auto-sharding for data streams in LOOKUP index mode
* Update docs/changelog/131429.yaml
* Reduce test duplication
(cherry picked from commit ea22dff)
# Conflicts:
# server/src/main/java/org/elasticsearch/action/datastreams/autosharding/DataStreamAutoShardingService.java
# server/src/test/java/org/elasticsearch/action/datastreams/autosharding/DataStreamAutoShardingServiceTests.java

Prevent auto-sharding for data streams in LOOKUP index mode

d3acd1e

lukewhiting requested review from PeteGillinElastic and Copilot July 17, 2025 10:41

lukewhiting added the >bug, :StorageEngine/Data streams (Data streams and their lifecycles), auto-backport (Automatically create backport pull requests when merged), v9.0.0, v9.1.0, v9.2.0, v8.19.1, and v8.18.5 labels Jul 17, 2025

lukewhiting removed the request for review from PeteGillinElastic July 17, 2025 10:41

Update docs/changelog/131429.yaml

d580af0

Copilot AI reviewed Jul 17, 2025

...va/org/elasticsearch/action/datastreams/autosharding/DataStreamAutoShardingServiceTests.java

lukewhiting added 3 commits July 17, 2025 13:31

Merge branch 'main' into ES-12330-prevent-auto-shard-on-lookup-index

d089204

Reduce test duplication

0a3bc83

Merge branch 'main' into ES-12330-prevent-auto-shard-on-lookup-index

50805b1

lukewhiting marked this pull request as ready for review July 25, 2025 09:37

elasticsearchmachine added the Team:Data Management (obsolete) DO NOT USE. This team no longer exists. label Jul 25, 2025

Merge branch 'main' into ES-12330-prevent-auto-shard-on-lookup-index

eeeeb50

szybia approved these changes Jul 28, 2025

dakrone approved these changes Jul 28, 2025

lukewhiting merged commit ea22dff into elastic:main Jul 29, 2025
33 checks passed

lukewhiting mentioned this pull request Jul 29, 2025

[9.1] Prevent auto-sharding for data streams in LOOKUP index mode (#131429) #132073

Merged

elasticsearchmachine added the backport pending label Jul 29, 2025

lukewhiting mentioned this pull request Jul 29, 2025

[9.0] Prevent auto-sharding for data streams in LOOKUP index mode (#131429) #132079

Merged

lukewhiting mentioned this pull request Jul 29, 2025

[8.19] Prevent auto-sharding for data streams in LOOKUP index mode (#131429) #132080

Merged

lukewhiting mentioned this pull request Jul 29, 2025

[8.18] Prevent auto-sharding for data streams in LOOKUP index mode (#131429) #132082

Merged

lukewhiting deleted the ES-12330-prevent-auto-shard-on-lookup-index branch July 29, 2025 14:32

masseyke mentioned this pull request Sep 10, 2025

DataStreamAutoShardingService.calculate() works on stale data #134505

Open

lukewhiting merged 6 commits into elastic:main from lukewhiting:ES-12330-prevent-auto-shard-on-lookup-index
