Enhance memory accounting for document expansion and introduce max document size limit #123543

fcofdez merged 14 commits into elastic:main from improve-memory-accounting
Conversation
Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing)

Hi @fcofdez, I've created a changelog YAML for you.
private final AtomicLong lowWaterMarkSplits = new AtomicLong(0);
private final AtomicLong highWaterMarkSplits = new AtomicLong(0);

private final AtomicLong largeOpsRejections = new AtomicLong(0);
My idea is to use these specific metrics related to large documents / expansion issues in autoscaling; that way we could have an idea of the extra memory that the nodes would need to cope with the load.
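A minimal sketch of what such counters could look like (the class and accessor names here are hypothetical illustrations, not code from this PR):

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: per-node counters for rejected large operations that
// an autoscaling decider could read to estimate how much extra heap the
// node would need to absorb the incoming load.
class LargeOperationStats {
    private final AtomicLong largeOpsRejections = new AtomicLong(0);
    private final AtomicLong totalLargeRejectedOpsBytes = new AtomicLong(0);

    void onLargeOperationRejected(long operationBytes) {
        largeOpsRejections.incrementAndGet();
        totalLargeRejectedOpsBytes.addAndGet(operationBytes);
    }

    long rejections() {
        return largeOpsRejections.get();
    }

    long rejectedBytes() {
        return totalLargeRejectedOpsBytes.get();
    }
}
```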
Merge branch 'main' of github.com:elastic/elasticsearch into improve-memory-accounting
public static final Setting&lt;ByteSizeValue&gt; MAX_OPERATION_SIZE = Setting.memorySizeSetting(
    "indexing_pressure.memory.max_operation_size",
    "5%",
This seems slightly breaking for small heaps: a 1GB heap now only allows 50MB docs. While this may be a good default, it could break some users too. Should we make it 10% here in the core codebase for now?
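For concreteness, a quick sketch of how the two defaults resolve on a small heap (plain-JVM arithmetic; the real setting goes through Elasticsearch's memorySizeSetting parsing):

```java
// What "5%" vs. "10%" of heap means for a 1 GiB node.
long heapBytes = Runtime.getRuntime().maxMemory(); // assume 1 GiB = 1_073_741_824 bytes
long fivePercent = heapBytes / 20;  // ~51 MiB: the max document size this comment worries about
long tenPercent = heapBytes / 10;   // ~102 MiB: the proposed looser default
```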
Tim-Brooks left a comment
This mostly looks good. I do have a question about counting operations twice.
private final long lowWaterMarkSplits;
private final long highWaterMarkSplits;
private final long largeOpsRejections;
private final long totalLargeRejectedOpsBytes;
We haven't tracked the number of bytes rejected in the past. I'm guessing that maybe we think this could be useful for autoscaling?
Like dividing bytes rejected by ops to get an idea of what we need to scale to?
Yes, that's the idea. But it's a bit tricky because we should make that value sticky somehow.
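A sketch of the ratio described above (the field names follow the stats fields quoted earlier; the helper itself is a hypothetical illustration):

```java
// Average size of a rejected large operation: rejected bytes / rejected ops.
// Note the "sticky" concern above: these counters reset on node restart, so
// the signal would need to be retained somewhere for autoscaling to act on it.
static long avgRejectedOpBytes(long largeOpsRejections, long totalLargeRejectedOpsBytes) {
    return largeOpsRejections == 0 ? 0 : totalLargeRejectedOpsBytes / largeOpsRejections;
}
```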
}

public Releasable markPrimaryOperationStarted(int operations, long bytes, boolean forceExecution) {
public void checkLargestPrimaryOperationIsWithinLimits(
Any reason this should not be called from within markPrimaryOperationLocalToCoordinatingNodeStarted, similar to validateAndMarkPrimaryOperationStarted?
Not really. I added that method in f8b8814
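For readers following along, a sketch of the validate-then-mark shape being referenced (parameter lists are approximate, not the exact signatures from the commit):

```java
// Sketch: validate the largest operation against the size limit before any
// indexing-pressure memory is reserved, so an oversized request fails fast
// without leaving accounting to undo.
public Releasable validateAndMarkPrimaryOperationStarted(int operations, long bytes, long largestOperationBytes, boolean forceExecution) {
    checkLargestPrimaryOperationIsWithinLimits(operations, largestOperationBytes);
    return markPrimaryOperationStarted(operations, bytes, forceExecution);
}
```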
) {
    var listener = ActionListener.releaseBefore(
        indexingPressure.trackPrimaryOperationExpansion(
            primaryOperationCount(request),
Isn't this incrementing the primary operations count a second time? I think we just want to do expansion memory here?
We have to pass the number of operations to increment the number of rejected operations if the expansion is beyond the current limit (there's an if in the method so the rejections are only incremented in that case).
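A condensed sketch of that guard (names follow the quoted diff; the surrounding bookkeeping is omitted):

```java
// The operation count is passed in so that, when the expanded size breaches
// the primary limit, the rejection counters reflect how many documents were
// affected; on the happy path the count is not incremented again.
if (currentBytesAfterExpansion > primaryLimitBytes) {
    this.primaryRejections.getAndIncrement();
    this.primaryDocumentRejections.addAndGet(operations);
    throw new EsRejectedExecutionException("rejected execution of primary operation expansion");
}
```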
this.currentCombinedCoordinatingAndPrimaryBytes.getAndAdd(-bytes);
this.primaryRejections.getAndIncrement();
this.primaryDocumentRejections.addAndGet(operations);
if (operationExpansionTracking) {
Hmm. Does this make sense? We've already passed the "large document" check, and this just means we passed the primary limit when attempting to do the expansion.
I wasn't sure if we wanted to track expansions for autoscaling too; if you think we shouldn't, I can remove this.
I kind of feel like if we get to this point and it is not a "large document", it should count as a primary rejection only, as it might be just a normal-sized document.
However, I'm open to changing this if there is a logical reason from an autoscaling perspective why we want to treat this differently.
Tim-Brooks left a comment
This looks good to me.
I made a comment about the large document vs. primary rejection distinction, although I'm okay with either approach.
This commit improves memory accounting by incorporating document
expansion during shard bulk execution. Additionally, it introduces a new
limit on the maximum document size, which defaults to 5% of the
available heap.
This limit can be configured using the new setting:
indexing_pressure.memory.max_operation_size
These changes help prevent excessive memory consumption and
improve indexing stability.
Closes ES-10777
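As an illustration, a node could relax the default like this (a sketch using the Settings builder; whether the setting is dynamic or node-scoped is not stated in this thread, so a startup-time override is assumed):

```java
import org.elasticsearch.common.settings.Settings;

// Hypothetical override for a small-heap node where 5% is too restrictive.
Settings nodeSettings = Settings.builder()
    .put("indexing_pressure.memory.max_operation_size", "10%") // default: "5%" of heap
    .build();
```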