Add new sampling method to the Downsample API #136813
Pinging @elastic/es-storage-engine (Team:StorageEngine)
Hi @gmarouli, I've created a changelog YAML for you.
```
if (method != null) {
    return method;
}
boolean isIndexDownsampled = indexMetadata.getSettings().get(IndexMetadata.INDEX_DOWNSAMPLE_INTERVAL_KEY) != null;
```
If this is true, wouldn't fromString return a valid method above? Maybe I missed the logic here.
You are correct, the code as it is now will always have a sampling method. This is for backwards compatibility. Indices created before this PR will not have a sampling method defined in their metadata, and we know that only the aggregate method was available then, so this is why we specify it here. Even if we ever change the default, this shouldn't change. I will add a comment so it is clear.
Should we be just returning AGGREGATE then, without checking the setting again?
I did consider this as well, to avoid the check. If we do that, we move the responsibility to the caller; it's possible the caller has already verified this, which would make it a redundant check. On the other hand, I do think it is part of this method's responsibility to return null when the index is not downsampled.
If it were on the critical path, I would consider removing the check and adding a warning in this method's javadoc. However, considering that this is called mainly by the downsample API and, in the future, by the data stream lifecycle, I think it's a performance penalty we can accept. What do you think?
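To make the fallback behaviour discussed above concrete, here is a minimal standalone sketch. The class, method, and setting-key names are assumptions for illustration, not the actual Elasticsearch code; only the control flow mirrors the snippet under review.

```java
import java.util.Map;

public class SamplingMethodResolver {
    enum SamplingMethod { AGGREGATE, LAST_VALUE }

    // Hypothetical setting keys, stand-ins for the real index metadata keys.
    static final String METHOD_KEY = "index.downsample.sampling_method";
    static final String INTERVAL_KEY = "index.downsample.source.fixed_interval";

    /** Returns the sampling method, or null when the index is not downsampled. */
    static SamplingMethod resolve(Map<String, String> settings) {
        String method = settings.get(METHOD_KEY);
        if (method != null) {
            return SamplingMethod.valueOf(method.toUpperCase());
        }
        // Backwards compatibility: indices downsampled before the sampling
        // method setting existed only supported AGGREGATE. Checking the
        // interval keeps this method responsible for returning null when the
        // index is not downsampled at all.
        boolean isDownsampled = settings.get(INTERVAL_KEY) != null;
        return isDownsampled ? SamplingMethod.AGGREGATE : null;
    }

    public static void main(String[] args) {
        System.out.println(resolve(Map.of(INTERVAL_KEY, "1h")));   // AGGREGATE
        System.out.println(resolve(Map.of(METHOD_KEY, "last_value",
                                          INTERVAL_KEY, "1h")));  // LAST_VALUE
        System.out.println(resolve(Map.of()));                     // null
    }
}
```

Whether the interval check is redundant depends on the caller, as discussed above; keeping it makes the method safe to call on any index.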
server/src/main/java/org/elasticsearch/action/downsample/DownsampleConfig.java
...downsample/src/internalClusterTest/java/org/elasticsearch/xpack/downsample/DownsampleIT.java
…ticsearch/xpack/downsample/DownsampleIT.java Co-authored-by: Kostas Krikellas <131142368+kkrik-es@users.noreply.github.com>
Thanks @martijnvg, apart from adjusting these tests, I ensured that all tests (apart from ILM, DLM & telemetry) are also using random sampling.
martijnvg
left a comment
Thanks for adjusting the tests! LGTM
Following #136813, we expose to ILM the new sampling method config in the downsampling API. This will allow users to configure the sampling method in the downsample action of their ILM policies. For example:
```
PUT _ilm/policy/datastream_policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_docs": 1
          },
          "downsample": {
            "fixed_interval": "1h",
            "force_merge_index": false,
            "sampling_method": "aggregate"
          }
        }
      }
    }
  }
}
```
When downsampling gauge metrics, we create an aggregate object that contains the min, max, sum and value_count values, which allow us to respond to the min, max, sum, count and average aggregations without losing any accuracy.
However, we recognise that for certain cases keeping just the last value might be good enough. For this reason, we add one more configuration option to the downsample API that allows a user to choose the sampling method between:
- aggregate: when supported, this downsamples a metric by summarising its values in an aggregate document. This remains the default.
- last_value: this keeps the last value and discards the rest.
```
POST /my-time-series-index/_downsample/my-downsampled-time-series-index
{
"fixed_interval": "1d",
"sampling_method": "last_value"
}
```
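To make the difference between the two methods concrete, here is a small illustrative sketch (not Elasticsearch code; the class and method names are invented) of what each sampling method keeps for one fixed_interval bucket of a gauge metric:

```java
import java.util.List;

public class SamplingSketch {
    // What the aggregate method keeps for one time bucket.
    record Aggregate(double min, double max, double sum, long valueCount) {}

    static Aggregate aggregate(List<Double> bucket) {
        double min = bucket.stream().mapToDouble(Double::doubleValue).min().orElseThrow();
        double max = bucket.stream().mapToDouble(Double::doubleValue).max().orElseThrow();
        double sum = bucket.stream().mapToDouble(Double::doubleValue).sum();
        return new Aggregate(min, max, sum, bucket.size());
    }

    // What the last_value method keeps: just the final value in the bucket.
    static double lastValue(List<Double> bucket) {
        return bucket.get(bucket.size() - 1);
    }

    public static void main(String[] args) {
        // Three gauge values recorded within one fixed_interval bucket.
        List<Double> bucket = List.of(2.0, 6.0, 4.0);
        System.out.println(aggregate(bucket)); // min=2, max=6, sum=12, count=3
        System.out.println(lastValue(bucket)); // 4.0
    }
}
```

The aggregate record can still answer min, max, sum, count and average (sum / valueCount) queries exactly, while last_value trades that accuracy for a much smaller document.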
This PR introduces the sampling method only in the downsample API. There will be follow-ups that introduce it in the lifecycle management features.
Relates: elastic#128357.
…37023) Following #136813, we expose to data stream lifecycle the new sampling method config in the downsampling API. This will allow users to configure the sampling method directly in the lifecycle configuration. For example:
```
PUT _data_stream/my-ds/_lifecycle
{
  "data_retention": "10d",
  "downsampling_method": "last_value",
  "downsampling": [
    {
      "after": "1d",
      "fixed_interval": "5m"
    }
  ]
}
```