[ML] Add Telemetry for models without adaptive allocations by prwhelan · Pull Request #129161 · elastic/elasticsearch

prwhelan · 2025-06-09T19:02:32Z

Added min and max allocations as attributes to the telemetry for trained models with adaptive allocations enabled.

Added telemetry for models with adaptive allocations disabled or never set.

Verified on QA.

elasticsearchmachine · 2025-06-09T19:02:58Z

Hi @prwhelan, I've created a changelog YAML for you.

elasticsearchmachine · 2025-06-09T20:21:12Z

Pinging @elastic/ml-core (Team:ML)

davidkyle · 2025-06-12T12:33:05Z

x-pack/plugin/ml/src/main/java/org/elasticsearch/xpack/ml/MlMetrics.java

+
+            trainedModelsCurrentAllocations += trainedModelAssignment.totalCurrentAllocations();
+            if (trainedModelAssignment.getAdaptiveAllocationsSettings() == null) {
+                trainedModelsFixedAllocations += trainedModelAssignment.totalCurrentAllocations();


Here and on line 518, the code sums the number of allocations across all deployments that do not use adaptive allocations. A single deployment could have 10 allocations, so from the sum alone we wouldn't know whether the user has 10 deployments with 1 allocation each or 1 deployment with 10.

I think counting the number of deployments would be more meaningful.

Yeah, that is a good point. We can just do an easy +1 per deployment to count the deployments.
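To illustrate the ambiguity raised in this thread, here is a minimal, self-contained sketch. It is not the code in MlMetrics.java: the `Deployment` record and both helper methods are hypothetical stand-ins, assuming only that each deployment has a current allocation count and may or may not have adaptive allocation settings.

```java
import java.util.Collections;
import java.util.List;

public class DeploymentCountSketch {
    // Hypothetical stand-in for a trained model assignment; not an Elasticsearch class.
    record Deployment(String id, int currentAllocations, boolean hasAdaptiveSettings) {}

    // Sums current allocations over deployments without adaptive allocation settings.
    static long sumFixedAllocations(List<Deployment> deployments) {
        long total = 0;
        for (Deployment d : deployments) {
            if (!d.hasAdaptiveSettings()) {
                total += d.currentAllocations();
            }
        }
        return total;
    }

    // Adds 1 per deployment without adaptive allocation settings.
    static long countFixedDeployments(List<Deployment> deployments) {
        long count = 0;
        for (Deployment d : deployments) {
            if (!d.hasAdaptiveSettings()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // One deployment with 10 allocations.
        List<Deployment> oneBig = List.of(new Deployment("big", 10, false));
        // Ten deployments with 1 allocation each.
        List<Deployment> tenSmall = Collections.nCopies(10, new Deployment("small", 1, false));

        // The sums are identical, so the sum cannot tell the two setups apart.
        System.out.println(sumFixedAllocations(oneBig) + " vs " + sumFixedAllocations(tenSmall));
        // The deployment counts differ.
        System.out.println(countFixedDeployments(oneBig) + " vs " + countFixedDeployments(tenSmall));
    }
}
```

In this sketch both setups sum to the same number of allocations, while the per-deployment count separates them, which is the reason given above for switching to a count.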

davidkyle · 2025-06-12T12:34:20Z

x-pack/plugin/ml/src/main/java/org/elasticsearch/xpack/ml/MlMetrics.java

+                "es.ml.trained_models.deployment.fixed_allocations.current",
+                "Sum of current trained model allocations that do not use adaptive allocations (either enabled or disabled)",
+                "allocations",
+                () -> new LongWithAttributes(trainedModelAllocationCounts.trainedModelsFixedAllocations, isMasterMap)


Can the project type be added to the attribute map? If there are different rules for different project types, it would be useful to split the data that way.

I don't think so? It looks like it comes from serverless.project_type, which isn't available here. We could move this metric to serverless, or we could use ES|QL magic to pull in the project type from other metrics via the project ID.

It's possible this will get automatically added when running in serverless.
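As a rough sketch of what the question above is asking for, the following shows an attribute map that carries a project type only when one is known. Everything here is an assumption for illustration: the `is_master` and `project_type` keys and the `attributes` helper are hypothetical, and this does not reflect how Elasticsearch or serverless actually populates metric attributes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

public class MetricAttributesSketch {
    // Builds a small attribute map for a metric.
    // Both key names are hypothetical and chosen only for this example.
    static Map<String, Object> attributes(boolean isMaster, Optional<String> projectType) {
        Map<String, Object> attrs = new LinkedHashMap<>();
        attrs.put("is_master", isMaster);
        // Only attach the project type when the caller knows it.
        projectType.ifPresent(type -> attrs.put("project_type", type));
        return attrs;
    }

    public static void main(String[] args) {
        // Project type unknown: only the master flag is present.
        System.out.println(attributes(true, Optional.empty()));
        // Project type known: the data could then be split by it.
        System.out.println(attributes(true, Optional.of("example-type")));
    }
}
```

If the project type is not available where the metric is registered, as the reply suggests, the first case is all that can be emitted, and the split would have to happen at query time or be added by the surrounding platform.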

[ML] Add Telemetry for models without adaptive allocations

398a37e

prwhelan added the >enhancement, :ml (Machine learning), Team:ML (Meta label for the ML team), and v9.1.0 labels Jun 9, 2025

Update docs/changelog/129161.yaml

fee6f83

prwhelan marked this pull request as ready for review June 9, 2025 20:20

Change to scales_to_zero to avoid high cardinality

d288c58

davidkyle reviewed Jun 12, 2025

prwhelan added 3 commits June 13, 2025 08:05

Count number of deployments

7dee3fe

Merge branch 'main' into metrics/maladaptive-allocations

aad532c

Merge branch 'main' into metrics/maladaptive-allocations

8362f48

jonathan-buttner approved these changes Jun 13, 2025

Merge branch 'main' into metrics/maladaptive-allocations

7015d66

prwhelan enabled auto-merge (squash) June 13, 2025 20:03

prwhelan merged commit b48f699 into elastic:main Jun 13, 2025
16 of 18 checks passed

prwhelan merged 7 commits into elastic:main from prwhelan:metrics/maladaptive-allocations
