allocation: add duration and count metrics for write load hotspot by schase-es · Pull Request #138465 · elastic/elasticsearch

schase-es · 2025-11-24T01:29:05Z

The WriteLoadConstraintMonitor now reports two APM metrics: the current count of hotspot nodes and the duration of each hotspot. When a node is first reported as hotspotting, the start time is stored in a table keyed by node id. When the node stops hotspotting, the difference between the current time and the stored start time is recorded as a duration metric. The size of this table is the current hotspot count.

Closes: ES-13381
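
The bookkeeping described above can be sketched in plain Java roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: the class `HotspotTracker`, its `update` method, and the returned list of durations are hypothetical names, and the real monitor records durations to an APM histogram rather than returning them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the monitor's hotspot bookkeeping (not the PR's actual class).
class HotspotTracker {
    // node id -> time in milliseconds at which the node was first seen hotspotting
    private final Map<String, Long> startTimes = new HashMap<>();

    // Records newly hotspotting nodes and returns the durations, in seconds,
    // of the hotspots that ended since the previous call.
    List<Double> update(Set<String> hotspotNodeIds, long nowMillis) {
        List<Double> finishedDurationsSeconds = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = startTimes.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (hotspotNodeIds.contains(entry.getKey()) == false) {
                // The node stopped hotspotting: duration is now minus the stored start time.
                finishedDurationsSeconds.add((nowMillis - entry.getValue()) / 1000.0);
                it.remove();
            }
        }
        for (String nodeId : hotspotNodeIds) {
            // Keep the original start time for nodes that are still hotspotting.
            startTimes.putIfAbsent(nodeId, nowMillis);
        }
        return finishedDurationsSeconds;
    }

    // The size of the table is the current hotspot count.
    int currentHotspotCount() {
        return startTimes.size();
    }
}
```

In this sketch a node that stays hot across several updates keeps its first start time, so the reported duration covers the whole hotspot rather than the last interval.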


elasticsearchmachine · 2025-11-24T01:29:29Z

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

elasticsearchmachine · 2025-11-24T01:29:30Z

Hi @schase-es, I've created a changelog YAML for you.

nicktindall · 2025-11-24T03:48:51Z

...r/src/main/java/org/elasticsearch/cluster/routing/allocation/WriteLoadConstraintMonitor.java


+    private List<LongWithAttributes> getHotspotNodesCount() {
+        return List.of(new LongWithAttributes(hotspotNodesCount));
+    }


We want to return an empty list here if we're not the master, otherwise we'll get a bunch of bogus values each metric interval from the non-master nodes.

Perhaps getHotspotNodesCount could reset the count to -1 after reading it, and then we would only report if it has been set back to an actual value (the InternalClusterInfoService stops refreshing the cluster info when the node is no longer master, so this approach would naturally only report metrics from the master).

Or any other approach you think is best.

I don't think we need to make this a cluster state listener, we will stop receiving new ClusterInfo when we are no longer master.

If we make this listen to the cluster state, it adds more dependencies and will add a tiny additional cost to the cluster state applier thread (negligible but we should try and do that as little as possible).

I think I prefer my suggestion to only return a hot-spot node count if we've calculated one since last time we were polled.

That would presumably also be simpler to test (just confirm we only return the count if onNewInfo has been called since the last time we asked).
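
For illustration, the "only report if updated since the last poll" idea could look roughly like the sketch below. The class `HotspotCountGauge` and the method `poll` are hypothetical, and a plain `Long` stands in for `LongWithAttributes` so the example is self-contained; this is not the implementation in the PR.

```java
import java.util.List;

// Hypothetical gauge source; names are illustrative and not taken from the PR.
class HotspotCountGauge {
    private volatile long hotspotNodesCount = 0;
    private volatile boolean updatedSinceLastRead = false;

    // Called whenever new cluster info is processed, which only happens on the master.
    void onNewInfo(long count) {
        hotspotNodesCount = count;
        updatedSinceLastRead = true;
    }

    // Called once per metric interval; reports nothing if no new info arrived since the last call.
    List<Long> poll() {
        if (updatedSinceLastRead == false) {
            return List.of();
        }
        updatedSinceLastRead = false;
        return List.of(hotspotNodesCount);
    }
}
```

A test for this shape only needs to check that `poll` is empty before `onNewInfo`, returns the count once after it, and is empty again on the next call. Note that the flag check and reset are not atomic, so a concurrent `onNewInfo` could in principle be missed for one interval.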

nicktindall · 2025-11-24T03:54:41Z

.../test/java/org/elasticsearch/cluster/routing/allocation/WriteLoadConstraintMonitorTests.java

+        );
+
+        final Set<String> firstWaveHotspotNodes = testState.hotspotNodeIds();
+        final String removeHotspotId = randomSubsetOf(1, testState.hotspotNodeIds()).get(0);


Nit: I think you can use randomFrom(Collection<T>)

nicktindall · 2025-11-24T04:16:27Z

.../test/java/org/elasticsearch/cluster/routing/allocation/WriteLoadConstraintMonitorTests.java

+
+        // check the test state hotspots are recorded in the counter
+        recordingMeterRegistry.getRecorder().collect();
+        assertMetricsCollected(recordingMeterRegistry, List.of((long) testState.hotspotNodeIds().size()), List.of());


I kinda think we shouldn't add the assertions around metrics to every test, just to keep those tests focused on the specific aspects that they're testing? I'd be happy to keep it all for the dedicated test that you have at the bottom.

nicktindall

looking good with a few comments

nicktindall

LGTM with some nits

nicktindall · 2025-12-04T03:44:04Z

...r/src/main/java/org/elasticsearch/cluster/routing/allocation/WriteLoadConstraintMonitor.java

+            hotspotDurationHistogram.record(hotspotDuration / 1000.0);
+        }
+        hotspotNodesCount = hotspotNodeStartTimes.size();
+        hotspotNodesCountUpdatedSinceLastRead = true;


I wonder if it'd be feasible to break this stuff out into a method, just so it doesn't distract too much from the monitoring logic?

Likewise with the recording of the start times?

nicktindall · 2025-12-04T03:45:14Z

...r/src/main/java/org/elasticsearch/cluster/routing/allocation/WriteLoadConstraintMonitor.java

+        if (state.term() != hotspotNodeStartTimesLastTerm) {
+            hotspotNodeStartTimesLastTerm = state.term();
+            hotspotNodeStartTimes.clear();
+        }


Perhaps we can also clear the state if e.g. state.isLocalNodeElectedMaster() == false; this can happen before the term is bumped.
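
A rough sketch of that combined reset condition, assuming the term and the "local node is master" flag are passed in as plain values. The class `HotspotStartTimes` and the method `resetIfStale` are hypothetical names used only for illustration; the real code reads these values from the cluster state.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder for the start-time table; not the PR's actual class.
class HotspotStartTimes {
    private final Map<String, Long> startTimes = new HashMap<>();
    private long lastTerm = -1;

    void put(String nodeId, long startMillis) {
        startTimes.put(nodeId, startMillis);
    }

    int size() {
        return startTimes.size();
    }

    // Drops all entries when the term changed or the local node is not the elected master.
    void resetIfStale(long currentTerm, boolean localNodeIsMaster) {
        if (localNodeIsMaster == false || currentTerm != lastTerm) {
            lastTerm = currentTerm;
            startTimes.clear();
        }
    }
}
```

Checking the master flag as well as the term covers the window in which the node has already lost mastership but has not yet seen a state with a higher term.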

nicktindall · 2025-12-04T03:46:15Z

...r/src/main/java/org/elasticsearch/cluster/routing/allocation/WriteLoadConstraintMonitor.java


-    public WriteLoadConstraintMonitor(
+    public static final String HOTSPOT_NODES_COUNT_METRIC_NAME = "es.allocator.allocations.node.write_load_hotspot.current";
+    public static final String HOTSPOT_DURATION_METRIC_NAME = "es.allocator.allocations.node.write_load_hotspot.duration.histogram";


Nit: I think we usually put constants above the fields

nicktindall · 2025-12-04T03:53:30Z

...r/src/main/java/org/elasticsearch/cluster/routing/allocation/WriteLoadConstraintMonitor.java

+    public static final String HOTSPOT_DURATION_METRIC_NAME = "es.allocator.allocations.node.write_load_hotspot.duration.histogram";
+
+    private volatile long hotspotNodesCount = 0; // metrics source of hotspotting node count
+    private volatile boolean hotspotNodesCountUpdatedSinceLastRead = false; // turns off metrics when not master/onNewInfo isn't called


Could we maybe achieve the same with a single atomic field instead e.g.

writer:

hotSpotNodesCount.set(count);

reader:

int count = hotSpotNodesCount.getAndSet(-1);
if (count >= 0) {
    return List.of(new LongWithAttributes(count, ...));
} else {
    return List.of();
}

or is there an advantage to having the two fields?
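
A self-contained version of the single-field idea might look like the following. It is a sketch only: the class `AtomicHotspotCount` and the method `read` are hypothetical, and a plain `Long` replaces `LongWithAttributes` so the example compiles without Elasticsearch on the classpath.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical single-field variant; -1 is the "nothing to report" sentinel.
class AtomicHotspotCount {
    private static final int NO_VALUE = -1;
    private final AtomicInteger hotSpotNodesCount = new AtomicInteger(NO_VALUE);

    // Writer side: publish the latest count.
    void set(int count) {
        hotSpotNodesCount.set(count);
    }

    // Reader side: atomically take the value and reset to the sentinel,
    // so each published count is reported at most once.
    List<Long> read() {
        int count = hotSpotNodesCount.getAndSet(NO_VALUE);
        if (count >= 0) {
            return List.of((long) count);
        } else {
            return List.of();
        }
    }
}
```

Because `getAndSet` reads and resets in one atomic step, there is no window in which a writer's update can be lost between the reader's check and its reset.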

schase-es requested review from DiannaHohensee and nicktindall November 24, 2025 01:29

schase-es added the >enhancement label Nov 24, 2025

schase-es requested a review from a team as a code owner November 24, 2025 01:29

schase-es added the :Distributed/Allocation and v9.3.0 labels Nov 24, 2025

elasticsearchmachine added the Team:Distributed Coordination (obsolete) label Nov 24, 2025

Update docs/changelog/138465.yaml

75119f5

nicktindall reviewed Nov 24, 2025


schase-es and others added 6 commits November 24, 2025 17:54

Reverting tests

0e86fd5

Enable metrics sending when master, added separate tests

aca449c

Changed metrics turn-off facility to turn off when new data stops coming

1c31b0e

Merge branch 'main' into ES-13381_write-load-monitor-hotspot-metrics

913acb8

Clear out hotspot duration table when the cluster state term changes

d1462fc

[CI] Auto commit changes from spotless

2e0e4c2

nicktindall approved these changes Dec 4, 2025


schase-es added 6 commits December 4, 2025 17:53

Moving static constants

e415267

Changing out count/updated duo for single field

0cebca4

Move of start times recording into method

e4e1422

Moving duration recording into separate method

586ae69

Merge branch 'main' into ES-13381_write-load-monitor-hotspot-metrics

600dd97

Make master and local node the same for write constraint monitor tests

58dd7d1

schase-es merged commit 20d144c into elastic:main Dec 5, 2025
34 checks passed

schase-es merged 14 commits into elastic:main from schase-es:ES-13381_write-load-monitor-hotspot-metrics
