Can Match and DFS search phase shard search APM metrics #135285
chrisparrinello wants to merge 14 commits into elastic:main
Conversation
Force-pushed from d22b164 to 6624bad
Pinging @elastic/es-search-foundations (Team:Search Foundations)

Hi @chrisparrinello, I've created a changelog YAML for you.
javanna left a comment:
I left a couple of comments. DFS looks good; can match needs some adjustments. Thanks!
```java
    isSystem = ((EsExecutors.EsThread) Thread.currentThread()).isSystem();
} else {
    isSystem = false;
}
```
Is this fixing an issue you experienced? I thought we are guaranteed to always have an instance of EsThread here.
Yes, the issue is that a lot of tests start failing because the search is run on a generic Thread rather than an EsThread, causing ClassCastException errors.
Would a better approach be to use EsExecutors.EsThread when running those tests? Not sure what the consequences would be, but it might be better than the instanceof check.
I am a bit confused on this: isn't this existing code? Does this fail on main, or only in this PR without your change?
I backed out this change and so far CI tests are passing. I think I ran into this problem while working on the coordinator-level metrics, so I'll tackle it again when I create the PR for those changes.
It does not seem like you backed it out; the change is still in the diff.
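As a side note on the ClassCastException discussed above, a minimal standalone sketch of the defensive pattern: EsThread here is a local stand-in for EsExecutors.EsThread, not the real Elasticsearch class.

```java
public class ThreadCheckSketch {
    // Stand-in for EsExecutors.EsThread from the PR discussion.
    static class EsThread extends Thread {
        private final boolean system;
        EsThread(boolean system) { this.system = system; }
        boolean isSystem() { return system; }
    }

    // A direct cast throws ClassCastException when tests run the search on a
    // plain Thread; the instanceof pattern degrades to "not a system thread".
    static boolean isSystem(Thread t) {
        return t instanceof EsThread esThread && esThread.isSystem();
    }

    public static void main(String[] args) {
        System.out.println(isSystem(new Thread()));       // plain test thread
        System.out.println(isSystem(new EsThread(true))); // system EsThread
    }
}
```

The trade-off debated in the thread is exactly this: the instanceof guard makes the code tolerant of plain threads, at the cost of silently treating them as non-system instead of failing fast.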
```java
    maxKeepAlive
);
CanMatchShardResponse canMatchResp = canMatch(canMatchContext, false);
opsListener.onCanMatchPhase(orig, System.nanoTime() - beforeCanMatchTime);
```
This isn't the right can-match call to instrument :) This is indeed confusing: we call can match in multiple places, and this one is part of the query phase itself. What I'd like to track is the can-match latency of the separate can-match roundtrip we have, CanMatchPreFilterSearchPhase. The originating call on data nodes is SearchService#canMatch(CanMatchNodeRequest request, ActionListener<CanMatchNodeResponse> listener). I would think the tracking fits in that specific method.
Got it. I'll move the instrumentation there. 🤞 the opsListener is available there; that also seems to be a sticking point with some of this stuff.
I moved the call, but as a result I had to inject the ShardSearchPhaseAPMMetrics singleton into the Transport*Action classes.
You are now tracking the coordinator rewrite phase, which is something else :) Can you check the method I pointed you to above? I believe that's where we need to record the execution time for can match on the shards: SearchService#canMatch(CanMatchNodeRequest request, ActionListener<CanMatchNodeResponse> listener).

This stuff is quite hard to follow in the code. Can match runs first on the coord node, as a rewrite step to filter out indices that the coord node knows cannot possibly match. We don't want to track the latency of this, because it does not involve roundtrips to the nodes. Next, an optional can-match search phase is executed, which goes to all the data nodes involved and tries to filter out shards that cannot possibly match based on metadata that's only available on the data nodes (without executing the query!). This is the round we want to measure latency for, both at the coord node (later) and at the shard level (in this PR).

Note that the method I pointed you to is executed on each data node and loops over all the searched shards that are allocated on that node. I am thinking we should track the can-match phase at the data node level here, as opposed to for each shard. Makes sense?
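To make the suggestion concrete, a hedged standalone sketch of timing a listener-based, node-level can-match call: the names are illustrative stand-ins, not the real SearchService or ActionListener API.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

public class CanMatchTimingSketch {
    // Stand-in for a telemetry histogram recording milliseconds.
    final AtomicLong lastRecordedMillis = new AtomicLong(-1);

    // Wrap the response listener so that latency is recorded exactly once per
    // data-node round trip, covering all shards handled on that node.
    <R> Consumer<R> timed(Consumer<R> listener) {
        long startNanos = System.nanoTime();
        return response -> {
            lastRecordedMillis.set(TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos));
            listener.accept(response);
        };
    }

    public static void main(String[] args) {
        CanMatchTimingSketch sketch = new CanMatchTimingSketch();
        Consumer<String> wrapped = sketch.<String>timed(r -> System.out.println("got " + r));
        wrapped.accept("response");
        System.out.println(sketch.lastRecordedMillis.get() >= 0);
    }
}
```

The point of wrapping the listener, rather than instrumenting each shard loop iteration, is that a single measurement captures the whole node-level round trip.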
Resolved review threads:
- server/src/main/java/org/elasticsearch/index/shard/SearchOperationListener.java
- ...src/test/java/org/elasticsearch/search/TelemetryMetrics/ShardSearchPhaseAPMMetricsTests.java
- server/src/main/java/org/elasticsearch/action/search/SearchRequestAttributesExtractor.java
```java
assert request.canReturnNullResponseIfMatchNoDocs() == false || request.numberOfShards() > 1
    : "empty responses require more than one shard";
final IndexShard shard = getShard(request);
final var opsListener = shard.getSearchOperationListener();
```
I wonder why we aren't using SearchTimeProvider? Maybe not relevant for this patch, just curious.
I think during a few iterations of working on this, SearchTimeProvider wasn't accessible (or didn't even exist) in some phases and would have had to be pushed down the call chain. It seemed like a lot of work for something that wraps System.nanoTime calls, and since we're not testing the durations we don't need a mock provider with deterministic times to assert against.
How would you use SearchTimeProvider to track how long an operation took on a specific shard? Doesn't that provide the start time of the search on the coord node?
@javanna SearchTimeProvider.relativeCurrentTimeNanosProvider is set to System::nanoTime in the code.
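The observation above is that a relative-time provider wrapping System::nanoTime measures a local duration just as well as calling System.nanoTime() directly; a minimal sketch (the provider name mirrors the thread, the rest is illustrative):

```java
import java.util.function.LongSupplier;

public class RelativeTimeSketch {
    // Measure how long an operation takes using an injected time source.
    // Passing System::nanoTime reproduces the direct-call behavior; a test
    // could pass a deterministic supplier instead.
    static long measureNanos(LongSupplier relativeTimeNanos, Runnable op) {
        long start = relativeTimeNanos.getAsLong();
        op.run();
        return relativeTimeNanos.getAsLong() - start;
    }

    public static void main(String[] args) {
        long took = measureNanos(System::nanoTime, () -> { });
        System.out.println(took >= 0);
    }
}
```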
Resolved review threads:
- server/src/main/java/org/elasticsearch/index/shard/SearchOperationListener.java
smalyshev left a comment:
Generally looks OK; I left some nitpick comments. I'm not familiar with these parts, so I'll let Luca confirm that's what we need.
```java
assert assertSearchCoordinationThread();
final List<SearchShardIterator> matchedShardLevelRequests = new ArrayList<>();
for (SearchShardIterator searchShardIterator : shardsIts) {
    long startTime = System.nanoTime();
```
I don't see this used anywhere?
Right, it's not. This was from a previous iteration of this PR. I'll remove it.
```diff
 }

-    List<SearchOperationListener> getSearchOperationListener() { // pkg private for testing
+    public List<SearchOperationListener> getSearchOperationListener() { // pkg private for testing
```
A better name would use the plural: getSearchOperationListeners. It took me a few minutes to realize it actually returns a list of listeners rather than a single listener. Also, the comment is no longer accurate.
Agreed. Will change.
```java
indexService = readerContext.indexService();
QueryRewriteContext queryRewriteContext = canMatchContext.getQueryRewriteContext(indexService);
if (queryStillMatchesAfterRewrite(canMatchContext.request, queryRewriteContext) == false) {
    indexService.getSearchOperationListener()
```
This particular construct repeats in several places; maybe extract a private method for it? Or place it in a finally clause? Or both?
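A standalone sketch of the suggested refactor: the repeated listener notification hoisted into one private helper invoked from a finally block, so every exit path records the timing. All names here are illustrative, not the real Elasticsearch types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongConsumer;

public class NotifyOnceSketch {
    static boolean canMatch(List<LongConsumer> listeners, boolean result) {
        long start = System.nanoTime();
        try {
            return result; // stand-in for the real can-match logic
        } finally {
            // Single notification point: runs on every path out, including early returns.
            notifyCanMatch(listeners, System.nanoTime() - start);
        }
    }

    private static void notifyCanMatch(List<LongConsumer> listeners, long tookNanos) {
        for (LongConsumer listener : listeners) {
            listener.accept(tookNanos);
        }
    }

    public static void main(String[] args) {
        List<Long> recorded = new ArrayList<>();
        canMatch(List.of(recorded::add), true);
        System.out.println(recorded.size()); // one measurement per call
    }
}
```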
```java
assertEquals(0, queryMeasurements.size());
final List<Measurement> fetchMeasurements = getTestTelemetryPlugin().getLongHistogramMeasurement(FETCH_SEARCH_PHASE_METRIC);
assertEquals(0, fetchMeasurements.size());
}
```
One thing missing for me in these tests is that they don't check the actual measurements; they only check size(), not the values themselves. Is this enough? I know timings are hard to predict, at least in tests, but maybe at least ensure they're not always zero? I think we have special query syntax to make the query slower; not sure if it would apply here... or maybe some transport behaviors.
Another question I'm curious about: can any of these phases fail? Are we interested in knowing when they do?
Yeah, checking the values inside the unit tests is going to be problematic because the measurements are in milliseconds, and it is highly unlikely we'll see a duration longer than a millisecond for some of them. I did check while running via gradlew run, and there are non-zero measurements for some of these in the APM server logs.
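The reason the test histogram values come out as zero is simple truncation: sub-millisecond durations convert to 0 milliseconds. A small sketch of the arithmetic:

```java
import java.util.concurrent.TimeUnit;

public class MillisTruncation {
    public static void main(String[] args) {
        long fastNanos = 250_000;    // 0.25 ms, plausible for a trivial test query
        long slowNanos = 2_500_000;  // 2.5 ms

        // toMillis truncates toward zero, so sub-millisecond phases record 0.
        System.out.println(TimeUnit.NANOSECONDS.toMillis(fastNanos)); // 0
        System.out.println(TimeUnit.NANOSECONDS.toMillis(slowNanos)); // 2
    }
}
```

This is why asserting on measurement counts rather than values is the pragmatic choice in unit tests, while non-zero values only show up against a real APM server.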
```java
b.bind(ShutdownPrepareService.class).toInstance(shutdownPrepareService);
b.bind(OnlinePrewarmingService.class).toInstance(onlinePrewarmingService);
b.bind(MergeMetrics.class).toInstance(mergeMetrics);
b.bind(ShardSearchPhaseAPMMetrics.class).toInstance(shardSearchPhaseAPMMetrics);
```
I'm curious: why is this addition necessary, and what does it do? It's not a new class, so what changed?
This might be leftover from a previous iteration where I wanted to inject it into one of the services (IndicesService?) and guarantee it is a singleton. I'll check whether it is safe to remove, which I think it is at this point.
I think it is safe to remove, so I'll remove it.
Force-pushed from 3dd8c0c to d43748a
```java
@Override
public void onDfsPhase(SearchContext searchContext, long tookInNanos) {
    SearchExecutionContext searchExecutionContext = searchContext.getSearchExecutionContext();
    Long rangeTimestampFrom = searchExecutionContext.getRangeTimestampFrom();
```
Did you see whether the DFS phase reports the time range attribute? I would think this does nothing, because I would not expect DFS to actually parse the query; it only collects statistics and runs the knn search. Unless I am missing something, that would be a reason to just pass null for DFS too.
I am also a bit wary of getting the search execution context. I would not be surprised if there are cases where it is null.
I ran a check, and it doesn't report the time range given one of the time range queries from the unit tests. Would it be safer to just record the per-shard took time and not try to record any of the search attributes? I haven't seen any cases so far where the search execution context is null.
```java
    canMatchContext.getIndexService().getSearchOperationListeners(),
    origShardSearchRequest,
    startTime
);
```
I would like to record this in only one place: the can-match phase that requires a separate roundtrip. This change records the metric in many different places, which is not necessary. The method that takes an action listener should do it; that's the only place where we need to record it. See the previous comment at #135285 (comment).

The rationale for doing it there only is to focus on the actual can-match phase that requires a roundtrip to the data nodes, and to have some correlation between the coord node latency of that phase and the data node latency of it.

In the method I pointed you to, you'll find a loop. That code executes on each data node, looping over the shard targets. I am now even wondering whether we want to track the latency at the shard level or at the data node level. The latter may make more sense; we don't need shard granularity here, it's overkill.

That does complicate things around attributes, as you won't have a ShardSearchRequest. I'd change the signature of the listener method to take a CanMatchNodeRequest. Introspecting that should work, and what is not available there we can probably skip. This does require a bit more plumbing in SearchRequestAttributesExtractor, but I prefer it over cloning the shard request, which is a bit intrusive.

In hindsight, it would have been wiser to split DFS and can match. I would still do that: get DFS in as it's the simplest, then focus on can match and see if that can also be split up further.
Okay, I'll pull the DFS instrumentation out into a separate PR as a (hopefully) quick merge, and then look into the can-match changes you suggested above.
Oh, doing it per data node takes away the ability to rely on SearchOperationListener for the callback, as that is per index/shard (depending on whether it's retrieved from IndexService or IndexShard). Maybe that creates too many artificial problems, as we don't have a listener for data-node-level search events.
This deserves a bit more thinking. Let me know if the trade-offs here make sense to you? I have yet to make up my mind entirely :)
Yeah, getting access to the SearchOperationListener (via something that can indirectly call the ShardSearchPhaseAPMMetrics object) while also having the ShardSearchRequest to extract the metric attributes has been a bit of a struggle here.
If we want a per-data-node can-match metric, I'd do it via another APM metrics handling class, to keep ShardSearchPhaseAPMMetrics related only to shard metrics. The problem with that is getting access to it in some of the canMatch methods. I've been down this road a bit already, injecting the APMMetrics objects into the service objects in the Guice-y stuff in the NodeConstructor.
Implements the following APM metrics to record the per-shard duration of these search phases: