ESQL: Add `documents_found` and `values_loaded` (#125631) by nik9000 · Pull Request #130029 · elastic/elasticsearch

nik9000 · 2025-06-25T17:28:47Z

This adds documents_found and values_loaded to the ESQL response:

{
  "took" : 194,
  "is_partial" : false,
  "documents_found" : 100000,
  "values_loaded" : 200000,
  "columns" : [
    { "name" : "a", "type" : "long" },
    { "name" : "b", "type" : "long" }
  ],
  "values" : [[10, 1]]
}

These are cheap enough to collect that we can do it for every query and return it with every response. It's small, but it still gives you a reasonable sense of how much work Elasticsearch had to go through to perform the query.

I've also added these two fields to the driver profile and task status:

    "drivers" : [
      {
        "description" : "data",
        "cluster_name" : "runTask",
        "node_name" : "runTask-0",
        "start_millis" : 1742923173077,
        "stop_millis" : 1742923173087,
        "took_nanos" : 9557014,
        "cpu_nanos" : 9091340,
        "documents_found" : 5,   <---- THESE
        "values_loaded" : 15,    <---- THESE
        "iterations" : 6,
...

These are at a high level and should be easy to reason about. We'd like to extract this into a "show me how difficult this running query is" API one day. But today, just plumbing it into the debugging output is good.
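As a rough illustration of how the per-driver numbers could be rolled up into a single "how much work was this" figure, here is a minimal sketch. `DriverCounters` and `WorkSummary` are hypothetical names invented for this example and are not part of Elasticsearch. It also assumes the query-level totals are simply the sum of the per-driver counters, which the description above does not state outright.

```java
import java.util.List;

// Hypothetical, simplified view of one entry in the "drivers" array.
// Only the fields relevant to this sketch are modelled.
record DriverCounters(String description, long documentsFound, long valuesLoaded) {}

final class WorkSummary {
    private WorkSummary() {}

    // Assumption: a query-level total is the sum of the per-driver counters.
    static long totalDocumentsFound(List<DriverCounters> drivers) {
        return drivers.stream().mapToLong(DriverCounters::documentsFound).sum();
    }

    static long totalValuesLoaded(List<DriverCounters> drivers) {
        return drivers.stream().mapToLong(DriverCounters::valuesLoaded).sum();
    }
}
```

With a single "data" driver reporting 5 documents and 15 values, as in the profile excerpt, both totals would simply equal that driver's counters.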

Any Operator can claim to "find documents" or "load values" by overriding a method on its Operator.Status implementation:

/**
 * The number of documents found by this operator. Most operators
 * don't find documents and will return {@code 0} here.
 */
default long documentsFound() {
    return 0;
}

/**
 * The number of values loaded by this operator. Most operators
 * don't load values and will return {@code 0} here.
 */
default long valuesLoaded() {
    return 0;
}

In this PR all of the LuceneOperators declare that each position they emit is a "document found" and the ValuesSourceReaderOperator says each value it makes is a "value loaded". That's pretty much true. The LuceneCountOperator and LuceneMinMaxOperator sort of pretend that the count/min/max that they emit is a "document" - but that's good enough to give you a sense of what's going on. It's like a document.
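To make the override pattern concrete, here is a small self-contained sketch. `SketchStatus` is a hypothetical stand-in for `Operator.Status`, reduced to the two default methods quoted above. The three record types and `StatusTotals` are likewise invented for illustration and do not mirror the real operator status classes.

```java
import java.util.List;

// Hypothetical stand-in for Operator.Status with only the two new defaults.
interface SketchStatus {
    default long documentsFound() {
        return 0;
    }

    default long valuesLoaded() {
        return 0;
    }
}

// Source-like status: every emitted position is reported as a found document.
record SourceSketchStatus(long positionsEmitted) implements SketchStatus {
    @Override
    public long documentsFound() {
        return positionsEmitted;
    }
}

// Reader-like status: every value produced is reported as a loaded value.
record ReaderSketchStatus(long valuesRead) implements SketchStatus {
    @Override
    public long valuesLoaded() {
        return valuesRead;
    }
}

// Any other status keeps both defaults and so contributes nothing.
record PassThroughSketchStatus() implements SketchStatus {}

final class StatusTotals {
    private StatusTotals() {}

    static long documentsFound(List<SketchStatus> statuses) {
        return statuses.stream().mapToLong(SketchStatus::documentsFound).sum();
    }

    static long valuesLoaded(List<SketchStatus> statuses) {
        return statuses.stream().mapToLong(SketchStatus::valuesLoaded).sum();
    }
}
```

Because the defaults return 0, a consumer can sum over every status without checking what kind of operator produced it; only the operators that opt in contribute to the totals.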

nik9000 · 2025-06-25T17:29:32Z

Backport of #125631 to 8.19 long after the fact. Needed by #128828. Needs a change to main with the new transport version. That's incoming.

nik9000 · 2025-06-25T17:30:37Z

Also needs a manual review from me to make sure the backport is truly just what it should be. It wasn't clean at all and I had to make a lot of modifications. I have to double check those are sane.

nik9000

Seems right to me modulo the three things I found. I'll fix those in a moment.

nik9000 · 2025-06-25T17:58:24Z

.../plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plugin/TransportEsqlQueryAction.java

                columns,
                result.pages(),
+                result.completionInfo().documentsFound(),
+                result.completionInfo().documentsFound(),


This looks wrong: the second argument repeats `documentsFound()`. And it's wrong in main too!

nik9000 · 2025-06-25T17:59:06Z

...ck/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/action/EsqlQueryResponseTests.java

            default -> throw new IllegalArgumentException();
-        };
+        }
+        ;


Also in main. nuking.

nik9000 · 2025-06-25T18:00:34Z

@idegtiarenko, could you have a look at this and double check it against the original PR? It looks right to me, but I'd appreciate a second set of eyes. And you are the one who needs this backport in so you get to suffer a little.

nik9000 added backport :Analytics/ES|QL AKA ESQL v8.19.0 labels Jun 25, 2025

nik9000 requested a review from idegtiarenko June 25, 2025 18:00

nik9000 added 2 commits June 25, 2025 14:40

Merge branch '8.19' into esql_documents_found_8_19

944f712

Clean

e199fc9

nik9000 marked this pull request as ready for review June 25, 2025 19:44

nik9000 added 3 commits June 25, 2025 15:55

Merge branch '8.19' into esql_documents_found_8_19

9cc6d52

Merge branch '8.19' into esql_documents_found_8_19

61ca3d7

Update docs

ab635c7

idegtiarenko approved these changes Jun 26, 2025

nik9000 merged commit 0acda3a into elastic:8.19 Jun 26, 2025
22 checks passed

nik9000 merged 6 commits into elastic:8.19 from nik9000:esql_documents_found_8_19
