ES|QL: Improve generative tests for FORK [130015]#131206
ES|QL: Improve generative tests for FORK [130015]#131206svilen-mihaylov-elastic merged 9 commits intoelastic:mainfrom
Conversation
|
Pinging @elastic/es-search-relevance (Team:Search Relevance) |
luigidellaquila
left a comment
There was a problem hiding this comment.
LGTM, thanks @svilen-mihaylov-elastic
...c/main/java/org/elasticsearch/xpack/esql/qa/rest/generative/command/pipe/WhereGenerator.java
Show resolved
Hide resolved
| final String command = current.commandString(); | ||
|
|
||
| // Try appending new command to parent of Fork. If we successfully execute (without exception) AND still retain the same | ||
| // schema (all Fork branches must have the same schema), we append the command. |
There was a problem hiding this comment.
Is this a strict constraint? Isn't it enough that there are no type conflicts between branches?
There was a problem hiding this comment.
We would generate more interesting commands if we did not have this restriction here.
Right now AFAICS when I run this, the FORK branches contain mostly WHERE/MV_EXPAND/SORT and ENRICH sometimes.
FORK branches don't need to have the same schema - we just need to be sure that if a column is present in multiple branches, it has the same data type everywhere.
There was a problem hiding this comment.
Yes correct, the schemas do not need to be exactly the same, just the overlapping columns need to have the same type.
I spend some time thinking about this and decided to go with the simple solution (for now). I think it will be challenging to allow different schemas which have compatible types AND allow for (mostly independent) fork sub-pipelines. Particularly for non-trivial subpipelines (> 5 stages), it will be non-trivial to keep adding random stages and checking if the types remain compatible. This may lead to a lot of repetitions and discarded results which will make the test run (possibly) a lot slower.
I will update the comment., and in any case, options are open later to iterate to further on this condition to balance coverage of Fork sub-pipelines and performance of the test itself.
There was a problem hiding this comment.
If I remove the check above, I get something like this:
line 1:1: Column [@timestamp] has conflicting data types in FORK branches: [UNSUPPORTED] and [KEYWORD]
line 1:940: EVAL does not support type [counter_long] as the return data type of expression [from sample_d*,*]
I ignore those errors by adding the patterns to the ALLOWED_ERRORS list. How does this sound?
...rc/main/java/org/elasticsearch/xpack/esql/qa/rest/generative/command/pipe/ForkGenerator.java
Outdated
Show resolved
Hide resolved
| final String command = current.commandString(); | ||
|
|
||
| // Try appending new command to parent of Fork. If we successfully execute (without exception) AND still retain the same | ||
| // schema (all Fork branches must have the same schema), we append the command. |
There was a problem hiding this comment.
We would generate more interesting commands if we did not have this restriction here.
Right now AFAICS when I run this, the FORK branches contain mostly WHERE/MV_EXPAND/SORT and ENRICH sometimes.
FORK branches don't need to have the same schema - we just need to be sure that if a column is present in multiple branches, it has the same data type everywhere.
💔 Backport failedThe backport operation could not be completed due to the following error: You can use sqren/backport to manually backport by running |
…king * upstream/main: (100 commits) Term vector API on stateless search nodes (elastic#129902) TEST Fix ThreadPoolMergeSchedulerStressTestIT testMergingFallsBehindAndThenCatchesUp (elastic#131636) Add inference.put_custom rest-api-spec (elastic#131660) ESQL: Fewer serverless docs in tests (elastic#131651) Skip search on indices with INDEX_REFRESH_BLOCK (elastic#129132) Mute org.elasticsearch.indices.cluster.RemoteSearchForceConnectTimeoutIT testTimeoutSetting elastic#131656 [jdk] Resolve EA OpenJDK builds to our JDK archive (elastic#131237) Add optimized path for intermediate values aggregator (elastic#131390) Correctly handling download_database_on_pipeline_creation within a pipeline processor within a default or final pipeline (elastic#131236) Refresh potential lost connections at query start for `_search` (elastic#130463) Add template_id to patterned-text type (elastic#131401) Integrate LIKE/RLIKE LIST with ReplaceStringCasingWithInsensitiveRegexMatch rule (elastic#131531) [ES|QL] Add doc for the COMPLETION command (elastic#131010) ESQL: Add times to topn status (elastic#131555) ESQL: Add asynchronous pre-optimization step for logical plan (elastic#131440) ES|QL: Improve generative tests for FORK [130015] (elastic#131206) Update index mapping update privileges (elastic#130894) ESQL: Added Sample operator NamedWritable to plugin (elastic#131541) update `kibana_system` to grant it access to `.chat-*` system index (elastic#131419) Clarify heap size configuration (elastic#131607) ...
…-tracking * upstream/main: (44 commits) Term vector API on stateless search nodes (elastic#129902) TEST Fix ThreadPoolMergeSchedulerStressTestIT testMergingFallsBehindAndThenCatchesUp (elastic#131636) Add inference.put_custom rest-api-spec (elastic#131660) ESQL: Fewer serverless docs in tests (elastic#131651) Skip search on indices with INDEX_REFRESH_BLOCK (elastic#129132) Mute org.elasticsearch.indices.cluster.RemoteSearchForceConnectTimeoutIT testTimeoutSetting elastic#131656 [jdk] Resolve EA OpenJDK builds to our JDK archive (elastic#131237) Add optimized path for intermediate values aggregator (elastic#131390) Correctly handling download_database_on_pipeline_creation within a pipeline processor within a default or final pipeline (elastic#131236) Refresh potential lost connections at query start for `_search` (elastic#130463) Add template_id to patterned-text type (elastic#131401) Integrate LIKE/RLIKE LIST with ReplaceStringCasingWithInsensitiveRegexMatch rule (elastic#131531) [ES|QL] Add doc for the COMPLETION command (elastic#131010) ESQL: Add times to topn status (elastic#131555) ESQL: Add asynchronous pre-optimization step for logical plan (elastic#131440) ES|QL: Improve generative tests for FORK [130015] (elastic#131206) Update index mapping update privileges (elastic#130894) ESQL: Added Sample operator NamedWritable to plugin (elastic#131541) update `kibana_system` to grant it access to `.chat-*` system index (elastic#131419) Clarify heap size configuration (elastic#131607) ...
Addresses #130015