ESQL: Multiple patterns for grok command by flash1293 · Pull Request #136541 · elastic/elasticsearch

flash1293 · 2025-10-14T12:30:52Z

This PR adds the ability to specify multiple grok patterns in a single GROK command. Consistent with the grok processor for ingest pipelines, the patterns are tried in order and the first one that matches is applied:

POST _query
{
  "query": """
    ROW col1="123 This is a test" | GROK col1 "%{UUID:def}", "%{WORD:xxx}"
  """
}

returns

       col1       |      def      |      xxx
------------------+---------------+---------------
123 This is a test|null           |123

The same semantic name must not have different types in different patterns:

POST _query
{
  "query": """
    ROW col1="123 This is a test" | GROK col1 "%{UUID:def}", "%{INT:def}"
  """
}

returns

{"error":{"root_cause":[{"type":"parsing_exception","reason":"line 1:33: Invalid GROK pattern [(?:%{UUID:def})|(?:%{INT:def})]: the attribute [def] is defined multiple times with different types"}],"type":"parsing_exception","reason":"line 1:33: Invalid GROK pattern [(?:%{UUID:def})|(?:%{INT:def})]: the attribute [def] is defined multiple times with different types"},"status":400}

This can be considered syntactic sugar over a more complex manual pattern.
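The error message above quotes the combined form `(?:…)|(?:…)`. A minimal sketch of that combination follows, assuming each pattern is wrapped in a non-capturing group and the groups are joined with `|`. The `combine` helper and the class name are hypothetical, and plain `java.util.regex` expressions stand in for `%{UUID:def}` and `%{WORD:xxx}`; this is not the code from the PR.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class CombineSketch {
    // Hypothetical helper: wraps each pattern in a non-capturing group and joins them with '|'.
    static String combine(List<String> patterns) {
        if (patterns.size() == 1) {
            return patterns.get(0);
        }
        return patterns.stream().map(p -> "(?:" + p + ")").collect(Collectors.joining("|"));
    }

    public static void main(String[] args) {
        // Plain-regex stand-ins (assumptions) for %{UUID:def} and %{WORD:xxx}.
        String uuid = "(?<def>[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})";
        String word = "(?<xxx>\\b\\w+\\b)";
        Matcher m = Pattern.compile(combine(List.of(uuid, word))).matcher("123 This is a test");
        if (m.find()) {
            // The UUID alternative does not match, so its group stays null.
            System.out.println("def=" + m.group("def") + " xxx=" + m.group("xxx"));
        }
    }
}
```

Run against the input from the first example, this prints `def=null xxx=123`, mirroring the result table above.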

flash1293 · 2025-10-16T14:33:46Z

Hmm, this fails BWC tests with a mixed cluster, which is expected since this is introducing a new syntax. I'm not sure how we handle this case, could you advise?

luigidellaquila

Thanks @flash1293, LGTM

I left just a couple of minor observations

x-pack/plugin/esql/qa/testFixtures/src/main/resources/grok.csv-spec

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/parser/LogicalPlanBuilder.java

elasticsearchmachine · 2025-10-16T17:30:41Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

luigidellaquila · 2025-10-17T09:12:18Z

> Hmm, this fails BWC tests with a mixed cluster, which is expected since this is introducing a new syntax. I'm not sure how we handle this case, could you advise?

Sorry, I missed this comment.
You'll have to add a new capability in EsqlCapabilities and then add required_capability: your_new_capability (lowercase) to the CSV test.

elasticsearchmachine · 2025-10-17T11:21:38Z

Hi @flash1293, I've created a changelog YAML for you.

ivancea · 2025-10-17T11:44:57Z

libs/grok/src/main/java/org/elasticsearch/grok/Grok.java

+        if (patterns.size() > 1) {
+            combinedPattern = "";
+            for (int i = 0; i < patterns.size(); i++) {
+                String pattern = patterns.get(i);


A bit of an edge case, but what would happen if some pattern is invalid? For example:

%{WORD:word}\

a)

%{WORD:word}

The final result would be: (?:%{WORD:word}\)|(?:a))|(?:%{WORD:word})

Which leads to unexpected patterns.

This is user-made, so I don't think this is very problematic, but this looks like a grok injection and the method combinePatterns() would actually be lying here.

So, solutions:

- Can we verify and warn on wrong patterns? Is that something we do in some other case?
- Can we sanitize or remove invalid patterns?

Whatever the resolution, we'll need tests for this, both in Grok and in ESQL, depending on what we do.
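The leak described in this comment can be reproduced outside of grok. The sketch below is an illustration under stated assumptions: it uses `java.util.regex` rather than the regex engine behind the Grok library, `\w+` stands in for `%{WORD:word}`, and `combine` and `compiles` are hypothetical helpers that only mimic the `(?:…)|(?:…)` wrapping quoted above.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class LeakSketch {
    // True if the regex compiles on its own.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Hypothetical helper: same wrapping as the combined form quoted in the comment.
    static String combine(List<String> patterns) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < patterns.size(); i++) {
            if (i > 0) {
                sb.append('|');
            }
            sb.append("(?:").append(patterns.get(i)).append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // \w+ is a stand-in (assumption) for %{WORD:word}.
        List<String> patterns = List.of("\\w+\\", "a)", "\\w+");
        String combined = combine(patterns);
        System.out.println(combined);
        System.out.println("first alone:  " + compiles(patterns.get(0)));
        System.out.println("second alone: " + compiles(patterns.get(1)));
        System.out.println("combined:     " + compiles(combined));
    }
}
```

With `java.util.regex`, the first two fragments are rejected on their own (a trailing backslash and an unmatched closing parenthesis), while the combined string compiles because the escaped `\)` and the stray `)` rebalance the groups.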

That's a fun one. I tested this with ingest pipelines and it indeed works there:

POST /_ingest/pipeline/_simulate
{
  "docs": [
    { "_source": { "foo": "Test)x" } }
  ],
  "pipeline": {
    "processors": [
      {
        "grok": {
          "field": "foo",
          "patterns": [ "%{WORD:word}\\", "x)" ]
        }
      }
    ]
  }
}

This should fail because %{WORD:word}\\ and x) are not valid patterns, but it passes without any warning.

I don't feel strongly about this case. We could pass the individual patterns down further and then validate them in the Grok lib, but:

- It seems like a lot of work for an edge case
- It's existing behavior in ingest pipelines
- We would need to compile each regex individually, which sounds like it would add a bunch of additional computation we are avoiding now

I would lean towards leaving the current behavior, wdyt?
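For reference, the per-pattern validation weighed here could look roughly like the sketch below. It is only an illustration of the trade-off: `invalidIndexes` is a hypothetical helper, `java.util.regex` stands in for the grok engine, and `\w+` stands in for `%{WORD:word}`. The cost mentioned above shows up as one extra compile per pattern, on top of compiling the combined expression.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class ValidateEachSketch {
    // Hypothetical helper: returns the indexes of patterns that fail to compile on their own.
    static List<Integer> invalidIndexes(List<String> patterns) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < patterns.size(); i++) {
            try {
                Pattern.compile(patterns.get(i)); // one extra compile per pattern
            } catch (PatternSyntaxException e) {
                bad.add(i);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Stand-ins (assumptions) for the patterns from the simulate request above.
        System.out.println(invalidIndexes(List.of("\\w+\\", "x)", "\\w+")));
    }
}
```

A caller could reject the query when the returned list is non-empty, or only emit a warning; which of the two is appropriate is exactly the open question in this thread.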

As the code was centralized, I guess the pipeline uses (and already used) the same logic right?

Being pragmatic, validating multiple patterns should error (probably?), which means that validating a single pattern should error too. So technically this is existing behavior, and fixing would be a breaking change (or a bug fix).
So yeah, let's just open an issue explaining this case (for ESQL at least), and continue with this.

> As the code was centralized, I guess the pipeline uses (and already used) the same logic right?

Exactly!

> So technically this is existing behavior, and fixing would be a breaking change

I never get tired of quoting Hyrum's Law 🙂

Filed #136750

multiple patterns for grok

f39c656

elasticsearchmachine added the v9.3.0 and external-contributor labels Oct 14, 2025

elasticsearchmachine and others added 6 commits October 14, 2025 12:38

[CI] Auto commit changes from spotless

b9c1bce

remove unused files

9f4174e

Merge branch 'flash1293/grok-multiple-patterns' of github.com:flash12…

2cd7ffe

…93/elasticsearch into flash1293/grok-multiple-patterns

Merge remote-tracking branch 'upstream/main' into flash1293/grok-mult…

405c40b

…iple-patterns

Merge remote-tracking branch 'upstream/main' into flash1293/grok-mult…

0a8bdf3

…iple-patterns

add some tests and stuff

3de1466

flash1293 marked this pull request as ready for review October 16, 2025 14:10

elasticsearchmachine added the needs:triage label Oct 16, 2025

[CI] Auto commit changes from spotless

f0d7429

luigidellaquila approved these changes Oct 16, 2025

benwtrent added the :Analytics/ES|QL label and removed the needs:triage label Oct 16, 2025

elasticsearchmachine added the Team:Analytics label Oct 16, 2025

flash1293 added 2 commits October 17, 2025 11:53

Merge remote-tracking branch 'upstream/main' into flash1293/grok-mult…

8a90d88

…iple-patterns

review comments

e85d048

flash1293 added the >enhancement label Oct 17, 2025

Update docs/changelog/136541.yaml

b23d798

ivancea reviewed Oct 17, 2025

flash1293 mentioned this pull request Oct 17, 2025

ESQL/Ingest pipeline: Multiple grok patterns can leak into each other #136750

Open

flash1293 merged commit 96396b4 into elastic:main Oct 17, 2025
34 checks passed

luigidellaquila mentioned this pull request Oct 24, 2025

ES|QL: Validate multiple GROK patterns individually #137082

Merged

mdbirnstiehl mentioned this pull request Nov 12, 2025

[Streams] Update manual pipeline configuration processor elastic/docs-content#3915

Closed

flash1293 merged 11 commits into elastic:main from flash1293:flash1293/grok-multiple-patterns


5 participants
