[ES|QL] Enhances grok semantics extraction with Oniguruma regex pattern #229409

Merged
stratoula merged 2 commits into elastic:main from stratoula:enhance-grok-onigurama-patterns
Jul 28, 2025

@stratoula stratoula commented Jul 25, 2025

Summary

Closes #229195

This enhances the grok semantics extraction by taking Oniguruma's named-capture syntax into consideration. I tested it with the examples from https://www.elastic.co/guide/en/elasticsearch/reference/8.18/esql-process-data-with-dissect-and-grok.html#esql-grok-regex

It definitely covers more cases than before. Does it cover everything? I am not sure, but it is definitely an improvement.
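
To illustrate what extracting the "semantics" of a grok pattern can look like, here is a minimal sketch that collects column names from both the classic `%{SYNTAX:SEMANTIC}` form and Oniguruma named groups. The helper name `extractGrokColumns` and both regexes are assumptions made for this example; this is not the code changed in the PR.

```typescript
// Hypothetical helper for illustration only, not the Kibana implementation.
function extractGrokColumns(pattern: string): string[] {
  const columns: string[] = [];
  let match: RegExpExecArray | null;

  // Classic grok form: %{SYNTAX:SEMANTIC} or %{SYNTAX:SEMANTIC:TYPE}.
  // A bare %{SYNTAX} has no semantic and therefore yields no column.
  const grokRegex = /%\{\w+:([\w.]+)(?::\w+)?\}/g;
  while ((match = grokRegex.exec(pattern)) !== null) {
    columns.push(match[1]);
  }

  // Oniguruma named groups: (?<name>...) or (?'name'...).
  const namedGroupRegex = /\(\?(?:<(\w+)>|'(\w+)')/g;
  while ((match = namedGroupRegex.exec(pattern)) !== null) {
    // Exactly one of the two inner groups is set, depending on the syntax used.
    columns.push(match[1] ?? match[2]);
  }

  return columns;
}

// Example input mixing both styles.
const columns = extractGrokColumns("%{IP:client.ip} %{NUMBER:bytes:int} (?<status>\\d{3})");
console.log(columns);
```

Note that this sketch lists all `%{...}` columns first and named groups second, so the result is not in pattern order.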

Checklist

@stratoula stratoula added the Feature:ES|QL, Team:ESQL, t//, v9.2.0, release_note:enhancement, and backport:skip labels Jul 25, 2025
@stratoula stratoula marked this pull request as ready for review July 25, 2025 09:26
@stratoula stratoula requested a review from a team as a code owner July 25, 2025 09:27
@elasticmachine

Pinging @elastic/kibana-esql (Team:ESQL)

@elasticmachine

⏳ Build in-progress, with failures

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #50 / Cloud Security Posture Test adding Cloud Security Posture Integrations KSPM K8S KSPM K8S KSPM K8S Workflow

@bartoval bartoval left a comment

// Regex for Oniguruma-style named capture groups (?<name>...) or (?'name'...)
// Oniguruma supports both `?<name>` and `?'name'` for named capture groups.
const onigurumaNamedCaptureRegex = /(?<column>\(\?<(\w+)>|\(\?'(\w+)'\)[^)]*\))/g;

Just out of curiosity: is (?<column> necessary, or could the regex simply be /\(\?<(\w+)>|\(\?'(\w+)'[^)]*\)/ ? I see that we read onigurumaMatch[2] and onigurumaMatch[3], hence my question.

@stratoula stratoula Jul 28, 2025

The named group (?<column>...) is useful for direct named access to a specific part of a match, but it is not strictly necessary. Since it is useful, I will keep it.
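
For readers following this thread, the sketch below shows how a wrapping named group changes the way a match is read: the wrapper becomes group 1 (also reachable by name), which pushes the two inner capture names to indices 2 and 3. The regex here is a simplified stand-in written for this example, and `findNamedCaptures` and `CaptureInfo` are hypothetical names; none of it is the code from the PR.

```typescript
// Illustrative only: a simplified pattern, not the regex merged in this PR.
interface CaptureInfo {
  token: string; // the opening token as written, e.g. "(?<ip>"
  name: string; // the capture name, e.g. "ip"
}

function findNamedCaptures(pattern: string): CaptureInfo[] {
  // Group 1 is the named wrapper `column`; groups 2 and 3 hold the name
  // for the (?<name> and (?'name' syntaxes respectively.
  const namedCapture = /(?<column>\(\?<(\w+)>|\(\?'(\w+)')/g;
  const found: CaptureInfo[] = [];
  let match: RegExpExecArray | null;

  while ((match = namedCapture.exec(pattern)) !== null) {
    found.push({
      // Named access: no need to remember which index the wrapper has.
      token: match.groups?.column ?? match[0],
      // Indexed access: only one of the two alternatives participates.
      name: match[2] ?? match[3],
    });
  }
  return found;
}

console.log(findNamedCaptures("(?<ip>\\d+) (?'status'\\d{3})"));
```

Dropping the wrapper would shift the inner names to indices 1 and 2, and the whole token would still be available as `match[0]`, which is why the named group is a convenience rather than a requirement.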

@stratoula stratoula merged commit 94b3c96 into elastic:main Jul 28, 2025
12 checks passed
delanni pushed a commit to delanni/kibana that referenced this pull request Aug 5, 2025
…rn (elastic#229409)

## Summary

Closes elastic#229195

This enhances the grok semantics extraction by taking Oniguruma's
named-capture syntax into consideration. I tested it with the examples from
https://www.elastic.co/guide/en/elasticsearch/reference/8.18/esql-process-data-with-dissect-and-grok.html#esql-grok-regex

It definitely covers more cases than before. Does it cover everything? I
am not sure, but it is definitely an improvement.

<img width="698" height="117" alt="image"
src="https://github.com/user-attachments/assets/5e44fc48-4a62-4e42-9a9e-56bd7be80851"
/>


### Checklist
- [ ] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios