-
Notifications
You must be signed in to change notification settings - Fork 28.7k
feat(turbopack): turbo-esregex support captures method #81119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: canary
Are you sure you want to change the base?
feat(turbopack): turbo-esregex support captures method #81119
Conversation
Allow CI Workflow Run
Note: this should only be enabled once the PR is ready to go and can only be enabled by a maintainer |
2 similar comments
Allow CI Workflow Run
Note: this should only be enabled once the PR is ready to go and can only be enabled by a maintainer |
Allow CI Workflow Run
Note: this should only be enabled once the PR is ready to go and can only be enabled by a maintainer |
CodSpeed Performance ReportMerging #81119 will not alter performanceComparing Summary
Benchmarks breakdown
|
match &self.delegate { | ||
EsRegexImpl::Regex(r) => r.captures(haystack).map(|caps| { | ||
caps.iter() | ||
.map(|m| m.map(|m| m.as_str().to_string()).unwrap_or_default()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can avoid the .to_string()
conversion and make the return type of this function Option<Vec<&'h str>>
(where 'h
is the lifetime of haystack
). It seems like both regex
and regress
support such an API. The captures are just slices of the haystack.
EsRegexImpl::Regex(r) => r.captures(haystack).map(|caps| { | ||
caps.iter() | ||
.map(|m| m.map(|m| m.as_str().to_string()).unwrap_or_default()) | ||
.collect() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional: Instead of collect()
ing here, it might be neat to try to make the return type an impl Iterator
:
pub fn captures<'h>(&self, haystack: &'h str) -> Option<impl Iterator<&'h str>>
The iterator types for regex
and regress
are different, but we often use the either
crate to work around this:
let group_result = | ||
std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| { | ||
m.group(group_index) | ||
})); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We absolutely cannot do this:
- Panic handling in Rust is very slow, and should only be used for codepaths you expect to never be hit. This is especially true for library code (which this is).
- Panics (even if caught) trigger the global panic hooks, which we use in Next.js for things like cache invalidation and anonymized telemetry.
It looks like regress::Match
has a groups
function that returns an iterator: https://docs.rs/regress/latest/regress/struct.Match.html#method.groups
Could you use that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see inline comments
EsRegex support captures method, in utoo situation, we need Esregexp to support captures the match string from user's regex config item.