@alecthomas
Owner

The `<analyse>` element contains a regex to match against the input, and a score if the pattern matches.

The scores of all matching patterns for a lexer are summed.

Replaces #815, #813 and #826.
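
The summing behaviour described above can be sketched in Go roughly as follows. This is only an illustration, not the code merged in this PR: `rule`, `newRule`, and `sumScores` are hypothetical names, and the sample patterns are borrowed from later in this thread.

```go
package main

import (
	"fmt"
	"regexp"
)

// rule is a hypothetical stand-in for one regex entry inside <analyse>:
// a compiled pattern plus the score it contributes when it matches.
type rule struct {
	re    *regexp.Regexp
	score float64
}

// newRule compiles pattern and pairs it with score (hypothetical helper).
func newRule(pattern string, score float64) rule {
	return rule{re: regexp.MustCompile(pattern), score: score}
}

// sumScores adds up the score of every rule whose pattern matches text.
func sumScores(rules []rule, text string) float64 {
	total := 0.0
	for _, r := range rules {
		if r.re.MatchString(text) {
			total += r.score
		}
	}
	return total
}

func main() {
	rules := []rule{
		newRule(`(?m)^\s*#include <`, 0.1),
		newRule(`using namespace `, 0.1),
	}
	// Both patterns match here, so both scores contribute to the total.
	fmt.Println(sumScores(rules, "#include <vector>\nusing namespace std;\n"))
}
```

With this approach, a file that triggers several patterns ends up with a higher score than one that triggers only a single pattern.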

@alecthomas alecthomas merged commit a20cd7e into master Aug 21, 2023
@alecthomas alecthomas deleted the aat/xml-analyse branch August 21, 2023 19:32
@gandarez
Contributor

Why are the scores summed instead of returning early on the first match? That won't work for every lexer unless there is some control flow that lets the developer decide whether the scores should be added or not.

For example, C# ASPX:

if csharpAspxAnalyzerPageLanguageRe.MatchString(text) {
	return 0.2
}

if csharpAspxAnalyzerScriptLanguageRe.MatchString(text) {
	return 0.15
}

return 0
@gandarez
Contributor

Pygments ex1 and ex2

@alecthomas
Owner Author

Summing the scores seems like a generally more useful approach to me. For the two PRs you've sent, it makes no sense to exit early. If a file has both `#include <` and `using namespace`, those are strong signals.

If you'd like different behaviour, send a PR. I could see an extra attribute like `single="true"` being useful.

@gandarez
Contributor

I think it would be better if the attribute lived on the root `analyse` node, so my suggestion is to change it like this. What do you think?

<config>
  <analyse single="true">
    <regex value="(?m)^\s*#include &lt;" score="0.1" />
    <regex value="(?m)^\s*#ifn?def " score="0.1" />
  </analyse>
</config>
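
To make the proposal concrete, here is a minimal Go sketch of how a `single` flag on the root node might change the evaluation. It assumes that `single="true"` means "return the score of the first matching pattern" and that the default sums all matches. The names `rule`, `newRule`, and `analyse` are hypothetical and not taken from the actual implementation.

```go
package main

import (
	"fmt"
	"regexp"
)

// rule is a hypothetical stand-in for one <regex> child of <analyse>.
type rule struct {
	re    *regexp.Regexp
	score float64
}

// newRule compiles pattern and pairs it with score (hypothetical helper).
func newRule(pattern string, score float64) rule {
	return rule{re: regexp.MustCompile(pattern), score: score}
}

// analyse scores text against rules in order. When single is true it returns
// the score of the first matching rule (assumed meaning of single="true");
// otherwise it sums the scores of all matching rules.
func analyse(rules []rule, single bool, text string) float64 {
	total := 0.0
	for _, r := range rules {
		if !r.re.MatchString(text) {
			continue
		}
		if single {
			return r.score // early return: later rules are ignored
		}
		total += r.score
	}
	return total
}

func main() {
	rules := []rule{
		newRule(`(?m)^\s*#include <`, 0.1),
		newRule(`(?m)^\s*#ifn?def `, 0.1),
	}
	text := "#include <stdio.h>\n#ifdef DEBUG\n#endif\n"
	fmt.Println("single:", analyse(rules, true, text))
	fmt.Println("summed:", analyse(rules, false, text))
}
```

Under this assumption, rule order matters in single mode, which mirrors the ordered `if ... return` checks in the C# ASPX example earlier in the thread.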
@alecthomas
Owner Author

Yeah great idea!
