Conversation
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
|
I looked at the docs for Postgresql and a few others and it looks like the function called ASCII returns an integer that is the ascii code for the first character. I'm fine having a function that escapes non-ascii characters though. Is there a better name for it? |
Thank you @nik9000 !
Following the convention of other functions in the catalog, like |
...esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/string/Ascii.java
Outdated
Show resolved
Hide resolved
...esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/string/Ascii.java
Outdated
Show resolved
Hide resolved
...esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/string/Ascii.java
Outdated
Show resolved
Hide resolved
| formatStr = "\\\\U%08x"; | ||
| } | ||
|
|
||
| resultStr = Strings.format(formatStr, code); |
There was a problem hiding this comment.
I believe format needs Locale.ROOT on the front of it. Without it the build complains.
|
|
| @@ -2796,3 +2748,14 @@ book_no:keyword | author_encoded:keyword | title_encoded:keyword | |||
| 1463 | J.%20R.%20R.%20Tolkien | Realms%20of%20Tolkien%3A%20Images%20of%20Middle-earth | |||
| ; | |||
|
|
|||
| ascii | |||
| required_capability: ascii | |||
There was a problem hiding this comment.
Could you name the capability fn_to_ascii instead please?
| required_capability: ascii | ||
| // tag::ascii[] | ||
| ROW a = "Hello\n\t 世界! 🌍 Café naïve résumé こんにちは 🎉 中文测试 αβγδε 日本語テスト 🚀🔥💧🪨" | EVAL x = ASCII(a) | KEEP x; | ||
| // end::ascii[] |
There was a problem hiding this comment.
Because this is going to be displayed in the docs it might be nicer to format it like:
ROW s = [
"Hello world!\n\t",
"世界!",
"🌍",
"Café naïve résumé",
"こんにちは",
" 🎉",
"中文测试",
"αβγδε",
"日本語テスト"
"🚀🔥💧🪨"
]
| EVAL s = TO_ASCII(s)
Otherwise it gets really wide to fit into the screen.
There was a problem hiding this comment.
It'd be nice to also have a csv-spec test for this that uses an index. Something like:
FROM airports
| WHERE abbrev = "BTS"
| EVAL ascii_name = TO_ASCII(name)
| KEEP name, ascii_name
Maybe even:
FROM airports
| WHERE TO_ASCII(abbrev) LIKE "%Mu\u(whatever ñ is)oz%"
| KEEP name
| @ParametersFactory | ||
| public static Iterable<Object[]> parameters() { | ||
|
|
||
| List<TestCaseSupplier> cases = new ArrayList<>(); |
There was a problem hiding this comment.
The common logic in these 11 test cases could be delegated to a private method, that only needs to be passed the input, the output and the test title. Would be nicer to read.
Also there is a way to add random tests for string functions, we should probably add those here as they test for more than just input/output. An example can be found in AbstractUrlEncodeDecodeTestCase.java#L78.
...esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/string/Ascii.java
Outdated
Show resolved
Hide resolved
…ultivalued inputs
BASE=0b984b4cf53145085c87445d39a3a5c7bc37dde5 HEAD=be5beefa2a699bff708a14b10c2a58da9bc0ebb5 Branch=main
BASE=0b984b4cf53145085c87445d39a3a5c7bc37dde5 HEAD=be5beefa2a699bff708a14b10c2a58da9bc0ebb5 Branch=main
Adds
TO_ASCIIstring function to ESQL functions.It escapes:
\n,\t...Closes: #137282