[ML] Adjust jinaai rerank response parser to handle document field as string or object#136751
Conversation
|
Hi @jonathan-buttner, I've created a changelog YAML for you. |
|
Pinging @elastic/ml-core (Team:ML) |
DonalEvans
left a comment
There was a problem hiding this comment.
Just some optional formatting changes.
| @@ -77,80 +74,78 @@ public class JinaAIRerankResponseEntity { | |||
| * @throws IOException if there is an error parsing the response | |||
| */ | |||
| public static InferenceServiceResults fromResponse(HttpResult response) throws IOException { | |||
There was a problem hiding this comment.
Would it be worth updating the javadoc for this method to include the new behaviour? If you do decide to update it, putting <pre> tags around the JSON parts would be helpful, to make the javadoc more readable in the IDE. Right now, when mousing over the method name to show the javadoc, the JSON is formatted all on one line, which is very difficult to read.
| private record Response(List<ResultItem> results) { | ||
| @SuppressWarnings("unchecked") | ||
| public static final ConstructingObjectParser<Response, Void> PARSER = new ConstructingObjectParser<>( | ||
| Response.class.getSimpleName(), |
There was a problem hiding this comment.
Using getSimpleName() here will result in any errors during parsing referencing Response which is pretty vague if we need to use it for debugging. Using any of the other "name" methods would be too verbose, but is there some way we could include the name of the parent class? Or is that overkill?
There was a problem hiding this comment.
Hmm, typically we just do the getSimpleName(). I think the error will have the stacktrace though which should get us to the code that is failing 🤔
| } else { | ||
| parser.nextToken(); | ||
| } | ||
| private record ResultItem(int index, float relevance_score, @Nullable Document document) { |
There was a problem hiding this comment.
Nitpick, but relevance_score should probably be relevanceScore for code style consistency.
| private final String WASHINGTON_TEXT = "Washington, D.C.."; | ||
| private final String CAPITAL_PUNISHMENT_TEXT = | ||
| "Capital punishment has existed in the United States since before the United States was a country. "; | ||
| private final String CARSON_CITY_TEXT = "Carson City is the capital city of the American state of Nevada."; | ||
|
|
||
| private final List<RankedDocsResults.RankedDoc> responseLiteralDocsWithText = List.of( | ||
| new RankedDocsResults.RankedDoc(2, 0.98005307F, WASHINGTON_TEXT), | ||
| new RankedDocsResults.RankedDoc(3, 0.27904198F, CAPITAL_PUNISHMENT_TEXT), | ||
| new RankedDocsResults.RankedDoc(0, 0.10194652F, CARSON_CITY_TEXT) | ||
| ); |
There was a problem hiding this comment.
Could these constants be moved to the top of the class and made static? Also, if you do make them all static, for style consistency, responseLiteralDocsWithText should be in all caps.
… string or object (elastic#136751) * Fixing rerank response parser * Update docs/changelog/136751.yaml * [CI] Auto commit changes from spotless * Addressing feedback * Updating comment about response format --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
… string or object (elastic#136751) * Fixing rerank response parser * Update docs/changelog/136751.yaml * [CI] Auto commit changes from spotless * Addressing feedback * Updating comment about response format --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
… string or object (#136751) (#136762) * Fixing rerank response parser * Update docs/changelog/136751.yaml * [CI] Auto commit changes from spotless * Addressing feedback * Updating comment about response format --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
… string or object (#136751) (#136763) * Fixing rerank response parser * Update docs/changelog/136751.yaml * [CI] Auto commit changes from spotless * Addressing feedback * Updating comment about response format --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
… string or object (elastic#136751) * Fixing rerank response parser * Update docs/changelog/136751.yaml * [CI] Auto commit changes from spotless * Addressing feedback * Updating comment about response format --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co> (cherry picked from commit a1d7b8a)
💚 All backports created successfully
Questions ?Please refer to the Backport tool documentation |
… string or object (#136751) (#136765) * Fixing rerank response parser * Update docs/changelog/136751.yaml * [CI] Auto commit changes from spotless * Addressing feedback * Updating comment about response format --------- (cherry picked from commit a1d7b8a) Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
This PR adds functionality to handle the
documentfield as either an object or a string.Originally we were only accepted the response format as:
Now we also support
documentas a string: