Fix nested aggregation top_hits with query inner_hits#137351
Fix nested aggregation top_hits with query inner_hits#137351elasticsearchmachine merged 18 commits intoelastic:mainfrom
Conversation
Top_hits aggregations inside nested aggregations were inheriting the parent query's inner_hits context, causing conflicting fetch scopes and nested source extraction to fail with "Error retrieving path". When top_hits is a child of NestedAggregator, now creates an empty InnerHitsContext to prevent InnerHitsPhase from running. Closes elastic#136893
|
Hi @mayya-sharipova, I've created a changelog YAML for you. |
|
Pinging @elastic/es-search-relevance (Team:Search Relevance) |
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
|
@mayya-sharipova maybe I misunderstand how this will look for users. But as a user, I would still expect to get inner hits for both whenever I request them. One that is present within |
|
@benwtrent Thanks for your comment. This fix applies only when top_hits agg is a part of nested. When we do
|
|
@benwtrent I've checked an it looks like inner_hits for double nested fields in top_hits doesn't work (for most docs doesn't return anything). PUT comments
{
"mappings": {
"properties": {
"product": {
"type": "keyword"
},
"comments": {
"type": "nested",
"properties": {
"comment": {
"type": "text"
},
"user": {
"type" : "keyword"
},
"ratings": {
"type": "nested",
"properties": {
"rating": {
"type": "integer"
},
"user": {
"type": "keyword"
}
}
}
}
}
}
}
}
POST /comments/_search?_source=false
{
"aggs": {
"nested_ratings": {
"nested": {
"path": "comments"
},
"aggs": {
"ratings_top_hits": {
"top_hits": {
"size": 10
}
}
}
}
},
"query": {
"nested": {
"inner_hits": {},
"path": "comments",
"query": {
"match_all": {}
}
}
}
} |
19ad3ad to
62b4750
Compare
|
@benwtrent I want to review this PR, and appreciate your follow up review. Here is the summary: Currently this request fails: POST test/_search
{
"query": {
"nested": {
"inner_hits": {}, // ← Operates at parent doc scope
"path": "nested_item",
"query": { "match_all": {} }
}
},
"aggregations": {
"nested_item": {
"nested": { "path": "nested_item" }, // ← Already in nested scope
"aggregations": {
"01": {
"top_hits": {} // ← Tries to inherit inner_hits, causing conflict
}
}
}
}
}Why it fails:
The error: The system tries to extract nested documents by offset, but the offset calculation is wrong because it's applying nested logic to documents that are already in nested context. This PR addresses by preventing In our last discussion we wanted to check for second level of nestedness (If the nested field has another level of nested fields inside it, then inner_hits could be useful in top_hits). I did a check, and it is currently NOT supported in Elasticsearch : double-nestedness inner_hits don't work in |
benwtrent
left a comment
There was a problem hiding this comment.
I think this is good. We obviously cannot allow multi-layered of nested ness.
Currently `top_hits` aggs inside nested aggs inherit the parent query's inner_hits context, causing an error "Error retrieving path". This happenes because nested top_hits is already operating on nested child docs, and asking it to retrieve inner_hits adds an extra incorrect level of nestedness. This fix proposes to create an empty InnerHitsContext for this case. Closes elastic#136893
Currently
top_hitsaggs inside nested aggs inherit the parentquery's inner_hits context, causing an error "Error retrieving path".
This happenes because nested top_hits is already operating on nested child docs,
and asking it to retrieve inner_hits adds an extra incorrect level of nestedness.
This fix proposes to create an empty InnerHitsContext for this case.
Closes #136893