-
Notifications
You must be signed in to change notification settings - Fork 25.8k
Description
Because MV_EXPAND generally cannot be pushed down past a SORT, it can cause situations where a SORT/OrderBy node cannot be combined with a downstream LIMIT node. When we cannot combine an OrderBy with a Limit, it gets mapped to a OrderExec - a physical plan node that we cannot map to an operator because we don't implement unbounded sorts (sorts without a built-in limit, i.e. TopNs).
In some cases, queries with MV_EXPAND and SORT just fail with the cryptic message unknown physical plan node [OrderExec]. In other cases, the query is optimized to make it run somehow, but I believe this is not actually correct for all cases.
I think this requires:
- A better error message. In fact, I think we shouldn't ever map to
OrderExec. We probably should implementPostOptimizationVerificationAwareon theOrderByclass and have the verification fail if a node is even present after optimization, ideally with a suggestion on what a user can try to do. - Double checking the rule
AddDefaultTopNfor correctness.
In the following examples, consider a simple index test with fields field1 and field2.
Example 1
from test* | sort field1 | mv_expand field1 | sort field2 | limit 1
This should be impossible to run, but the query does not fail. That's because AddDefaultTopN turns the OrderBy into a TopN with limit 10000:
[2025-01-24T15:33:30,464][TRACE][o.e.x.e.o.LogicalPlanOptimizer] [runTask-0] Rule logical.AddDefaultTopN applied
TopN[[Order[field2{f}#26,ASC,LAST]],2[INTEGER]] = TopN[[Order[field2{f}#26,ASC,LAST]],2[INTEGER]]
\_MvExpand[field1{f}#25,field1{r}#27,null] = \_MvExpand[field1{f}#25,field1{r}#27,null]
\_OrderBy[[Order[field1{f}#25,ASC,LAST]]] ! \_TopN[[Order[field1{f}#25,ASC,LAST]],10000[INTEGER]]
\_EsRelation[test*][field1{f}#25, field2{f}#26] = \_EsRelation[test*][field1{f}#25, field2{f}#26]
This is simply wrong in cases where the top 10k documents by field1 do not include the single top 1 document by field2. However, it would be correct to eliminate the sort field1 altogether, because after the mv_expand field1 we sort again, anyway. It'd also be correct to exchange the positions of the mv_expand field1 with sort field2 because sorting by a non-expanded field is unaffected by the expansion. That means the query is equivalent to
from test* | sort field1 | sort field2 | mv_expand field1| limit 1
where the two sorts next to each other would be pruned by PushDownAndCombineOrderBy. This would end up performing the TopN operation before the mv_expand, allowing us to push this down to Lucene and making the query fast.
Example 2
from test* | sort field1 | mv_expand field2 | sort field2 | limit 1
Again, AddDefaultTopN wrongly discards anything other than the top 10k documents by field1:
[2025-01-24T15:40:45,503][TRACE][o.e.x.e.o.LogicalPlanOptimizer] [runTask-0] Rule logical.AddDefaultTopN applied
TopN[[Order[field2{r}#35,ASC,LAST]],2[INTEGER]] = TopN[[Order[field2{r}#35,ASC,LAST]],2[INTEGER]]
\_MvExpand[field2{f}#34,field2{r}#35,null] = \_MvExpand[field2{f}#34,field2{r}#35,null]
\_OrderBy[[Order[field1{f}#33,ASC,LAST]]] ! \_TopN[[Order[field1{f}#33,ASC,LAST]],10000[INTEGER]]
\_EsRelation[test*][field1{f}#33, field2{f}#34] = \_EsRelation[test*][field1{f}#33, field2{f}#34]
In this case, it'd be correct to push down mv_expand field2 past sort field1. If we did that, we'd have two sorts next to each other, essentially as if we had written
from test* | mv_expand field2 | sort field1 | sort field2 | limit 1
and in this case, the sort field1 gets correctly eliminated:
[2025-01-24T15:45:03,381][TRACE][o.e.x.e.o.LogicalPlanOptimizer] [runTask-0] Rule logical.PushDownAndCombineOrderBy applied
Limit[1[INTEGER]] = Limit[1[INTEGER]]
\_OrderBy[[Order[field2{r}#43,ASC,LAST]]] = \_OrderBy[[Order[field2{r}#43,ASC,LAST]]]
\_OrderBy[[Order[field1{f}#41,ASC,LAST]]] ! \_MvExpand[field2{f}#42,field2{r}#43,null]
\_MvExpand[field2{f}#42,field2{r}#43,null] ! \_EsRelation[test*][field1{f}#41, field2{f}#42]
\_EsRelation[test*][field1{f}#41, field2{f}#42] !
Example 3
from test* | sort field1 | mv_expand field1 | sort field1 | limit 1
It'd be correct to eliminate the first sort field1. It'd be incorrect per se to exchange the mv_expand field1 with any of the sorts left or right of it to achieve that.
Example 4
from test* | sort field1 | mv_expand field2 | sort field1 | limit 1
It'd be correct to eliminate the first or second sort field1. It'd be correct per se to exchange the mv_expand field1 with any of the sorts left or right of it to achieve that. But eliminating/pushgin down the rightmost sort field1 would lead to a potentially more optimial plan due to being able to push down a Top1 to Lucene.
Example 5
from test* | limit 10000 | sort field1 | mv_expand field2 | sort field2 | limit 2
In this case, there's not even AddDefaultTopN jumping in to the rescue (even though it did so wrongly in Example 2):
"error":{"root_cause":[{"type":"esql_illegal_argument_exception","reason":"unknown physical plan node [OrderExec]"}],"type":"esql_illegal_argument_exception","reason":"unknown physical plan node [OrderExec]"},"status":500}
Similarly to Example 2, the first sort field1 is actually fine to eliminate.
Example 6
from test* | limit 10000 | sort field1 | mv_expand field1 | sort field2 | limit 2
The same as in Example 5, we get an illegal argument exception with a 500. But again, the sort field1 is redundant.