Description
Elasticsearch Version
8.18.0
Installed Plugins
No response
Java Version
bundled
OS Version
N/A
Problem Description
When using the _search endpoint, a timeout during the fetch phase with allow_partial_search_results=true results in a java.lang.ArrayIndexOutOfBoundsException: Index 0 out of bounds for length 0.
If a shard times out during the fetch phase there are two code paths based on the allow_partial_search_results value.
If allow_partial_search_results is set to false, the shard will throw a SearchTimeoutException that is handled by the coordinating node to fail the search.
If instead allow_partial_search_results is set to true, the shard returns an empty SearchHits[]. The coordinating node, however, expects more than zero SearchHits from that shard. This mismatch between the two data structures causes an ArrayIndexOutOfBoundsException during the final merge phase, when the coordinating node merges the documents fetched from the different shards to build the final response to the user.
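To make the mismatch concrete, here is a minimal, self-contained sketch of the failure mode. It is a simplified model, not the Elasticsearch source: `FetchMergeSketch`, `RankedDoc`, `merge` and the `guarded` flag are hypothetical names, and the guarded branch is only one possible way to tolerate a short hit array, not necessarily the right fix.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model of the coordinating-node merge; NOT the Elasticsearch source.
public class FetchMergeSketch {

    // A document picked during the query phase, tagged with the shard that owns it.
    record RankedDoc(int shardIndex, String id) {}

    // Walks the globally sorted docs and takes the next fetched hit from each doc's shard.
    static List<String> merge(List<RankedDoc> sortedDocs,
                              Map<Integer, String[]> hitsByShard,
                              boolean guarded) {
        Map<Integer, Integer> consumed = new HashMap<>();
        List<String> merged = new ArrayList<>();
        for (RankedDoc doc : sortedDocs) {
            String[] shardHits = hitsByShard.get(doc.shardIndex());
            if (shardHits == null) {
                continue; // no fetch result at all for this shard
            }
            // Position of the next unread hit for this shard (0 on first use).
            int index = consumed.merge(doc.shardIndex(), 1, Integer::sum) - 1;
            if (guarded && index >= shardHits.length) {
                continue; // shard returned fewer hits than expected (e.g. it timed out)
            }
            merged.add(shardHits[index]); // unguarded: fails when shardHits is empty
        }
        return merged;
    }

    static List<RankedDoc> sampleDocs() {
        return List.of(new RankedDoc(0, "a"), new RankedDoc(1, "b"), new RankedDoc(0, "c"));
    }

    static Map<Integer, String[]> sampleHits() {
        // Shard 1 timed out during fetch and returned an empty array.
        return Map.of(0, new String[] {"a", "c"}, 1, new String[0]);
    }

    static boolean unguardedMergeThrows() {
        try {
            merge(sampleDocs(), sampleHits(), false);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("unguarded merge throws: " + unguardedMergeThrows());
        System.out.println("guarded merge result: " + merge(sampleDocs(), sampleHits(), true));
    }
}
```

In this model the query phase still lists document "b" on shard 1, so the unguarded merge reads index 0 of an empty array, which matches the shape of the exception in the logs.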
Probably introduced by: #116676
Steps to Reproduce
A search query with allow_partial_search_results=true and a small timeout (a few milliseconds) can trigger this, but it is far from deterministic. Adding extra work (e.g. aggregations) gives more chances to reproduce this behaviour.
Because this is a timeout, it also depends on how fast the hardware is and on other external factors.
GET index_name/_search?allow_partial_search_results=true
{
  "timeout": "10ms",
  "query": {
    "match_all": {}
  },
  "aggs": {
    "code": {
      "terms": {
        "field": "code"
      }
    }
  }
}
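Since a single request rarely hits the timeout at the right moment, it may help to send the request above in a loop. The following is a rough sketch using the JDK's built-in HTTP client (Java 17 or later). It assumes an unsecured cluster at http://localhost:9200, uses POST (which _search also accepts) instead of GET, and assumes the failure surfaces as a non-200 response; `PartialResultsRepro` and `buildRequest` are hypothetical names, and the 1000-attempt limit is arbitrary.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PartialResultsRepro {

    // Same body as the request above.
    static final String BODY = """
        {
          "timeout": "10ms",
          "query": { "match_all": {} },
          "aggs": { "code": { "terms": { "field": "code" } } }
        }
        """;

    // Builds the _search request; baseUrl and index are supplied by the caller.
    static HttpRequest buildRequest(String baseUrl, String index) {
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/" + index + "/_search?allow_partial_search_results=true"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(BODY))
            .build();
    }

    public static void main(String[] args) throws InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        // Assumed address and index name; adjust for the cluster under test.
        HttpRequest request = buildRequest("http://localhost:9200", "index_name");
        try {
            for (int attempt = 1; attempt <= 1000; attempt++) {
                HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
                if (response.statusCode() != 200) {
                    // Assumption: the fetch-phase failure is reported as a non-200 response.
                    System.out.println("Attempt " + attempt + " failed: " + response.body());
                    return;
                }
            }
            System.out.println("No failure observed in 1000 attempts");
        } catch (IOException e) {
            System.out.println("Could not reach the cluster: " + e);
        }
    }
}
```

Whether the loop ever hits the failure still depends on index size, hardware speed and the timeout value, so a run without failures does not show that the bug is absent.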
Logs (if relevant)
Failed to execute phase [fetch],
at org.elasticsearch.server@8.18.0/org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:723)
at org.elasticsearch.server@8.18.0/org.elasticsearch.action.search.FetchSearchPhase$1.onFailure(FetchSearchPhase.java:103)
at org.elasticsearch.server@8.18.0/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:29)
at org.elasticsearch.server@8.18.0/org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:34)
at org.elasticsearch.server@8.18.0/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1044)
at org.elasticsearch.server@8.18.0/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1095)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:619)
at java.base/java.lang.Thread.run(Thread.java:1447)
Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 0 out of bounds for length 0
at org.elasticsearch.server@8.18.0/org.elasticsearch.action.search.SearchPhaseController.getHits(SearchPhaseController.java:463)
at org.elasticsearch.server@8.18.0/org.elasticsearch.action.search.SearchPhaseController.merge(SearchPhaseController.java:376)
at org.elasticsearch.server@8.18.0/org.elasticsearch.action.search.FetchSearchPhase.lambda$moveToNextPhase$4(FetchSearchPhase.java:279)
at org.elasticsearch.server@8.18.0/org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:421)
at org.elasticsearch.server@8.18.0/org.elasticsearch.action.search.FetchSearchPhase.moveToNextPhase(FetchSearchPhase.java:278)
at org.elasticsearch.server@8.18.0/org.elasticsearch.action.search.FetchSearchPhase.innerRun(FetchSearchPhase.java:122)
at org.elasticsearch.server@8.18.0/org.elasticsearch.action.search.FetchSearchPhase$1.doRun(FetchSearchPhase.java:98)
at org.elasticsearch.server@8.18.0/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
... 6 more