Avoid stack overflow in IndicesClusterStateService applyClusterState#132536
Avoid stack overflow in IndicesClusterStateService applyClusterState#132536fcofdez merged 16 commits intoelastic:mainfrom
Conversation
|
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination) |
|
Hi @albertzaharovits, I've created a changelog YAML for you. |
|
Honestly, I think I prefer that every chained listener be executed on a generic thread, for code simplicity's sake. |
DaveCTurner
left a comment
There was a problem hiding this comment.
I'd rather we didn't extend the chain in the (overwhelmingly common) case where the cluster state update doesn't close any more shards.
Also can you cover this in a test?
| lastClusterStateShardsClosedListener = new SubscribableListener<>(); | ||
| currentClusterStateShardsClosedListeners = new RefCountingListener(lastClusterStateShardsClosedListener); | ||
| try { | ||
| previousShardsClosedListener.addListener(currentClusterStateShardsClosedListeners.acquire()); |
There was a problem hiding this comment.
Hmm are you sure we should move all this listener stuff below doApplyClusterState()?
There was a problem hiding this comment.
I can't think of any impact to execution.
There was a problem hiding this comment.
But I've put it back at the original place.
Pushed 3a00599 |
f4c5977 to
b6d6742
Compare
|
@DaveCTurner can you take another look please? I've changed the code to avoid linking listeners when the applied cluster state doesn't close any shards. |
|
@elasticmachine update branch |
|
@elasticmachine test this |
|
@elasticmachine update branch |
…lastic#132536) Every cluster state applied in the IndicesClusterStateService has the potential to chain a new RefCountingListener to a chain of such listeners. If the chain is too long, the unlucky thread that decreases the ref count to 0 for the head of the listeners chain, ends up calling each listener in turn, and, assuming all ref counts are hence decreased to 0, traversing the whole chain on its thread stack, possibly resulting in a Stackoverflow exception. This fix chains max 8 RefCountingListener, the 11th one is forked on a generic thread when it gets to execution.
…rState (#139499) * Avoid stack overflow in IndicesClusterStateService applyClusterState (#132536) Every cluster state applied in the IndicesClusterStateService has the potential to chain a new RefCountingListener to a chain of such listeners. If the chain is too long, the unlucky thread that decreases the ref count to 0 for the head of the listeners chain, ends up calling each listener in turn, and, assuming all ref counts are hence decreased to 0, traversing the whole chain on its thread stack, possibly resulting in a Stackoverflow exception. This fix chains max 8 RefCountingListener, the 11th one is forked on a generic thread when it gets to execution. * MockTransportService.createNewService
…State (#139498) * Avoid stack overflow in IndicesClusterStateService applyClusterState (#132536) Every cluster state applied in the IndicesClusterStateService has the potential to chain a new RefCountingListener to a chain of such listeners. If the chain is too long, the unlucky thread that decreases the ref count to 0 for the head of the listeners chain, ends up calling each listener in turn, and, assuming all ref counts are hence decreased to 0, traversing the whole chain on its thread stack, possibly resulting in a Stackoverflow exception. This fix chains max 8 RefCountingListener, the 11th one is forked on a generic thread when it gets to execution. * MockTransportService.createNewService
Every cluster state applied in the
IndicesClusterStateServicehas the potential to chain a newRefCountingListenerto a chain of such listeners. If the chain is too long, the unlucky thread that decreases the ref count to0for the head of the listeners chain, ends up calling each listener in turn, and, assuming all ref counts are hence decreased to0, traversing the whole chain on its thread stack, possibly resulting in aStackoverflowexception.This fix chains max 8
RefCountingListener, the 11th one is forked on a generic thread when it gets to execution.