Improve scalability of get-license action #134457
Today this action runs on the transport worker thread and forwards the request on to the master by default. It turns out that Elastic Agent uses this API as a readiness check whenever opening a new connection, so a thundering herd of thousands of agents can prevent the transport worker threads from doing more useful work for far too long, leading to high latency and timeouts. This commit changes the default behaviour to run the action on the local node rather than forwarding to the master (although the option remains to specify `?local=false`) and dispatches the work off the transport worker thread early.
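As a rough illustration of what "dispatching the work off of the transport worker" means in general terms, here is a minimal sketch using only `java.util.concurrent`. It is not the code from this PR: `ForkOffWorkerSketch`, `dispatch`, and the `management-sketch` thread name are hypothetical, and the fixed pool only stands in for whichever executor the real handler forks to.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

final class ForkOffWorkerSketch {

    // Hand the work to a separate pool so the calling (I/O) thread returns
    // immediately and stays free to serve other connections.
    static <T> CompletableFuture<T> dispatch(ExecutorService pool, Supplier<T> work) {
        return CompletableFuture.supplyAsync(work, pool);
    }

    // Runs a trivial task through dispatch() and reports which thread executed it.
    static String handlingThreadName() {
        // Hypothetical stand-in pool; every thread it creates has the same name.
        ExecutorService pool = Executors.newFixedThreadPool(2, runnable -> {
            Thread thread = new Thread(runnable, "management-sketch");
            thread.setDaemon(true);
            return thread;
        });
        try {
            return dispatch(pool, () -> Thread.currentThread().getName()).join();
        } finally {
            pool.shutdown();
        }
    }
}
```

The caller blocks on `join()` here only to make the result observable in a small example; a real handler would attach a callback instead so that the I/O thread never waits.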
Pinging @elastic/es-security (Team:Security)
Hi @DaveCTurner, I've created a changelog YAML for you.
The default for the `?local` parameter to the `GET _license` API changed from `false` to `true` in elastic/elasticsearch#134457. This commit adjusts the documentation to match.
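For clients that still want the previous master-forwarding behaviour, the request has to set the parameter explicitly. A minimal sketch with the JDK's `java.net.http` API follows; the class and method names are hypothetical, the base URL is an assumption, and authentication is omitted.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class LicenseRequestSketch {

    // Builds a GET _license request. Passing null for `local` leaves the
    // parameter off, so the server-side default applies (now `true`).
    static HttpRequest getLicense(String baseUrl, Boolean local) {
        String query = local == null ? "" : "?local=" + local;
        return HttpRequest.newBuilder(URI.create(baseUrl + "/_license" + query))
            .GET()
            .build();
    }
}
```

Calling `getLicense("http://localhost:9200", false)` targets `/_license?local=false`, which asks the receiving node to forward the request to the master as before.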
ywangd left a comment
LGTM
I think this means a default GetLicense request now forks twice: first in its REST handler and a second time in TransportMasterNodeAction. I think that's fine; I just wanted to call it out explicitly.
Yes you're right; I'd forgotten about that second dispatch in It concerns me slightly that any non-local
* Update get-license default for ?local
  The default for the `?local` parameter to the `GET _license` API changed from `false` to `true` in elastic/elasticsearch#134457. This commit adjusts the documentation to match.
* Explain false