Description
Elasticsearch Version
8.14
Installed Plugins
No response
Java Version
bundled
OS Version
any
Problem Description
ML trained models can be deployed in a low priority mode, which should be limited to 1 allocation and 1 thread per allocation. When checking whether there is sufficient CPU to deploy a model, the assignment planner allows multiple low priority deployments to share a single CPU, which allows low priority deployments to be over-allocated.
The problem is that the number of allocations can later be updated to a much higher number and, because the assignment planner treats low priority deployments differently, it does not consider those extra allocations when calculating the available resources. Multiple low priority deployments can be created and then updated to use far more CPU than is available in the cluster. In Cloud, low priority deployments do not trigger a scale event, so the cluster will not grow and can become extremely over-allocated.
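To make the accounting gap concrete, here is a toy model in Python. It is not the assignment planner's code: the `Deployment` class and both functions are hypothetical, and the rule that all low priority deployments together count as one shared core is a simplifying assumption based on the description above.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    name: str
    priority: str  # "low" or "normal"
    allocations: int
    threads_per_allocation: int = 1


def required_cores(d: Deployment) -> int:
    """Cores a deployment would need if every allocation had its own threads."""
    return d.allocations * d.threads_per_allocation


def planner_counted_cores(deployments: list[Deployment]) -> int:
    """Assumed planner view: low priority deployments share a single core,
    so any extra allocations on them are not counted."""
    normal = sum(required_cores(d) for d in deployments if d.priority != "low")
    has_low = any(d.priority == "low" for d in deployments)
    return normal + (1 if has_low else 0)


def actual_required_cores(deployments: list[Deployment]) -> int:
    """Cores actually requested across all deployments."""
    return sum(required_cores(d) for d in deployments)


# Three low priority deployments, each updated from 1 to 16 allocations.
deployments = [Deployment(f"low-{i}", "low", allocations=16) for i in range(3)]
counted = planner_counted_cores(deployments)
actual = actual_required_cores(deployments)
```

Under these assumptions the planner would account for one core while the deployments request 48, which is the kind of mismatch described above.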
Steps to Reproduce
- Start a deployment in low priority mode
- Update the number of allocations to a high number
- Optionally, start more low priority deployments and increase their number of allocations. In the screenshot below, the total CPU cores required by all deployments far exceeds what is available on the system.
Logs (if relevant)
No response