Description
Describe the enhancement:
In some cases of high load, Elasticsearch will return 429 errors to indicate rate limiting.
Beats should back off when it detects an HTTP 429 response.
Looking at the code, it does not appear to do that: items rejected with 429 are counted as retryable errors and simply re-sent:
// L500
if itemStatus == http.StatusTooManyRequests {
	stats.fails++
	stats.tooMany++
	return true
}
// L553
func (stats bulkResultStats) reportToObserver(ob outputs.Observer) {
	ob.AckedEvents(stats.acked)
	ob.RetryableErrors(stats.fails) // 👈 retries no back off
	ob.PermanentErrors(stats.nonIndexable)
	ob.DuplicateEvents(stats.duplicates)
	ob.DeadLetterEvents(stats.deadLetter)
	ob.ErrTooMany(stats.tooMany)
}
Describe a specific use case for the enhancement or feature:
I have an Elasticsearch cluster, and in my network I have deployed 1500 elastic-agents.
They are sending lots of logs to Elasticsearch, and I am routinely getting HTTP 429 errors.
On one hand, I am trying to scale the resources on the Elasticsearch server side.
However, it would be good if beats and elastic-agent backed off when they detect HTTP 429 errors. Right now both seem to keep hammering Elasticsearch even while it is returning HTTP 429 errors.