
OTel output metrics sending invalid values for beat.stats.libbeat.output.events.active #12515

@belimawr

Description


When running Elastic Agent with OTel mode enabled for integrations, the output metrics are stuck at a huge fixed value that keeps flipping between positive and negative. It looks like an integer overflow or a wrong type conversion.

Here are the two values returned by beat.stats.libbeat.output.events.active:

  • 9223372036854775000
  • -9223372036854776000
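Both numbers sit right at the signed 64-bit integer limits once rounded through a 64-bit float (which keeps only ~16 significant digits), so this is consistent with an int64 counter wrapping around. A minimal sketch of that wrap-around, using bash's own 64-bit signed arithmetic (not the agent's actual code):

```shell
# Bash arithmetic is 64-bit signed, like a Go int64.
max=9223372036854775807                # math.MaxInt64
echo $(( max + 1 ))                    # wraps around to -9223372036854775808
echo $(( 0 - 1 ))                      # an "active" gauge decremented below zero goes negative
```

A float64 cannot represent every int64 exactly, so once such a wrapped or saturated value is serialized through JSON it shows up rounded, like the 9223372036854775000 / -9223372036854776000 above (the exact last digits depend on the formatter).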

I tested/reproduced it with two recent 9.4.0-SNAPSHOT builds:

  • Binary: 9.4.0-SNAPSHOT (build: 204ae1e828c514cec2cc5449b666b89bcad786bd at 2026-01-26 18:18:46 +0000 UTC)
  • Binary: 9.4.0-SNAPSHOT (build: c5b7279713ddff84d9e7ddb47f90ff3e1bfa5284 at 2026-01-28 17:03:05 +0000 UTC)

How to reproduce

1. Generate some logs to be ingested

You can use Docker and flog:

docker run --rm mingrammer/flog -d 0.1 -s 0.1 -l > /tmp/flog.log &

2. Create an Elastic Agent policy

Add the default system integration and a filestream integration; configure the filestream integration to read /tmp/flog.log.

Once the policy is created, go to the policy, then Settings -> Advanced settings -> Advanced internal YAML settings, and add the following:

agent:
  internal:
    runtime:
      filebeat:
        filestream: otel
      metricbeat:
        default: process
        system/metrics: otel
agent.monitoring._runtime_experimental: otel

Note: as the output for the integrations I used mock-es (mock-es -toomany 5 -toolarge 10 -nonindex 15 -dup 20 -delay 1.5s -metrics 1s); however, I don't think the output errors have anything to do with the queue metrics.

3. Deploy the Elastic Agent and let it run for a while

Go to the "[Elastic Agent] Agent metrics" dashboard, select your host (make sure the dashboard is showing data for a single Elastic Agent), then scroll down to the "[Elastic Agent] Output batch size" widget. You should see something like this:

(screenshot: "[Elastic Agent] Output batch size" widget)

Hover over the widget's top right corner, then select "Explore in Discover" (see the image above). There you can see the metrics directly.

To make it easier, add a filter: beat.stats.libbeat.output.events.active: *

Example

Here is a dump from the metrics index with the invalid metric:
Metrics example: otel-bug-metrics.tar.gz

You can extract it, then filter it using jq:

tar -xf otel-bug-metrics.tar.gz
cat otel-output-bug.ndjson | jq -S '._source.beat.stats.libbeat.output.events.active'
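To single out only the bad documents, a jq filter along these lines works; the 2^62 magnitude threshold is an arbitrary cut-off I chose to catch values near the int64 limits, not something from the original report:

```shell
# Print only "active" values whose magnitude is near the int64 range
# limits (threshold 2^62 = 4611686018427387904 is an arbitrary choice).
jq '._source.beat.stats.libbeat.output.events.active
    | select(. != null)
    | select((. > 4611686018427387904) or (. < -4611686018427387904))' \
  otel-output-bug.ndjson
```

Healthy batch sizes are tiny by comparison, so any document matched by this filter is one of the overflowed samples.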

Optionally you can upload it to Elasticsearch so you can also see the data in the dashboard (adjust the credentials as needed):

docker run --rm -ti \
  --network=host \
  -v $(pwd):/data \
  -e NODE_TLS_REJECT_UNAUTHORIZED=0 \
  elasticdump/elasticsearch-dump \
  --input=/data/otel-output-bug.ndjson \
  --output=https://localhost:9200 \
  --headers='{"Authorization":"Basic '$(echo -n elastic:changeme | base64)'"}' \
  --type=data \
  --bulkAction=create
