-
Notifications
You must be signed in to change notification settings - Fork 3.7k
feat: refactors kafka/client to improve metric collection #17308
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Updates the `pkg/kafka/*.go` code base to refine how we're initializing Kafka clients and configuring metrics. This set of changes removes functionality for calling services to inject `kprom.Metrics` given all services are relying on the base set. In doing so, this set of changes improves the metric names so they represent the following: * The metrics are collected by the Kafka client (`loki_kafka_client_`) * The metrics are collected by a reader/writer (label: "service_role: reader|writer") * The metrics are collected by a particular component (label: "component: ...") This set of changes also adds a few metrics to improve our coverage, and it alters one metric description to help better explain what we're collecting. In the future, we can easily extend this to allow services to inject customized `kprom.Metrics` structs if needed.
I just realized the tests in kafka/ use the standard Prometheus registry. Changing from NewPedanticRegistry() to NewRegistry() for consistency's sake
Fixed variable name and removed unneeded boolean
Minor change to a few metrics. Rather than naming them 'ingest_storage_reader', this changes them to 'partition_reader' to make it clear what they are acting on. This way we don't tie the metric name to ingestion or storage.
…oki into agoss/kafka-metric-adjustments
The distributor code was missing additional identification for the service and its role. This commit adds an additional function to the kafka writer to wrap a prometheus registerer with the appropriate labels. These labels are applied to the code the distributor uses to create a new producer client. Also added two constants for the `service_role` label.
To make things consistent, updating the partition reader metrics in kafka so they have the same prefix as other kafka client metrics. This helps us quickly identify where these metrics are coming from.
aarogoss
commented
Apr 18, 2025
enableKafkaHistogramMetrics(enableKafkaHistograms), | ||
) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I relocated this and enableKafkaHistogramMetrics
to the writer_client.go file to be consistent w/the reader_client.go's use of functions (e.g. commonKafkaClientOptions
) in the writer_client.go file.
Minor change: don't use pedantic registry -- just use NewRegistry() in client_test.go
…oki into agoss/kafka-metric-adjustments
Thinking about this over the weekend, I don't know that this label provides us any real additional insight into the metrics/services which are using the Kafka client. Removing it -- it can always be reintroduced if we need additional insight.
…oki into agoss/kafka-metric-adjustments
…oki into agoss/kafka-metric-adjustments
owen-d
approved these changes
Apr 21, 2025
This was referenced Jun 24, 2025
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
operational
Used internally by Grafana Labs to help organize, added to tasks we put in our on-call workqueue.
size/L
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
What this PR does / why we need it:
The
pkg/kafka
code base has some inconsistencies in how prometheus metrics are identified, described, and initialized. Reader and writer clients initializekprom.Metrics
structs in different ways, even though they ultimately build the same basic struct.This change refactors the
pkg/kafka
module to refine how we're initializing Kafka clients and configuring metrics. This set of changes removes functionality for calling services to injectkprom.Metrics
given all services are relying on the base set, and requires services which use the clients to specify a component string. This string is included as a label in the metrics.In doing so, this set of changes improves the metric names so they represent the following:
loki_kafka_client_
)This set of changes also adds a few metrics to improve our coverage, and alters one metric description to help better explain what we're collecting.
In the future, we can easily extend this to allow services to inject customized
kprom.Metrics
structs if desired.Special notes for your reviewer:
Checklist
CONTRIBUTING.md
guide (required)feat
PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.docs/sources/setup/upgrade/_index.md
deprecated-config.yaml
anddeleted-config.yaml
files respectively in thetools/deprecated-config-checker
directory. Example PR