Releases: grafana/tempo
v1.5.0-rc.0
Breaking Changes
- (#1478) In order to build advanced visualization features into Grafana we have decided to change our spanmetric names to match OTel conventions. This way any functionality added to Grafana will work whether you use Tempo, Grafana Agent or the OTel Collector to generate metrics. Details below.
- (#1556) Jsonnet users will need to specify ephemeral storage requests and limits for the metrics generator.
- (#1481) Anonymous usage reporting has been added. Distributors and metrics generators will now require permissions to object storage equivalent to compactors and ingesters. This feature is enabled by default but can be disabled easily.
- (#1558) Deprecated metrics tempodb_(gcs|s3|azure)_request_duration_seconds have been removed in favor of tempodb_backend_request_duration_seconds.
Changes
- [CHANGE] metrics-generator: Changed added metric label instance to __metrics_gen_instance to reduce collisions with custom dimensions. #1439 (@joe-elliott)
- [CHANGE] Don't enforce max_bytes_per_tag_values_query when set to 0. #1447 (@joe-elliott)
- [CHANGE] Add new querier service in deployment jsonnet to serve /status endpoint. #1474 (@annanay25)
- [CHANGE] Swapped out Google Cloud Functions serverless docs and build for Google Cloud Run. #1483 (@joe-elliott)
- [CHANGE] BREAKING CHANGE Change spanmetrics metric names and labels to match OTel conventions. #1478 (@mapno)
Old metric names:
traces_spanmetrics_duration_seconds_{sum,count,bucket}
New metric names:
traces_spanmetrics_latency_{sum,count,bucket}
Additionally, default label span_status is renamed to status_code.
- [CHANGE] Update to Go 1.18 #1504 (@annanay25)
- [CHANGE] Change tag/value lookups to return partial results when reaching response size limit instead of failing #1517 (@mdisibio)
- [CHANGE] Change search to be case-sensitive #1547 (@mdisibio)
- [CHANGE] Relax Hedged request defaults for external endpoints. #1566 (@joe-elliott)
querier:
  search:
    external_hedge_requests_at: 4s -> 8s
    external_hedge_requests_up_to: 3 -> 2
- [CHANGE] BREAKING CHANGE Include emptyDir for metrics generator wal storage in jsonnet #1556 (@zalegrala)
Jsonnet users will now need to specify a storage request and limit for the generator wal.
_config+:: {
  metrics_generator+: {
    ephemeral_storage_request_size: '10Gi',
    ephemeral_storage_limit_size: '11Gi',
  },
}
- [CHANGE] Two additional latency buckets added to the default settings for generated spanmetrics. Note that this will increase cardinality when using the defaults. #1593 (@fredr)
- [CHANGE] Mark log_received_traces as deprecated. New flag is log_received_spans.
Extend distributor spans logger with optional features to include span attributes and a filter by error status. #1465 (@faustodavid)
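The spanmetrics rename in #1478 affects any dashboard or alert query that still uses the old names. A minimal sketch of a rewrite helper follows, using only the names listed in that entry; the function name migrate_spanmetrics_query is hypothetical and not part of Tempo, and the regex approach is an assumption that may miss unusual query forms.

```python
import re


def migrate_spanmetrics_query(query: str) -> str:
    """Rewrite pre-1.5 spanmetrics names in a PromQL string (hypothetical helper)."""
    # Metric names: the _sum, _count and _bucket series are renamed.
    query = re.sub(
        r"\btraces_spanmetrics_duration_seconds_(sum|count|bucket)\b",
        r"traces_spanmetrics_latency_\1",
        query,
    )
    # Label name: word boundaries leave longer identifiers such as
    # my_span_status untouched.
    return re.sub(r"\bspan_status\b", "status_code", query)
```

Review rewritten queries by hand before saving them, since a custom dimension or label value that happens to be named span_status would be rewritten as well.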
Features
- [FEATURE] Add parquet block format #1479 #1531 #1564 (@annanay25, @mdisibio)
- [FEATURE] Add anonymous usage reporting, enabled by default. #1481 (@zalegrala)
BREAKING CHANGE As part of the usage stats inclusion, the distributor will also require access to the store. This is required so the distributor can know which cluster it should be reporting membership of.
- [FEATURE] Include messaging systems and databases in service graphs. #1576 (@kvrhdn)
Enhancements
- [ENHANCEMENT] Added the ability to have a per tenant max search duration. #1421 (@joe-elliott)
- [ENHANCEMENT] metrics-generator: expose max_active_series as a metric #1471 (@kvrhdn)
- [ENHANCEMENT] Azure Backend: Add support for authentication with Managed Identities. #1457 (@joe-elliott)
- [ENHANCEMENT] Add metric to track feature enablement #1459 (@zalegrala)
- [ENHANCEMENT] Added s3 config option insecure_skip_verify #1470 (@zalegrala)
- [ENHANCEMENT] Added polling option blocklist_poll_jitter_ms to reduce issues in Azure #1518 (@joe-elliott)
- [ENHANCEMENT] Add a config to query single ingester instance based on trace id hash for Trace By ID API. #1484 (@sagarwala, @bikashmishra100, @ashwinidulams)
- [ENHANCEMENT] Add blocklist metrics for total backend objects and total backend bytes #1519 (@ie-pham)
- [ENHANCEMENT] Adds tempo_querier_external_endpoint_hedged_roundtrips_total to count the total hedged requests #1558 (@joe-elliott)
BREAKING CHANGE Removed deprecated metrics tempodb_(gcs|s3|azure)_request_duration_seconds in favor of tempodb_backend_request_duration_seconds. These metrics have been deprecated since v1.1.
- [ENHANCEMENT] Add tags option for s3 backends. This allows new objects to be written with the configured tags. #1442 (@stevenbrookes)
- [ENHANCEMENT] metrics-generator: support per-tenant processor configuration #1434 (@kvrhdn)
- [ENHANCEMENT] Include rollout dashboard #1456 (@zalegrala)
- [ENHANCEMENT] Add SentinelPassword configuration for Redis #1463 (@zalegrala)
Bugfixes
- [BUGFIX] Fix nil pointer panic when the trace by id path errors. #1441 (@joe-elliott)
- [BUGFIX] Update tempo microservices Helm values example which missed the 'enabled' key for thriftHttp. #1472 (@hajowieland)
- [BUGFIX] Fix race condition in forwarder overrides loop. #1468 (@mapno)
- [BUGFIX] Fix v2 backend check on span name to be substring #1538 (@mdisibio)
- [BUGFIX] Fix wal check on span name to be substring #1548 (@mdisibio)
- [BUGFIX] Prevent ingester panic "cannot grow buffer" #1258 (@mdisibio)
- [BUGFIX] metrics-generator: do not remove x-scope-orgid header in single tenant mode #1554 (@kvrhdn)
- [BUGFIX] Fixed issue where backend does not support root.name and root.service.name #1589 (@kvrhdn)
- [BUGFIX] Fixed ingester to continue starting up after block replay error #1603 (@mdisibio)
v1.4.1
Bugfixes
- [BUGFIX] metrics-generator: don't inject X-Scope-OrgID header for single-tenant setups #1417 (@kvrhdn)
- [BUGFIX] compactor: populate compaction_objects_combined_total and tempo_discarded_spans_total{reason="trace_too_large_to_compact"} metrics again #1420 (@mdisibio)
- [BUGFIX] distributor: prevent panics when concurrently calling shutdown to forwarder's queueManager #1422 (@mapno)
v1.4.0
Breaking changes
- After this rollout the distributors will use a new API endpoint on the ingesters to push spans. Please roll out all ingesters before rolling the distributors to prevent downtime. Also, during this period, the ingesters will use considerably more resources and should be scaled up (or incoming traffic should be heavily throttled). Once all distributors and ingesters have rolled, performance will return to normal. Internally we have observed ~1.5x CPU load on the ingesters during the rollout. #1227 (@joe-elliott)
- Querier options related to search have moved under a search block: #1350 (@joe-elliott)
querier:
  search_query_timeout: 30s
  search_external_endpoints: []
  search_prefer_self: 2
becomes
querier:
  search:
    query_timeout: 30s
    prefer_self: 2
    external_endpoints: []
- Dropped tempo-search-retention-duration parameter on the vulture. #1297 (@joe-elliott)
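For configurations that are generated or templated, the querier search move from #1350 can be applied mechanically. The sketch below operates on a config already parsed into a Python dict; the helper name migrate_querier_search is hypothetical, and only the three keys named in the entry above are handled.

```python
import copy

# Old flat key -> new key under querier.search, per the #1350 entry.
SEARCH_KEY_MOVES = {
    "search_query_timeout": "query_timeout",
    "search_external_endpoints": "external_endpoints",
    "search_prefer_self": "prefer_self",
}


def migrate_querier_search(config: dict) -> dict:
    """Return a copy of config with querier search options nested under 'search'."""
    migrated = copy.deepcopy(config)  # leave the caller's dict untouched
    querier = migrated.get("querier")
    if not isinstance(querier, dict):
        return migrated  # nothing to migrate
    for old_key, new_key in SEARCH_KEY_MOVES.items():
        if old_key in querier:
            old_value = querier.pop(old_key)
            # A value already present in the new block wins over the old key.
            querier.setdefault("search", {}).setdefault(new_key, old_value)
    return migrated
```

Loading and dumping the YAML itself is left out so the sketch stays dependency-free; any YAML library that round-trips to plain dicts should work with it.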
New Features and Enhancements
- [FEATURE] Added metrics-generator: an optional component to generate metrics from ingested traces #1282 (@mapno, @kvrhdn)
- [ENHANCEMENT] v2 object encoding added. This encoding adds a start/end timestamp to every record to reduce proto marshalling and increase search speed. #1227 (@joe-elliott)
- [ENHANCEMENT] Allow the compaction cycle to be configurable with a default of 30 seconds #1335 (@willdot)
- [ENHANCEMENT] Add new config options for setting GCS metadata on new objects #1368 (@zalegrala)
- [ENHANCEMENT] Add new scaling alerts to the tempo-mixin #1292 (@mapno)
- [ENHANCEMENT] Improve serverless handler error messages #1305 (@joe-elliott)
- [ENHANCEMENT] Added a configuration option search_prefer_self to allow the queriers to do some work while also leveraging serverless in search. #1307 (@joe-elliott)
- [ENHANCEMENT] Make trace combination/compaction more efficient #1291 (@mdisibio)
- [ENHANCEMENT] Add Content-Type headers to query-frontend paths #1306 (@wperron)
- [ENHANCEMENT] Partially persist traces that exceed max_bytes_per_trace during compaction #1317 (@joe-elliott)
- [ENHANCEMENT] Make search respect per tenant max_bytes_per_trace and added skippedTraces to returned search metrics. #1318 (@joe-elliott)
- [ENHANCEMENT] Added tenant ID (instance ID) to trace too large message. #1385 (@cristiangsp)
- [ENHANCEMENT] Add a startTime and endTime parameter to the Trace by ID Tempo Query API to improve query performance #1388 (@sagarwala, @bikashmishra100, @ashwinidulams)
- [ENHANCEMENT] Add hedging to queries to external endpoints. #1350 (@joe-elliott)
New config options and defaults:
querier:
  search:
    external_hedge_requests_at: 5s
    external_hedge_requests_up_to: 3
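To illustrate what the two hedging options control, here is a generic sketch of request hedging using Python threads. It is not Tempo's implementation (Tempo is written in Go); hedged_call is a hypothetical name, and it assumes that hedge_at is the wait before launching another attempt and that up_to is the total number of attempts including the first.

```python
import concurrent.futures as cf


def hedged_call(fn, hedge_at: float, up_to: int):
    """Run fn(), launching another attempt whenever hedge_at seconds pass
    without a result, up to up_to attempts in total (assumed semantics).
    Returns the result of whichever attempt finishes first."""
    pool = cf.ThreadPoolExecutor(max_workers=up_to)
    try:
        pending = {pool.submit(fn)}
        started = 1
        while True:
            done, pending = cf.wait(
                pending, timeout=hedge_at, return_when=cf.FIRST_COMPLETED
            )
            if done:
                # First finished attempt wins; its exception, if any, propagates.
                return next(iter(done)).result()
            if started < up_to:
                pending.add(pool.submit(fn))
                started += 1
    finally:
        # Do not block on slower attempts; they finish in the background.
        pool.shutdown(wait=False)
```

A larger hedge_at and a smaller up_to mean fewer duplicate requests, which is the trade-off being tuned whenever these defaults change.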
Bug Fixes
- [BUGFIX] Correct issue where Azure "Blob Not Found" errors were sometimes not handled correctly #1390 (@joe-elliott)
- [BUGFIX] Enable compaction and retention in Tanka single-binary #1352 (@irizzant)
- [BUGFIX] Fixed issue when query-frontend doesn't log request details when request is cancelled #1136 (@adityapwr)
- [BUGFIX] Update OTLP port in examples (docker-compose & kubernetes) from legacy ports (55680/55681) to new ports (4317/4318) #1294 (@mapno)
- [BUGFIX] Fixes min/max time on blocks to be based on span times instead of ingestion time. #1314 (@joe-elliott)
  - Includes new configuration option to restrict the amount of slack around now to update the block start/end time. #1332 (@joe-elliott)
    storage:
      trace:
        wal:
          ingestion_time_range_slack: 2m0s
  - Includes a new metric to determine how often this range is exceeded: tempo_warnings_total{reason="outside_ingestion_time_slack"}
- [BUGFIX] Prevent data race / ingester crash during searching by trace id by using xxhash instance as a local variable. #1387 (@bikashmishra100, @sagarwala, @ashwinidulams)
- [BUGFIX] Fix spurious "failed to mark block compacted during retention" errors #1372 (@mdisibio)
- [BUGFIX] Fix error message "Writer is closed" by resetting compression writer correctly on the error path. #1379 (@annanay25)
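The ingestion_time_range_slack option from #1332 can be pictured as a window around the current time. The sketch below is one plausible reading, not Tempo's code: it assumes that a span start or end time falling outside the window is replaced by the current time and counted as a warning. The function name adjust_for_slack and its return shape are hypothetical.

```python
def adjust_for_slack(start: int, end: int, now: int, slack_seconds: int):
    """Return (start, end, warned) for a block's time range.

    All times are Unix seconds. Assumption: a value outside
    [now - slack_seconds, now + slack_seconds] falls back to now.
    """
    warned = False
    if start < now - slack_seconds:
        start, warned = now, True  # start is too far in the past
    if end > now + slack_seconds:
        end, warned = now, True    # end is too far in the future
    return start, end, warned
```

Under this reading, warned corresponds to an increment of tempo_warnings_total{reason="outside_ingestion_time_slack"}.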
Other Changes
- [CHANGE] Vulture now exercises search at any point during the block retention to test full backend search. #1297 (@joe-elliott)
- [CHANGE] Updated storage.trace.pool.queue_depth default from 200->10000. #1345 (@joe-elliott)
- [CHANGE] Updated flags -storage.trace.azure.storage-account-name and -storage.trace.s3.access_key to no longer be considered as secrets #1356 (@simonswine)
v1.4.0-rc.0
Breaking changes
- After this rollout the distributors will use a new API endpoint on the ingesters to push spans. Please roll out all ingesters before rolling the distributors to prevent downtime. Also, during this period, the ingesters will use considerably more resources and should be scaled up (or incoming traffic should be heavily throttled). Once all distributors and ingesters have rolled, performance will return to normal. Internally we have observed ~1.5x CPU load on the ingesters during the rollout. #1227 (@joe-elliott)
- Querier options related to search have moved under a search block: #1350 (@joe-elliott)
querier:
  search_query_timeout: 30s
  search_external_endpoints: []
  search_prefer_self: 2
becomes
querier:
  search:
    query_timeout: 30s
    prefer_self: 2
    external_endpoints: []
- Dropped tempo-search-retention-duration parameter on the vulture. #1297 (@joe-elliott)
New Features and Enhancements
- [FEATURE] Added metrics-generator: an optional component to generate metrics from ingested traces #1282 (@mapno, @kvrhdn)
- [ENHANCEMENT] v2 object encoding added. This encoding adds a start/end timestamp to every record to reduce proto marshalling and increase search speed. #1227 (@joe-elliott)
- [ENHANCEMENT] Allow the compaction cycle to be configurable with a default of 30 seconds #1335 (@willdot)
- [ENHANCEMENT] Add new config options for setting GCS metadata on new objects #1368 (@zalegrala)
- [ENHANCEMENT] Add new scaling alerts to the tempo-mixin #1292 (@mapno)
- [ENHANCEMENT] Improve serverless handler error messages #1305 (@joe-elliott)
- [ENHANCEMENT] Added a configuration option search_prefer_self to allow the queriers to do some work while also leveraging serverless in search. #1307 (@joe-elliott)
- [ENHANCEMENT] Make trace combination/compaction more efficient #1291 (@mdisibio)
- [ENHANCEMENT] Add Content-Type headers to query-frontend paths #1306 (@wperron)
- [ENHANCEMENT] Partially persist traces that exceed max_bytes_per_trace during compaction #1317 (@joe-elliott)
- [ENHANCEMENT] Make search respect per tenant max_bytes_per_trace and added skippedTraces to returned search metrics. #1318 (@joe-elliott)
- [ENHANCEMENT] Added tenant ID (instance ID) to trace too large message. #1385 (@cristiangsp)
- [ENHANCEMENT] Add a startTime and endTime parameter to the Trace by ID Tempo Query API to improve query performance #1388 (@sagarwala, @bikashmishra100, @ashwinidulams)
- [ENHANCEMENT] Add hedging to queries to external endpoints. #1350 (@joe-elliott)
New config options and defaults:
querier:
  search:
    external_hedge_requests_at: 5s
    external_hedge_requests_up_to: 3
Bug Fixes
- [BUGFIX] Correct issue where Azure "Blob Not Found" errors were sometimes not handled correctly #1390 (@joe-elliott)
- [BUGFIX] Enable compaction and retention in Tanka single-binary #1352 (@irizzant)
- [BUGFIX] Fixed issue when query-frontend doesn't log request details when request is cancelled #1136 (@adityapwr)
- [BUGFIX] Update OTLP port in examples (docker-compose & kubernetes) from legacy ports (55680/55681) to new ports (4317/4318) #1294 (@mapno)
- [BUGFIX] Fixes min/max time on blocks to be based on span times instead of ingestion time. #1314 (@joe-elliott)
  - Includes new configuration option to restrict the amount of slack around now to update the block start/end time. #1332 (@joe-elliott)
    storage:
      trace:
        wal:
          ingestion_time_range_slack: 2m0s
  - Includes a new metric to determine how often this range is exceeded: tempo_warnings_total{reason="outside_ingestion_time_slack"}
- [BUGFIX] Prevent data race / ingester crash during searching by trace id by using xxhash instance as a local variable. #1387 (@bikashmishra100, @sagarwala, @ashwinidulams)
- [BUGFIX] Fix spurious "failed to mark block compacted during retention" errors #1372 (@mdisibio)
- [BUGFIX] Fix error message "Writer is closed" by resetting compression writer correctly on the error path. #1379 (@annanay25)
Other Changes
- [CHANGE] Vulture now exercises search at any point during the block retention to test full backend search. #1297 (@joe-elliott)
- [CHANGE] Updated storage.trace.pool.queue_depth default from 200->10000. #1345 (@joe-elliott)
- [CHANGE] Updated flags -storage.trace.azure.storage-account-name and -storage.trace.s3.access_key to no longer be considered as secrets #1356 (@simonswine)
v1.3.2
Bug Fixes
- [BUGFIX] Fixed an issue where the query-frontend would mangle start/end time ranges on searches which included the ingesters #1295 (@joe-elliott)
v1.3.1
v1.3.0
Breaking changes
This release updates the OpenTelemetry libraries to v0.40.0, and with that, it updates OTLP gRPC's default listening port from the legacy 55680 to the new 4317. There are two main routes to avoid downtime: configuring the receiver to listen on the old port 55680 and/or pushing traces to both ports simultaneously until the rollout is complete.
As part of adding support for full backend search, a search config parameter has been renamed from query_frontend.search.max_result_limit to query_frontend.search.default_result_limit.
- [CHANGE] BREAKING CHANGE The OTEL GRPC receiver's default port changed from 55680 to 4317. #1142 (@tete17)
- [CHANGE] BREAKING CHANGE Moved querier.search_max_result_limit and querier.search_default_result_limit to query_frontend.search.max_result_limit and query_frontend.search.default_result_limit #1174.
- [CHANGE] BREAKING CHANGE Remove deprecated ingester gRPC endpoint and data encoding. The current data encoding was introduced in v1.0. If running earlier versions, first upgrade to v1.0 through v1.2 and allow time for all blocks to be switched to the "v1" data encoding. #1215 (@mdisibio)
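During the OTLP gRPC port change described above, it can help to confirm which of the two ports a given receiver is actually accepting connections on before repointing clients. The sketch below is a plain TCP reachability probe using only the Python standard library; it does not speak gRPC, and the function names are hypothetical.

```python
import socket

LEGACY_OTLP_GRPC_PORT = 55680  # default before v1.3.0
NEW_OTLP_GRPC_PORT = 4317      # default from v1.3.0 on


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def otlp_ports_status(host, ports=(LEGACY_OTLP_GRPC_PORT, NEW_OTLP_GRPC_PORT)):
    """Map each candidate OTLP gRPC port to whether it accepts connections."""
    return {port: port_is_open(host, port) for port in ports}
```

An open port only shows that something is listening there; it does not prove the listener is an OTLP receiver.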
New Features and Enhancements
- [FEATURE] Add support for inline environments. #1184 (@irizzant)
- [FEATURE] Added support for full backend search. #1174 (@joe-elliott)
- [ENHANCEMENT] Expose upto parameter on hedged requests for each backend with hedge_requests_up_to. #1085 (@joe-elliott)
- [ENHANCEMENT] Search: drop use of TagCache, extract tags and tag values on-demand #1068 (@kvrhdn)
- [ENHANCEMENT] Jsonnet: add $._config.namespace to filter by namespace in cortex metrics #1098 (@mapno)
- [ENHANCEMENT] Add middleware to compress frontend HTTP responses with gzip if requested #1080 (@kvrhdn, @zalegrala)
- [ENHANCEMENT] Allow query disablement in vulture #1117 (@zalegrala)
- [ENHANCEMENT] Improve memory efficiency of compaction and block cutting. #1121 #1130 (@joe-elliott)
- [ENHANCEMENT] Include metrics for configured limit overrides and defaults: tempo_limits_overrides, tempo_limits_defaults #1089 (@zalegrala)
- [ENHANCEMENT] Add Envoy Proxy panel to Tempo / Writes dashboard #1137 (@kvrhdn)
- [ENHANCEMENT] Reduce compactionCycle to improve performance in large multitenant environments #1145 (@joe-elliott)
- [ENHANCEMENT] Added max_time_per_tenant to allow for independently configuring polling and compaction cycle. #1145 (@joe-elliott)
- [ENHANCEMENT] Add tempodb_compaction_outstanding_blocks metric to measure compaction load #1143 (@mapno)
- [ENHANCEMENT] Update mixin to use new backend metric #1151 (@zalegrala)
- [ENHANCEMENT] Make TempoIngesterFlushesFailing alert more actionable #1157 (@dannykopping)
- [ENHANCEMENT] Switch open-telemetry/opentelemetry-collector to grafana/opentelemetry-collector fork, update it to 0.40.0 and add missing dependencies due to the change #1142 (@tete17)
- [ENHANCEMENT] Allow environment variables for Azure storage credentials #1147 (@zalegrala)
- [ENHANCEMENT] jsonnet: set rollingUpdate.maxSurge to 3 for distributor, frontend and queriers #1164 (@kvrhdn)
- [ENHANCEMENT] Reduce search data file sizes by optimizing contents #1165 (@mdisibio)
- [ENHANCEMENT] Add tempo_ingester_live_traces metric #1170 (@mdisibio)
- [ENHANCEMENT] Update compactor ring to automatically forget unhealthy entries #1178 (@mdisibio)
- [ENHANCEMENT] Added the ability to pass ISO8601 date/times for start/end date to tempo-cli query api search #1208 (@joe-elliott)
- [ENHANCEMENT] Prevent writes to large traces even after flushing to disk #1199 (@mdisibio)
Bug Fixes
- [BUGFIX] Add process name to vulture traces to work around display issues #1127 (@mdisibio)
- [BUGFIX] Fixed issue where compaction sometimes dropped spans. #1130 (@joe-elliott)
- [BUGFIX] Ensure that the admin client jsonnet has correct S3 bucket property. (@hedss)
- [BUGFIX] Publish tenant index age correctly for tenant index writers. #1146 (@joe-elliott)
- [BUGFIX] Ingester startup panic slice bounds out of range #1195 (@mdisibio)
Other Changes
- [CHANGE] Search: Add new per-tenant limit max_bytes_per_tag_values_query to limit the size of tag-values response. #1068 (@annanay25)
- [CHANGE] Reduce MaxSearchBytesPerTrace ingester.max-search-bytes-per-trace default to 5KB #1129 (@annanay25)
- [CHANGE] BREAKING CHANGE The OTEL GRPC receiver's default port changed from 55680 to 4317. #1142 (@tete17)
- [CHANGE] Remove deprecated method Push from tempopb.Pusher #1173 (@kvrhdn)
- [CHANGE] Upgrade cristalhq/hedgedhttp from v0.6.0 to v0.7.0 #1159 (@cristaloleg)
- [CHANGE] Export trace id constant in api package #1176
- [CHANGE] GRPC 1.33.3 => 1.38.0 broke compatibility with gogoproto.customtype. Enforce the use of gogoproto marshalling/unmarshalling for Tempo, Cortex & Jaeger structs. #1186 (@annanay25)
v1.3.0-rc.0
Breaking changes
This release updates the OpenTelemetry libraries to v0.40.0, and with that, it updates OTLP gRPC's default listening port from the legacy 55680 to the new 4317. There are two main routes to avoid downtime: configuring the receiver to listen on the old port 55680 and/or pushing traces to both ports simultaneously until the rollout is complete.
As part of adding support for full backend search, a search config parameter has been renamed from query_frontend.search.max_result_limit to query_frontend.search.default_result_limit.
- [CHANGE] BREAKING CHANGE The OTEL GRPC receiver's default port changed from 55680 to 4317. #1142 (@tete17)
- [CHANGE] BREAKING CHANGE Moved querier.search_max_result_limit and querier.search_default_result_limit to query_frontend.search.max_result_limit and query_frontend.search.default_result_limit #1174.
New Features and Enhancements
- [FEATURE] Add support for inline environments. #1184 (@irizzant)
- [FEATURE] Added support for full backend search. #1174 (@joe-elliott)
- [ENHANCEMENT] Expose upto parameter on hedged requests for each backend with hedge_requests_up_to. #1085 (@joe-elliott)
- [ENHANCEMENT] Search: drop use of TagCache, extract tags and tag values on-demand #1068 (@kvrhdn)
- [ENHANCEMENT] Jsonnet: add $._config.namespace to filter by namespace in cortex metrics #1098 (@mapno)
- [ENHANCEMENT] Add middleware to compress frontend HTTP responses with gzip if requested #1080 (@kvrhdn, @zalegrala)
- [ENHANCEMENT] Allow query disablement in vulture #1117 (@zalegrala)
- [ENHANCEMENT] Improve memory efficiency of compaction and block cutting. #1121 #1130 (@joe-elliott)
- [ENHANCEMENT] Include metrics for configured limit overrides and defaults: tempo_limits_overrides, tempo_limits_defaults #1089 (@zalegrala)
- [ENHANCEMENT] Add Envoy Proxy panel to Tempo / Writes dashboard #1137 (@kvrhdn)
- [ENHANCEMENT] Reduce compactionCycle to improve performance in large multitenant environments #1145 (@joe-elliott)
- [ENHANCEMENT] Added max_time_per_tenant to allow for independently configuring polling and compaction cycle. #1145 (@joe-elliott)
- [ENHANCEMENT] Add tempodb_compaction_outstanding_blocks metric to measure compaction load #1143 (@mapno)
- [ENHANCEMENT] Update mixin to use new backend metric #1151 (@zalegrala)
- [ENHANCEMENT] Make TempoIngesterFlushesFailing alert more actionable #1157 (@dannykopping)
- [ENHANCEMENT] Switch open-telemetry/opentelemetry-collector to grafana/opentelemetry-collector fork, update it to 0.40.0 and add missing dependencies due to the change #1142 (@tete17)
- [ENHANCEMENT] Allow environment variables for Azure storage credentials #1147 (@zalegrala)
- [ENHANCEMENT] jsonnet: set rollingUpdate.maxSurge to 3 for distributor, frontend and queriers #1164 (@kvrhdn)
- [ENHANCEMENT] Reduce search data file sizes by optimizing contents #1165 (@mdisibio)
- [ENHANCEMENT] Add tempo_ingester_live_traces metric #1170 (@mdisibio)
- [ENHANCEMENT] Update compactor ring to automatically forget unhealthy entries #1178 (@mdisibio)
- [ENHANCEMENT] Added the ability to pass ISO8601 date/times for start/end date to tempo-cli query api search #1208 (@joe-elliott)
- [ENHANCEMENT] Prevent writes to large traces even after flushing to disk #1199 (@mdisibio)
Bug Fixes
- [BUGFIX] Add process name to vulture traces to work around display issues #1127 (@mdisibio)
- [BUGFIX] Fixed issue where compaction sometimes dropped spans. #1130 (@joe-elliott)
- [BUGFIX] Ensure that the admin client jsonnet has correct S3 bucket property. (@hedss)
- [BUGFIX] Publish tenant index age correctly for tenant index writers. #1146 (@joe-elliott)
- [BUGFIX] Ingester startup panic slice bounds out of range #1195 (@mdisibio)
Other Changes
- [CHANGE] Search: Add new per-tenant limit max_bytes_per_tag_values_query to limit the size of tag-values response. #1068 (@annanay25)
- [CHANGE] Reduce MaxSearchBytesPerTrace ingester.max-search-bytes-per-trace default to 5KB #1129 (@annanay25)
- [CHANGE] BREAKING CHANGE The OTEL GRPC receiver's default port changed from 55680 to 4317. #1142 (@tete17)
- [CHANGE] Remove deprecated method Push from tempopb.Pusher #1173 (@kvrhdn)
- [CHANGE] Upgrade cristalhq/hedgedhttp from v0.6.0 to v0.7.0 #1159 (@cristaloleg)
- [CHANGE] Export trace id constant in api package #1176
- [CHANGE] GRPC 1.33.3 => 1.38.0 broke compatibility with gogoproto.customtype. Enforce the use of gogoproto marshalling/unmarshalling for Tempo, Cortex & Jaeger structs. #1186 (@annanay25)
- [CHANGE] BREAKING CHANGE Remove deprecated ingester gRPC endpoint and data encoding. The current data encoding was introduced in v1.0. If running earlier versions, first upgrade to v1.0 through v1.2 and allow time for all blocks to be switched to the "v1" data encoding. #1215 (@mdisibio)
v1.2.1
This patch contains two important bug fixes and is recommended for all users running v1.2.0.
Bug Fixes
- [BUGFIX] Fix defaults for MaxBytesPerTrace (ingester.max-bytes-per-trace) and MaxSearchBytesPerTrace (ingester.max-search-bytes-per-trace) #1109 (@BitProcessor)
- [BUGFIX] Ignore empty objects during compaction #1113 (@mdisibio)
v1.2.0
Breaking Changes
This release contains a number of small breaking changes. They will likely have no impact on your deployment, but it should be noted that due to a change in the API between the query-frontend and querier there may be a temporary read outage during deployment.
- [CHANGE] BREAKING CHANGE Drop support for v0 and v1 blocks. See 1.1 changelog for details #919 (@joe-elliott)
- [CHANGE] BREAKING CHANGE Consolidate status information onto /status endpoint #952 (@zalegrala)
The following endpoints moved.
/runtime_config moved to /status/runtime_config
/config moved to /status/config
/services moved to /status/services
- [CHANGE] BREAKING CHANGE Change ingester metric ingester_bytes_metric_total in favor of ingester_bytes_received_total #979 (@mapno)
- [CHANGE] Renamed CLI flag from --storage.trace.maintenance-cycle to --storage.trace.blocklist_poll. This is a BREAKING CHANGE #897 (@mritunjaysharma394)
- [CHANGE] BREAKING CHANGE Support partial results from failed block queries #1007 (@mapno)
Querier GET /querier/api/traces/<traceid> response's body has been modified to return tempopb.TraceByIDResponse instead of simply tempopb.Trace. This will cause a disruption of the read path during rollout of the change.
- [CHANGE] BREAKING CHANGE Change the metrics name from cortex_runtime_config_last_reload_successful to tempo_runtime_config_last_reload_successful #945 (@kavirajk)
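Scripts or health checks that call the moved status endpoints from #952 need their paths updated. A minimal sketch of a path translation follows, covering only the three endpoints listed above; the helper name migrate_status_path is hypothetical.

```python
# Old path -> new path, per the #952 entry.
MOVED_ENDPOINTS = {
    "/runtime_config": "/status/runtime_config",
    "/config": "/status/config",
    "/services": "/status/services",
}


def migrate_status_path(path: str) -> str:
    """Return the v1.2.0 location of a status path; other paths pass through."""
    # Split off any query string so it can be re-attached unchanged.
    base, sep, query = path.partition("?")
    return MOVED_ENDPOINTS.get(base, base) + sep + query
```

Paths that are already under /status, or that are not in the table, are returned as given.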
New Features and Enhancements
- [FEATURE] Add ability to search ingesters for traces #806 (@mdisibio @kvrhdn @annanay25)
- [FEATURE] Add runtime config handler #936 (@mapno)
- [FEATURE] Add ScalableSingleBinary operational run mode #1004 (@zalegrala)
- [ENHANCEMENT] Added "query blocks" cli option. #876 (@joe-elliott)
- [ENHANCEMENT] Added "search blocks" cli option. #972 (@joe-elliott)
- [ENHANCEMENT] Added traceid to trace too large message. #888 (@mritunjaysharma394)
- [ENHANCEMENT] Add support to tempo workloads to overrides from single configmap in microservice mode. #896 (@kavirajk)
- [ENHANCEMENT] Updated config defaults to better capture operational knowledge. #913 (@joe-elliott)
ingester:
  trace_idle_period: 30s => 10s  # reduce ingester memory requirements with little impact on querying
  flush_check_period: 30s => 10s
query_frontend:
  query_shards: 2 => 20  # will massively improve performance on large installs
storage:
  trace:
    wal:
      encoding: none => snappy  # snappy has been tested thoroughly and ready for production use
    block:
      bloom_filter_false_positive: .05 => .01  # will increase total bloom filter size but improve query performance
      bloom_filter_shard_size_bytes: 256KiB => 100 KiB  # will improve query performance
compactor:
  compaction:
    chunk_size_bytes: 10 MiB => 5 MiB  # will reduce compactor memory needs
    compaction_window: 4h => 1h  # will allow more compactors to participate in compaction without substantially increasing blocks
- [ENHANCEMENT] Make s3 backend readError logic more robust #905 (@wei840222)
- [ENHANCEMENT] Add gen index and gen bloom commands to tempo-cli. #903 (@annanay25)
- [ENHANCEMENT] Implement trace comparison in Vulture #904 (@zalegrala)
- [ENHANCEMENT] Compression updates: Added s2, improved snappy performance #961 (@joe-elliott)
- [ENHANCEMENT] Add support for vulture sending long running traces #951 (@zalegrala)
- [ENHANCEMENT] Shard tenant index creation by tenant and add functionality to handle stale indexes. #1005 (@joe-elliott)
- [ENHANCEMENT] Support partial results from failed block queries #1007 (@mapno)
- [ENHANCEMENT] Add new metric tempo_distributor_push_duration_seconds #1027 (@zalegrala)
- [ENHANCEMENT] Add query parameter to show the default config values and the difference between the current values and the defaults. #1045 (@MichelHollands)
- [ENHANCEMENT] Adding metrics around ingester flush retries #1049 (@dannykopping)
- [ENHANCEMENT] Performance: More efficient distributor batching #1075 (@joe-elliott)
- [ENHANCEMENT] Include tempo-cli in the release #1086 (@zalegrala)
Bug Fixes
- [BUGFIX] Update port spec for GCS docker-compose example #869 (@zalegrala)
- [BUGFIX] Fix "magic number" errors and other block mishandling when an ingester forcefully shuts down #937 (@mdisibio)
- [BUGFIX] Fix compactor memory leak #806 (@mdisibio)
- [BUGFIX] Set span's tag span.kind to client in query-frontend #975 (@mapno)
- [BUGFIX] Fixes tempodb_backend_hedged_roundtrips_total to correctly count hedged roundtrips. #1079 (@joe-elliott)
- [BUGFIX] Update go-kit logger package to remove spurious debug logs #1094 (@bboreham)
Other Changes
- [CHANGE] Update jsonnet alerts and recording rules to use job_selectors and cluster_selectors for configurable unique identifier labels #935 (@kevinschoonover)
- [CHANGE] Add troubleshooting language to config for server.grpc_server_max_recv_msg_size and server.grpc_server_max_send_msg_size when handling large traces #1023 (@thejosephstevens)