Single query-frontend pod has large spike in latency after k8s node upgrade #18186

Open
@snuggie12

Describe the bug
During GKE node upgrades, we have more than once seen the P99 latency of a single query-frontend pod jump from near 0 to 90s. I'm unsure how the frontend handles HA, but all users start to see timeouts, which effectively breaks all queries.

This is not new AFAIK. We first noticed it in Oct 2024, and it happened again in Jan 2025. We try to keep our Loki up to date, so the problem has persisted across multiple versions.

The only other thing we noticed is that the affected pod starts logging AST mapping warnings for queries from our Loki canary. Here's an example log line that only appears on the one bad frontend:

ts=2025-06-19T01:07:44.869377037Z caller=spanlogger.go:111 middleware=QueryShard.astMapperware org_id=fake user=fake caller=log.go:168 level=warn msg="failed mapping AST" err="context canceled" query="count_over_time({stream=\"stdout\",pod=\"loki-main-canary-pclr4\"}[462s])"
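
For anyone triaging similar logs, here is a minimal Python sketch that parses logfmt lines like the one above and counts "failed mapping AST" warnings per pod, to confirm that only one frontend emits them. It assumes the logs have already been collected per pod (for example with kubectl logs); the pod names and the helper names `parse_logfmt` and `count_ast_failures` are hypothetical, not part of Loki.

```python
import re
from collections import Counter

# Matches key=value pairs; a value is either double-quoted (with backslash
# escapes) or a bare run of non-space characters.
_PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S*)')


def parse_logfmt(line):
    """Parse one logfmt line into a dict. Repeated keys keep the last value."""
    fields = {}
    for key, raw in _PAIR.findall(line):
        if raw.startswith('"'):
            # Drop the surrounding quotes and unescape \" and \\.
            raw = re.sub(r'\\(.)', r'\1', raw[1:-1])
        fields[key] = raw
    return fields


def count_ast_failures(lines_by_pod):
    """Count 'failed mapping AST' warnings per pod (pod names are hypothetical)."""
    counts = Counter()
    for pod, lines in lines_by_pod.items():
        for line in lines:
            if parse_logfmt(line).get("msg") == "failed mapping AST":
                counts[pod] += 1
    return counts


# The example line from this issue, kept verbatim in a raw string.
SAMPLE = r'ts=2025-06-19T01:07:44.869377037Z caller=spanlogger.go:111 middleware=QueryShard.astMapperware org_id=fake user=fake caller=log.go:168 level=warn msg="failed mapping AST" err="context canceled" query="count_over_time({stream=\"stdout\",pod=\"loki-main-canary-pclr4\"}[462s])"'

# Hypothetical pod names; in practice these would come from the collected logs.
counts = count_ast_failures({
    "query-frontend-a": [SAMPLE],
    "query-frontend-b": [],
})
print(dict(counts))
```

The `err="context canceled"` field suggests the query context was cancelled before AST mapping finished, so the warning may be a symptom of the latency rather than its cause.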

To Reproduce
Steps to reproduce the behavior:

  1. Set queriers to pull mode, run the query-frontend in an HA setup, and use a separate query scheduler. Also run the Loki canary.
  2. Perform a Kubernetes upgrade, or possibly anything else that drains every node and replaces it. I would not expect the method to matter, but perhaps the behaviour differs enough from a kubectl rollout restart that the hash ring or something similar has an issue.
  3. Observe that all queries have issues and one query-frontend has increased latency.

Expected behavior
Loki should be able to handle a k8s node upgrade without queries timing out.

Environment:

  • Infrastructure: Kubernetes (GKE)
  • Deployment tool: Helm
  • Version: Currently 3.4.2, but this has occurred on other 3.x versions
