Fix race condition in active user #6773

Merged

SungJin1212
Member

I noticed a data race in active_user.go, reported by the race detector in CI (https://github.com/cortexproject/cortex/actions/runs/15336995596/job/43156054957?pr=6763).
This PR fixes it.
The race detector output is:

WARNING: DATA RACE
Read at 0x00c00411e420 by goroutine 996:
  github.com/cortexproject/cortex/pkg/util.(*ActiveUsers).ActiveUsers()
      /__w/cortex/cortex/pkg/util/active_user.go:98 +0x87
  github.com/cortexproject/cortex/pkg/util.(*ActiveUsersCleanupService).ActiveUsers()
      /__w/cortex/cortex/pkg/util/active_user.go:146 +0x139
  github.com/cortexproject/cortex/pkg/distributor.(*Distributor).updateLabelSetMetrics()
      /__w/cortex/cortex/pkg/distributor/distributor.go:814 +0x104
  github.com/cortexproject/cortex/pkg/distributor.(*Distributor).running()
      /__w/cortex/cortex/pkg/distributor/distributor.go:489 +0x4b1
  github.com/cortexproject/cortex/pkg/distributor.(*Distributor).running-fm()
      <autogenerated>:1 +0x47
  github.com/cortexproject/cortex/pkg/util/services.(*BasicService).main()
      /__w/cortex/cortex/pkg/util/services/basic_service.go:190 +0x3ab
  github.com/cortexproject/cortex/pkg/util/services.(*BasicService).StartAsync.func1.gowrap1()
      /__w/cortex/cortex/pkg/util/services/basic_service.go:119 +0x33

Previous write at 0x00c00411e420 by goroutine 1000:
  runtime.mapassign()
      /usr/local/go/src/internal/runtime/maps/runtime_swiss.go:191 +0x0
  github.com/cortexproject/cortex/pkg/util.(*ActiveUsers).PurgeInactiveUsers()
      /__w/cortex/cortex/pkg/util/active_user.go:80 +0x3f7
  github.com/cortexproject/cortex/pkg/util.(*ActiveUsersCleanupService).iteration()
      /__w/cortex/cortex/pkg/util/active_user.go:138 +0x144
  github.com/cortexproject/cortex/pkg/util.(*ActiveUsersCleanupService).iteration-fm()
      <autogenerated>:1 +0x47
  github.com/cortexproject/cortex/pkg/util.NewActiveUsersCleanupService.NewTimerService.func1()
      /__w/cortex/cortex/pkg/util/services/services.go:33 +0x1d8
  github.com/cortexproject/cortex/pkg/util/services.(*BasicService).main()
      /__w/cortex/cortex/pkg/util/services/basic_service.go:190 +0x3ab
  github.com/cortexproject/cortex/pkg/util/services.(*BasicService).StartAsync.func1.gowrap1()
      /__w/cortex/cortex/pkg/util/services/basic_service.go:119 +0x33

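The cause is that multiple goroutines can access m.timestamps concurrently without synchronization: in the trace above, ActiveUsers() reads the map while PurgeInactiveUsers() writes to it.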
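For readers unfamiliar with this class of bug, here is a minimal sketch of one common way to guard a shared map: every read and write goes through a sync.RWMutex. The type activeUsers and its methods update, purgeInactive, and active are hypothetical, simplified stand-ins, not the actual code in active_user.go, and this is not necessarily the exact change made in this PR.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// activeUsers is a hypothetical, simplified stand-in for the type in
// active_user.go. It is not the Cortex implementation.
type activeUsers struct {
	mu         sync.RWMutex
	timestamps map[string]int64 // user ID -> last-seen timestamp
}

func newActiveUsers() *activeUsers {
	return &activeUsers{timestamps: make(map[string]int64)}
}

// update records ts for userID if it is newer than the stored value.
func (m *activeUsers) update(userID string, ts int64) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if cur, ok := m.timestamps[userID]; !ok || ts > cur {
		m.timestamps[userID] = ts
	}
}

// purgeInactive deletes users last seen at or before deadline and
// returns their IDs in sorted order. It mutates the map, so it takes
// the write lock.
func (m *activeUsers) purgeInactive(deadline int64) []string {
	m.mu.Lock()
	defer m.mu.Unlock()
	var purged []string
	for id, ts := range m.timestamps {
		if ts <= deadline {
			delete(m.timestamps, id)
			purged = append(purged, id)
		}
	}
	sort.Strings(purged)
	return purged
}

// active returns a sorted snapshot of the current user IDs. It only
// reads the map, so the read lock is enough; without it, this read
// could race with the writes above.
func (m *activeUsers) active() []string {
	m.mu.RLock()
	defer m.mu.RUnlock()
	out := make([]string, 0, len(m.timestamps))
	for id := range m.timestamps {
		out = append(out, id)
	}
	sort.Strings(out)
	return out
}

func main() {
	m := newActiveUsers()
	var wg sync.WaitGroup
	// Several goroutines write, read, and purge concurrently.
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				m.update(fmt.Sprintf("user-%d", i), int64(j))
				_ = m.active()
				_ = m.purgeInactive(-1) // deadline below every timestamp: purges nothing
			}
		}(i)
	}
	wg.Wait()
	fmt.Println(m.active())
}
```

Running the sketch with `go run -race` should report no race; removing the RLock/RUnlock pair in active() should let the detector flag a read/write conflict similar to the one in the log above.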

Which issue(s) this PR fixes:
Fixes #

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Contributor

@justinjung04 left a comment

Nice! Thanks

Contributor

@yeya24 left a comment

Nice catch 👍

@yeya24 merged commit cf407e4 into cortexproject:master on May 30, 2025
15 of 17 checks passed
3 participants