Skip to content

perf(bloom): Compute chunkrefs for series right before sending task to builder #14808

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Nov 7, 2024
Prev Previous commit
ForSeries in common pkg
  • Loading branch information
salvacorts committed Nov 7, 2024
commit ec71ebe7781338031a5b8b98e42687ef75946cf9
4 changes: 3 additions & 1 deletion pkg/bloombuild/common/tsdb.go
Original file line number Diff line number Diff line change
Expand Up @@ -29,8 +29,10 @@ const (
gzipExtension = ".gz"
)

type ForSeries = sharding.ForSeries

type ClosableForSeries interface {
sharding.ForSeries
ForSeries
Close() error
}

Expand Down
4 changes: 2 additions & 2 deletions pkg/bloombuild/planner/strategies/task.go
Original file line number Diff line number Diff line change
Expand Up @@ -9,13 +9,13 @@ import (
"github.com/prometheus/common/model"
"github.com/prometheus/prometheus/model/labels"

"github.com/grafana/loki/v3/pkg/bloombuild/common"
"github.com/grafana/loki/v3/pkg/bloombuild/protos"
v1 "github.com/grafana/loki/v3/pkg/storage/bloom/v1"
"github.com/grafana/loki/v3/pkg/storage/config"
"github.com/grafana/loki/v3/pkg/storage/stores/shipper/bloomshipper"
"github.com/grafana/loki/v3/pkg/storage/stores/shipper/indexshipper/tsdb"
"github.com/grafana/loki/v3/pkg/storage/stores/shipper/indexshipper/tsdb/index"
"github.com/grafana/loki/v3/pkg/storage/stores/shipper/indexshipper/tsdb/sharding"
)

type Gap struct {
Expand Down Expand Up @@ -46,7 +46,7 @@ func NewTask(

// ToProtoTask converts a Task to a ProtoTask.
// It will use the opened TSDB to get the chunks for the series in the gaps.
func (t *Task) ToProtoTask(ctx context.Context, forSeries sharding.ForSeries) (*protos.ProtoTask, error) {
func (t *Task) ToProtoTask(ctx context.Context, forSeries common.ForSeries) (*protos.ProtoTask, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

silly question: doesn't this move the memory overhead from NewTSDBSeriesIter to here? why does this use less memory?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is that we defer populating the chunkrefs to right before we send the task to the builder. That way the tasks in the queue do not contain the chunkrefs.

Still after testing this, we moved the memory overhead to keeping all TSDBs open until the build process finishes.

// Populate the gaps with the series and chunks.
protoGaps := make([]protos.Gap, 0, len(t.Gaps))
for _, gap := range t.Gaps {
Expand Down
4 changes: 2 additions & 2 deletions pkg/bloombuild/planner/task.go
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ type QueueTask struct {
*strategies.Task

// We use forSeries in ToProtoTask to get the chunks for the series in the gaps.
forSeries common.ClosableForSeries
forSeries common.ForSeries

resultsChannel chan *protos.TaskResult

Expand All @@ -29,7 +29,7 @@ func NewQueueTask(
ctx context.Context,
queueTime time.Time,
task *strategies.Task,
forSeries common.ClosableForSeries,
forSeries common.ForSeries,
resultsChannel chan *protos.TaskResult,
) *QueueTask {
return &QueueTask{
Expand Down
Loading