perf: Reuse Values in dataobj.Reader to reduce allocs #16988


Merged
benclive merged 27 commits into main from reuse-values-in-dataobjReader on Apr 14, 2025

Conversation

@benclive (Contributor) commented Apr 1, 2025

What this PR does / why we need it:
Updates the dataobj readers to re-use memory where possible.

  • This uses the []Value slices passed in by the caller to store and reuse memory for batch operations (see the sketch after this list).
  • I had to do this at 3 layers: the Values themselves, copying Values into logs.Record during Decode, and then copying the internal logs.Record over to the dataobj.Record for output to the user.
    • From my testing, we can gain another 5-10% perf increase by eliminating that last copy, but it means we won't have a clean separation between internal and external types, so I left it in for now.
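
To make the reuse pattern concrete, here is a minimal sketch of the idiom applied at each of those layers (the helper name and shape are illustrative, not code from this PR): copy into the caller's existing backing array, allocating only when it is too small.

```go
// reuseBytes copies src into dst, reusing dst's backing array when it is
// large enough and letting append grow it otherwise. Each of the three copy
// layers applies this idea to its own buffers (the Values, the logs.Record
// fields, and the dataobj.Record fields).
func reuseBytes(dst, src []byte) []byte {
	dst = dst[:0]              // keep the allocation, discard old contents
	return append(dst, src...) // allocates only if cap(dst) < len(src)
}
```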

Benchmark Results:

goos: darwin
goarch: arm64
pkg: github.com/grafana/loki/v3/pkg/dataobj
cpu: Apple M3 Max
              │ before_logs_reader.txt │        after_logs_reader.txt        │
              │         sec/op         │   sec/op     vs base                │
LogsReader-14              2.648m ± 1%   2.325m ± 2%  -12.19% (p=0.000 n=10)

              │ before_logs_reader.txt │         after_logs_reader.txt         │
              │          B/op          │     B/op       vs base                │
LogsReader-14            1282.8Ki ± 5%   218.0Ki ± 22%  -83.01% (p=0.000 n=10)

              │ before_logs_reader.txt │       after_logs_reader.txt        │
              │       allocs/op        │ allocs/op   vs base                │
LogsReader-14             60472.5 ± 0%   455.0 ± 0%  -99.25% (p=0.000 n=10)

The majority of remaining allocs come from downloading the pages themselves. I'll look at those separately to see if we can improve them, but I wanted to get this PR reviewed first.

Which issue(s) this PR fixes:
Fixes https://github.com/grafana/loki-private/issues/1471

Timestamp time.Time
Metadata labels.Labels
Line []byte
MdValueCaps []int
@benclive (Contributor Author) commented Apr 1, 2025

I had to store capacities in a few places in this PR. Is there a better way to return buffers to their original capacity when reusing them?

Every time something turns into a string, we lose the original capacity of the underlying slice, so I added this variable to be able to resize them again later.
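
To illustrate the problem (a standalone example, not code from the PR): a []byte carries both a length and a capacity, but a string carries only a pointer and a length, so any spare capacity is unrecoverable after a round trip through string unless it is recorded separately, which is what MdValueCaps does.

```go
package main

import (
	"fmt"
	"unsafe"
)

func main() {
	buf := make([]byte, 4, 64) // len=4, cap=64

	// No-copy conversions (Go 1.20+); a plain string(buf) copy discards the
	// original backing array entirely, with the same end result.
	s := unsafe.String(unsafe.SliceData(buf), len(buf))
	back := unsafe.Slice(unsafe.StringData(s), len(s))

	fmt.Println(cap(buf), cap(back)) // prints "64 4": the spare 60 bytes are unreachable
}
```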

Member

Can you point me at where we lose the original capacity of the slice? It's not obvious to me from reviewing the PR.

@benclive (Contributor Author)

We lose the underlying capacity whenever we turn a []byte into a string.
This one is in Decode: pkg/dataobj/internal/sections/logs/iter.go:141
Then we have to do the same thing for the data in the logs_reader because we copy from an internal logs.Record to an external dataobj.Record.

@benclive benclive changed the title from Reuse values in dataobj reader to perf: Reuse Values in dataobj.Reader to reduce allocs on Apr 1, 2025
@pull-request-size pull-request-size bot added size/XL and removed size/L labels Apr 2, 2025
@benclive benclive force-pushed the reuse-values-in-dataobjReader branch from 5a11309 to f6ae7de on April 2, 2025 15:51
@benclive benclive marked this pull request as ready for review April 2, 2025 16:49
@benclive benclive requested a review from a team as a code owner April 2, 2025 16:49
@benclive benclive requested review from rfratto and ashwanthgoli April 2, 2025 16:49
MdValueCaps []int
}

func (r *Record) DeepCopy() Record {
@benclive (Contributor Author)

I'm not sure if I should have this or not. It's supposed to be a helper so a caller can copy elements out of the buffer they passed in and then re-use that buffer. Maybe a better solution is for the client to create new buffers whenever something is returned, and never pass the same buffer into a Read call?

Member

This feels reasonable for now (we can optimize or change it as necessary in the future). It seems natural to me that we provide the disclaimer: "The iterator assumes control of the passed slice; results are only valid until the next iteration. If the caller needs to store results for later use (across iterations), they must deep copy them".
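
For illustration, a minimal caller-side sketch of that contract (the Read signature and a DeepCopy helper are assumed from the discussion above; everything else is illustrative):

```go
func collect(ctx context.Context, reader *dataobj.LogsReader) ([]dataobj.Record, error) {
	var saved []dataobj.Record
	buf := make([]dataobj.Record, 64)
	for {
		n, err := reader.Read(ctx, buf)
		for i := range buf[:n] {
			// Records are only valid until the next Read; deep copy to keep.
			saved = append(saved, buf[i].DeepCopy())
		}
		if errors.Is(err, io.EOF) {
			return saved, nil
		}
		if err != nil {
			return nil, err
		}
	}
}
```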

// GrowToCap grows the slice to at least n elements total capacity.
// It is an alternative to slices.Grow that increases the capacity of the slice instead of allowing n new appends.
// This is useful when the slice is expected to have nil values.
func GrowToCap[Slice ~[]E, E any](s Slice, n int) Slice {
@benclive (Contributor Author) commented Apr 2, 2025

I found myself using this calculation in quite a few places, so I extracted it. Calling slices.Grow without subtracting the current len would double the slice on every call, since we tend to have len==cap and assign to elements directly throughout the readers.
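
For reference, a plausible body consistent with the doc comment above (the PR's actual implementation isn't shown in this excerpt), built on the standard library's slices.Grow:

```go
package slicegrow

import "slices"

// GrowToCap grows the slice to at least n elements total capacity, leaving
// the length unchanged. slices.Grow(s, m) guarantees cap(s) >= len(s)+m, so
// passing m = n-len(s) guarantees cap(s) >= n without doubling on every call.
func GrowToCap[Slice ~[]E, E any](s Slice, n int) Slice {
	if n > cap(s) {
		s = slices.Grow(s, n-len(s))
	}
	return s
}
```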

@owen-d (Member) left a comment

I like where this is heading, but I'm seeing panics while running logql/bench that need to be looked into.

@@ -69,7 +69,7 @@ func IterSection(ctx context.Context, dec encoding.StreamsDecoder, section *file
})
defer r.Close()

- var rows [1]dataset.Row
+ var rows [3]dataset.Row
Member

Why was this specifically changed? There may be nice benefits from batching during iteration, but IMO that can be an isolated change in another PR applied to all our iterator types.

r.buf = r.buf[:len(s)]

// Fill the row buffer with empty values so they can re-use the memory we pass in.
for i := range r.buf {
Member

nit: this could be extracted so it's only called when necessary, since *LogsReader.Read is repeatedly called with the same buffer. We can take advantage of this to minimize re-initialization of r.buf:

bufLen := len(r.buf)
sLen := len(s)
if bufLen < sLen {
  r.buf = slicegrow.GrowToCap(r.buf, sLen)[:sLen]
  r.populate(r.buf[bufLen:]) // initialize only the newly grown tail
}

pseudo-code, but hopefully it gets the idea across.

}

// Pre-allocate memory for metadata
for i := range s {
Member

nit (unrelated, future): It looks like we're initializing metadata columns even if they may be unused. (a) This can likely be minimized so we only pay initialization costs when necessary, this time on the s []Record argument instead of the internal buf; it's a bit more complicated because caching that would likely require a type wrapper over []Record. (b) It also looks like we're initializing capacity based on the total metadata columns (which can be very large), rather than only those requested by the query.

Neither optimization needs to be applied today, but it looks like there's some significant opportunity here.

n, err := r.reader.Read(ctx, r.buf)
if err != nil && !errors.Is(err, io.EOF) {
return 0, fmt.Errorf("reading rows: %w", err)
} else if n == 0 && errors.Is(err, io.EOF) {
return 0, io.EOF
}

// Pre-allocate memory for metadata
for i := range s {
Member

nit: same optimization opportunities as explained in logs reader

@rfratto (Member) left a comment

Thank you for working on this! I really appreciate the effort here, since allocations are our biggest pain point right now.

I have a few primary concerns at the moment before we merge:

  • Creating a dataset.StringValue and using it as part of a read can cause obscure memory bugs, even though doing so is technically allowed by the API.

  • It also looks like some of our internal optimizations are starting to leak to callers, such as clearing capacities in tests or LogsReaders/StreamReaders preallocating/copying memory.

I think there's still a little bit of work left to reduce how much complexity is being pushed downstream. If you're looking for someone to pair with, I'm happy to help.

// Which type this value was previously used for dictates how we reference the memory.
switch v.any.(type) {
case stringptr:
dst = unsafe.Slice(v.any.(stringptr), int(v.cap))
Member

I'm concerned with us doing this for stringptr:

If a caller passes in a string Value created via StringValue, its memory would get overwritten here and violate Go's memory model, which can cause bugs that are very difficult to track down, if the program doesn't segfault outright.

IMO, if it's a stringptr, we should consider eating the cost and allocating a new slice, just to save ourselves from dealing with obscure memory corruption bugs. (And then we should also avoid using string values wherever possible)
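
A standalone illustration of the hazard (not code from the PR): exposing a string's bytes as a writable slice breaks the immutability guarantee the compiler and runtime rely on.

```go
package main

import "unsafe"

func main() {
	s := "hello"                 // string contents are immutable by contract
	p := unsafe.StringData(s)    // *byte into the string's backing array
	b := unsafe.Slice(p, len(s)) // writable view over that memory
	b[0] = 'H'                   // undefined behavior: typically a segfault,
	                             // since the literal lives in read-only memory
}
```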

@benclive (Contributor Author)

I think we have two options here:

  1. Allocate when creating a StringValue so the referenced memory is always owned by the Value (see the sketch after this list)
  2. Remove strings completely and just use bytearray values. I think that would clear up a bunch of the handling with labels since converting []byte to strings is what loses the slice capacity information.
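
A minimal sketch of option 1 (hypothetical constructor names; the dataset package's real API may differ): copy on construction, so the Value always owns its bytes and buffer reuse can never clobber a caller's string.

```go
// StringValue copies s into memory the Value owns, so reuse inside the
// readers cannot touch the caller's string. ByteArrayValue is assumed to
// wrap a Value around a []byte it takes ownership of.
func StringValue(s string) Value {
	b := make([]byte, len(s))
	copy(b, s)
	return ByteArrayValue(b)
}
```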
Member

If we can remove strings completely, I'm definitely in favour. That would be a breaking change to the format (we wouldn't be able to read old data objects that use strings); I assume that's ok given how early on we are?

@benclive (Contributor Author)

I agree. I've added a commit to remove string types. It's a big change so this PR will be XXL, but it should be self-contained.
Without changing the dataobj.Record to stop using labels.Labels, it doesn't have a big impact on the capacities that need to be maintained. logs.Decode is simpler, though, because the internal logs.Record type can be changed to use []byte instead of string for values.
Let me know what you think. I'm happy to revert this commit and tackle it in a future PR if that would be clearer!

@@ -82,7 +82,9 @@ func Test(t *testing.T) {
for result := range logs.Iter(context.Background(), dec) {
record, err := result.Value()
require.NoError(t, err)
- actual = append(actual, record)
+ next := record.DeepCopy()
+ next.MdValueCaps = nil
Member

Why does the caller need to zero out MdValueCaps? 🤔

@benclive (Contributor Author)

This is just for testing, but require.Equal also compares the MdValueCaps values. I don't want the test to depend on them because they are an implementation detail, so I zero them out.
I'm happy to move this comparison to a helper method which ignores them. They would also disappear if we remove strings, so I'll trial that first.


@benclive benclive requested a review from rfratto April 8, 2025 12:36
@rfratto (Member) left a comment

This is just a pass over the most recent three commits; I'll give this an overall pass shortly.

@benclive benclive force-pushed the reuse-values-in-dataobjReader branch from 643abda to 6454b3f on April 9, 2025 17:05
@benclive (Contributor Author) commented Apr 9, 2025

Latest results on a no-op pipeline run:

goos: darwin
goarch: arm64
pkg: github.com/grafana/loki/v3/pkg/logql/bench
cpu: Apple M3 Max
                                                     │ before.txt  │              after.txt              │
                                                     │   sec/op    │   sec/op     vs base                │
LogQL/dataobj/{region="ap-southeast-1"}_[FORWARD]-14   25.40 ± 27%   19.25 ± 34%  -24.22% (p=0.019 n=10)

                                                     │  before.txt   │               after.txt               │
                                                     │     B/op      │     B/op       vs base                │
LogQL/dataobj/{region="ap-southeast-1"}_[FORWARD]-14   7.695Gi ± 11%   3.601Gi ± 33%  -53.21% (p=0.001 n=10)

                                                     │  before.txt  │              after.txt              │
                                                     │  allocs/op   │  allocs/op   vs base                │
LogQL/dataobj/{region="ap-southeast-1"}_[FORWARD]-14   138.04M ± 0%   56.35M ± 0%  -59.18% (p=0.000 n=10)

Latest results on LogsReader alone:

goos: darwin
goarch: arm64
pkg: github.com/grafana/loki/v3/pkg/dataobj
cpu: Apple M3 Max
              │ before_logs_reader.txt │       after_logs_reader.txt        │
              │         sec/op         │   sec/op     vs base               │
LogsReader-14              2.648m ± 1%   2.462m ± 1%  -7.03% (p=0.000 n=10)

              │ before_logs_reader.txt │        after_logs_reader.txt         │
              │          B/op          │     B/op      vs base                │
LogsReader-14            1282.8Ki ± 5%   223.3Ki ± 1%  -82.59% (p=0.000 n=10)

              │ before_logs_reader.txt │       after_logs_reader.txt        │
              │       allocs/op        │ allocs/op   vs base                │
LogsReader-14             60472.5 ± 0%   555.0 ± 0%  -99.08% (p=0.000 n=10)
@benclive benclive requested a review from rfratto April 9, 2025 17:18
@rfratto (Member) left a comment

🎉 Great work! I'm really happy with how this turned out.

I won't have any more rounds of feedback after this, but I'll approve as soon as I'm able to run the LogQL chunk-to-dataobj comparison tests over this.

Comment on lines +33 to +34
presenceReader *bufio.Reader
valuesReader *bufio.Reader
Member

feedback for a future PR (I'm also happy to contribute this): over the weekend I found that it's not really necessary for us to wrap the readers in a bufio.Reader here; we could update the underlying readers to implement the interfaces we want.

definitely not something I think this PR should do though
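
A rough sketch of that direction (hypothetical pageReader type, not from this PR): have the underlying reader implement io.Reader and io.ByteReader itself, making the bufio.Reader wrapper unnecessary.

```go
import "io"

// pageReader serves a decoded page from memory and implements io.Reader and
// io.ByteReader directly.
type pageReader struct {
	data []byte
	off  int
}

func (r *pageReader) Read(p []byte) (int, error) {
	if r.off >= len(r.data) {
		return 0, io.EOF
	}
	n := copy(p, r.data[r.off:])
	r.off += n
	return n, nil
}

func (r *pageReader) ReadByte() (byte, error) {
	if r.off >= len(r.data) {
		return 0, io.EOF
	}
	b := r.data[r.off]
	r.off++
	return b, nil
}
```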

Comment on lines 123 to 140
// Copies the label names from src to dst, while re-using the memory for values
func copyLabelNames(dst labels.Labels, src labels.Labels) labels.Labels {
for i, label := range src {
dst[i].Name = strings.Clone(label.Name)
dst[i].Value = label.Value
}
return dst
}

// Copies the label values into a new slice. We re-use the label names as they were previously copied by Process.
func copyLabelValues(in labels.Labels) labels.Labels {
lb := make(labels.Labels, len(in))
for i, label := range in {
lb[i] = labels.Label{Name: label.Name, Value: strings.Clone(label.Value)}
}
return lb
}

@rfratto (Member) commented Apr 10, 2025

looks like these are unused (unfortunately it looks like we don't use the linter which helps you remove unused unexported code)

(there might be other unused functions we introduced in earlier commits, but they didn't stand out to me)

Comment on lines +102 to +103
stream.LbValueCaps = slicegrow.GrowToCap(stream.LbValueCaps, labelColumns)
stream.LbValueCaps = stream.LbValueCaps[:labelColumns]
Member

It'd be nice to get rid of these for simplicity; is it possible for us to do something similar to what we did with RecordMetadata in logs.Decode?

@benclive (Contributor Author) commented Apr 11, 2025

We could, but I don't want to do it now.
We use the same streams.Stream object & labels.Labels functionality quite a bit in the getOrAddStream method (hash, equality, sorting, etc.) and I didn't want to replace it all, at least not in this PR.
In the logs section we don't use that functionality so it was easy to replace labels.Labels with a custom RecordMetadata which references a []byte field.

@rfratto (Member) left a comment

Storage equality tests look good! :shipit:

@benclive benclive merged commit e5784d7 into main Apr 14, 2025
61 checks passed
@benclive benclive deleted the reuse-values-in-dataobjReader branch April 14, 2025 12:56
rfratto added a commit that referenced this pull request May 21, 2025
…7762)

Originally, the dataobj package was a higher-level API around sections. This
design caused it to become a bottleneck:

* Implementing any new public behaviour for a section required bubbling it up
  to the dataobj API for it to be exposed, making it tedious to add new
  sections or update existing ones.

* The `dataobj.Builder` pattern was focused on constructing dataobjs for
  storing log data, which will cause friction as we build objects around other
  use cases. 

This PR builds on top of the foundation laid out by #17704 and #17708, fully
inverting the dependency between dataobj and sections:

* The `dataobj` package has no knowledge of what sections exist, and can now be
  used for writing and reading generic sections. Section packages now create
  higher-level APIs around the abstractions provided by `dataobj`.

* Section packages are now public, and callers interact directly with these
  packages for writing and reading section-specific data.

* All logic for a section (encoding, decoding, buffering, reading) is now fully
  self-contained inside the section package. Previously, the implementation of
  each section was spread across three packages
  (`pkg/dataobj/internal/encoding`, `pkg/dataobj/internal/sections/SECTION`,
  `pkg/dataobj`).

* Cutting a section is now a decision made by the caller rather than the
  section implementation. Previously, the logs section builder would create
  multiple sections. 

For the most part, this change is a no-op, with two exceptions:

1. Section cutting is now performed by the caller; however, this shouldn't
   result in any issues. 

2. Removing the high-level `dataobj.Stream` and `dataobj.Record` types will
   temporarily reduce the allocation gains from #16988. I will address this after
   this PR is merged.