perf: Reuse Values in dataobj.Reader to reduce allocs #16988
Conversation
```go
	Timestamp   time.Time
	Metadata    labels.Labels
	Line        []byte
	MdValueCaps []int
```
I had to store capacities in a few places in this PR. Is there a better way to return buffers to their original capacity when reusing them?
Every time something is converted to a string, we lose the original capacity of the underlying slice, so I added this field to be able to resize the buffers later.
Can you point me at where we lose the original capacity of the slice? It's not obvious to me from reviewing the PR.
We lose the underlying capacity whenever we turn a []byte into a string.
This one is in Decode: pkg/dataobj/internal/sections/logs/iter.go:141
Then we have to do the same thing for the data in the logs_reader because we copy from an internal logs.Record to an external dataobj.Record.
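To make the capacity loss concrete, here is a minimal standalone sketch (not the PR's code):

```go
// Demonstration only: round-tripping through string drops capacity information.
buf := make([]byte, 0, 1024) // reusable buffer with plenty of spare capacity
buf = append(buf, "value"...)

s := string(buf) // copies the bytes; a string has a length but no capacity
b := []byte(s)   // copies again; cap(b) is only about len(s), not 1024

_ = b // the original 1024-byte capacity is unrecoverable from s alone,
//      which is why fields like MdValueCaps record it separately
```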
Force-pushed from 5a11309 to f6ae7de
```go
	MdValueCaps []int
}

func (r *Record) DeepCopy() Record {
```
I'm not sure if I should have this or not. It's supposed to be a helper so a caller can copy elements out of the buffer they passed in and then re-use the buffer. Maybe a better solution is for the client to create new buffers when something is returned, and never pass the same buffer into a Read call?
This feels reasonable for now (we can optimize or change it as necessary in the future). It feels natural to me that we provide the disclaimer: "The iterator assumes control of the passed slice; results are only valid until the next iteration. If the caller needs to store results for later use (across iterations), they must deep copy them."
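For illustration, a minimal sketch of the deep-copy semantics being described, assuming the `Record` fields shown earlier (the PR's actual method may differ):

```go
// Sketch only: deep-copies every field that references shared memory.
func (r *Record) DeepCopy() Record {
	out := Record{
		Timestamp: r.Timestamp, // time.Time is a value; copying is safe
		Metadata:  make(labels.Labels, len(r.Metadata)),
		Line:      append([]byte(nil), r.Line...), // detach from the shared line buffer
	}
	for i, l := range r.Metadata {
		// strings.Clone detaches each label from any shared backing array.
		out.Metadata[i] = labels.Label{
			Name:  strings.Clone(l.Name),
			Value: strings.Clone(l.Value),
		}
	}
	return out
}
```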
```go
// GrowToCap grows the slice to at least n elements total capacity.
// It is an alternative to slices.Grow that increases the capacity of the slice instead of allowing n new appends.
// This is useful when the slice is expected to have nil values.
func GrowToCap[Slice ~[]E, E any](s Slice, n int) Slice {
```
I found myself using this calculation in quite a few places, so I extracted it. The regular slices.Grow call, without subtracting the current length, would double the slice on every iteration, since we tend to have len == cap and assign to values directly throughout the readers.
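Given that doc comment, a sketch of the helper, assuming it simply delegates to the standard library's `slices.Grow` (the real implementation may differ):

```go
// GrowToCap grows s so that cap(s) >= n. slices.Grow(s, m) guarantees room
// for m more appends beyond len(s), so we subtract len(s) first; passing n
// directly would double the slice on every call whenever len(s) == cap(s).
func GrowToCap[Slice ~[]E, E any](s Slice, n int) Slice {
	if n > len(s) {
		return slices.Grow(s, n-len(s))
	}
	return s
}
```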
I like where this is heading, but I'm seeing panics while running logql/bench that need to be looked into.
```diff
@@ -69,7 +69,7 @@ func IterSection(ctx context.Context, dec encoding.StreamsDecoder, section *file
 	})
 	defer r.Close()
 
-	var rows [1]dataset.Row
+	var rows [3]dataset.Row
```
Why was this specifically changed? There may be nice benefits from batching during iteration, but IMO that can be an isolated change in another PR applied to all our iterator types.
pkg/dataobj/logs_reader.go (outdated)
```go
	r.buf = r.buf[:len(s)]

	// Fill the row buffer with empty values so they can re-use the memory we pass in.
	for i := range r.buf {
```
nit: this could be extracted so it's only called when necessary, because `*LogsReader.Read` is repeatedly called with the same buffer. We can take advantage of this to minimize re-initialization of `r.buf`:

```go
bufLen := len(r.buf)
sLen := len(s)
if bufLen < sLen {
	r.buf = slicegrow.GrowToCap(r.buf, sLen)[:sLen]
	r.populate(r.buf[bufLen:]) // only initialize the newly exposed tail
}
```

Pseudo-code, but hopefully it gets the idea across.
pkg/dataobj/logs_reader.go (outdated)
```go
	}

	// Pre-allocate memory for metadata
	for i := range s {
```
nit (unrelated, future): It looks like we're initializing metadata columns even if they may be unused. This can (a) likely be minimized so we only pay initialization costs when necessary, this time on the `s []Record` argument instead of the internal `buf`. This is a bit more complicated, because caching that would likely require a type wrapper over `[]Record`. And (b) it looks like we're initializing capacity based on the total number of metadata columns (which can be very large), rather than those requested by the query.
Neither optimization needs to be applied today, but it looks like there's significant opportunity here.
pkg/dataobj/streams_reader.go (outdated)
```go
	n, err := r.reader.Read(ctx, r.buf)
	if err != nil && !errors.Is(err, io.EOF) {
		return 0, fmt.Errorf("reading rows: %w", err)
	} else if n == 0 && errors.Is(err, io.EOF) {
		return 0, io.EOF
	}

	// Pre-allocate memory for metadata
	for i := range s {
```
nit: same optimization opportunities as explained in logs reader
Thank you for working on this! I really appreciate the effort here, since allocations are our biggest pain point right now.
I have a few primary concerns at the moment before we merge:

1. Creating a `dataset.StringValue` and using it as part of a read can cause obscure memory bugs, even though doing so is technically allowed by the API.
2. It also looks like some of our internal optimizations are starting to leak to callers, such as clearing capacities in tests or LogsReaders/StreamsReaders preallocating/copying memory.
I think there's still a little bit of work left to reduce how much complexity is being pushed downstream. If you're looking for someone to pair with, I'm happy to help.
```go
	// Depending on which type this value was previously used for dictates how we reference the memory.
	switch v.any.(type) {
	case stringptr:
		dst = unsafe.Slice(v.any.(stringptr), int(v.cap))
```
I'm concerned with us doing this for `stringptr`: if a caller passes in a string Value created via StringValue, its memory would get overwritten here, violating Go's memory model. That can cause bugs that are very difficult to track down, if the program doesn't segfault outright.
IMO, if it's a stringptr, we should consider eating the cost and allocating a new slice, just to save ourselves from dealing with obscure memory-corruption bugs. (And then we should also avoid using string values wherever possible.)
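To illustrate the hazard in isolation (not the PR's code): recovering a writable slice from a string and mutating it breaks the immutability guarantee the compiler and runtime rely on.

```go
s := "static"                                   // string data may live in read-only memory
b := unsafe.Slice(unsafe.StringData(s), len(s)) // recover the backing bytes
b[0] = 'S' // undefined behaviour: Go strings are immutable; this can
//            segfault outright or silently corrupt every copy of s
```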
I think we have two options here:
- Allocate when creating a StringValue so the referenced memory is always owned by the Value
- Remove strings completely and just use bytearray values. I think that would clear up a bunch of the handling with labels since converting []byte to strings is what loses the slice capacity information.
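A sketch of the first option, using a hypothetical Value shape (illustrative names, not the actual dataset API):

```go
// Value is a stand-in for dataset.Value; illustrative only.
type Value struct {
	b []byte
}

// StringValue copies s at construction, so the Value exclusively owns
// its backing memory and can safely be overwritten on reuse.
func StringValue(s string) Value {
	return Value{b: []byte(s)} // []byte(s) allocates a fresh, writable copy
}
```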
If we can remove strings completely, I'm definitely in favour. That would be a breaking change to the format (we wouldn't be able to read old data objects that use strings); I assume that's ok given how early on we are?
I agree. I've added a commit to remove string types. It's a big change so this PR will be XXL, but it should be self-contained.
Without changing the dataobj.Record to stop using labels.Labels, it doesn't have a big impact on the capacities that need to be maintained. logs.Decode is simpler, though, because the internal logs.Record type can be changed to use `[]byte` instead of `string` for values.
Let me know what you think. I'm happy to revert this commit and tackle it in a future PR if that would be clearer!
```diff
@@ -82,7 +82,9 @@ func Test(t *testing.T) {
 	for result := range logs.Iter(context.Background(), dec) {
 		record, err := result.Value()
 		require.NoError(t, err)
-		actual = append(actual, record)
+		next := record.DeepCopy()
+		next.MdValueCaps = nil
```
Why does the caller need to zero out MdValueCaps? 🤔
This is just for testing, but `require.Equal` also compares the MdValueCaps values. I don't want the test to depend on them because they're an implementation detail, so I zero them out.
I'm happy to move this to a helper method which ignores them. They would also disappear if we remove strings, so I'll trial that first.
This is just a pass over the most recent three commits; I'll give this an overall pass shortly
Force-pushed from 643abda to 6454b3f
Latest results on a noop pipeline run
Latest results on LogReader alone:
🎉 Great work! I'm really happy with how this turned out.
I won't have any more rounds of feedback after this, but I'll approve as soon as I'm able to run the LogQL chunk-to-dataobj comparison tests over this.
```go
	presenceReader *bufio.Reader
	valuesReader   *bufio.Reader
```
Feedback for a future PR (I'm also happy to contribute this): over the weekend I found it's not really necessary for us to wrap the readers in a bufio.Reader here; we could update the underlying readers to implement the interfaces we want.
Definitely not something I think this PR should do, though.
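A hypothetical sketch of that idea (illustrative type, not the PR's code): if the underlying reader implements io.ByteReader itself, the bufio.Reader wrapper becomes unnecessary.

```go
// pageReader is a stand-in for the underlying reader; illustrative only.
type pageReader struct {
	data []byte
	off  int
}

func (r *pageReader) Read(p []byte) (int, error) {
	if r.off >= len(r.data) {
		return 0, io.EOF
	}
	n := copy(p, r.data[r.off:])
	r.off += n
	return n, nil
}

// ReadByte satisfies io.ByteReader directly, so callers no longer need
// to wrap pageReader in a bufio.Reader just for byte-at-a-time access.
func (r *pageReader) ReadByte() (byte, error) {
	if r.off >= len(r.data) {
		return 0, io.EOF
	}
	b := r.data[r.off]
	r.off++
	return b, nil
}
```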
pkg/dataobj/querier/iter.go (outdated)
```go
// Copies the label names from src to dst, while re-using the memory for values
func copyLabelNames(dst labels.Labels, src labels.Labels) labels.Labels {
	for i, label := range src {
		dst[i].Name = strings.Clone(label.Name)
		dst[i].Value = label.Value
	}
	return dst
}

// Copies the label values into a new slice. We re-use the label names as they were previously copied by Process.
func copyLabelValues(in labels.Labels) labels.Labels {
	lb := make(labels.Labels, len(in))
	for i, label := range in {
		lb[i] = labels.Label{Name: label.Name, Value: strings.Clone(label.Value)}
	}
	return lb
}
```
Looks like these are unused (unfortunately we don't use the linter that helps you remove unused unexported code).
(There might be other unused functions introduced in earlier commits, but they didn't stand out to me.)
```go
	stream.LbValueCaps = slicegrow.GrowToCap(stream.LbValueCaps, labelColumns)
	stream.LbValueCaps = stream.LbValueCaps[:labelColumns]
```
It'd be nice to get rid of these for simplicity; is it possible for us to do something similar to what we did with `RecordMetadata` in `logs.Decode`?
We could, but I don't want to do it now.
We use the streams.Stream object and the labels.Labels functionality quite a bit in the `getOrAddStream` method (hash, equality, sorting, etc.), and I didn't want to replace it all, at least not in this PR.
In the logs section we don't use that functionality, so it was easy to replace labels.Labels with a custom RecordMetadata which references a []byte field.
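For context, a purely illustrative sketch of the kind of shape being referred to (the PR's actual RecordMetadata may differ): values are stored as ranges into one reusable []byte buffer, so capacity lives on the buffer rather than on individual strings.

```go
// Illustrative only; not the PR's actual definition.
type RecordMetadata struct {
	names   []string // label names
	values  []byte   // one growable backing buffer for all label values
	offsets []int    // offsets[i]..offsets[i+1] delimits value i in values
}
```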
Storage equality tests look good!
…7762)

Originally, the dataobj package was a higher-level API around sections. This design caused it to become a bottleneck:

* Implementing any new public behaviour for a section required bubbling it up to the dataobj API for it to be exposed, making it tedious to add new sections or update existing ones.
* The `dataobj.Builder` pattern was focused on constructing dataobjs for storing log data, which will cause friction as we build objects around other use cases.

This PR builds on top of the foundation laid out by #17704 and #17708, fully inverting the dependency between dataobj and sections:

* The `dataobj` package has no knowledge of what sections exist, and can now be used for writing and reading generic sections. Section packages now create higher-level APIs around the abstractions provided by `dataobj`.
* Section packages are now public, and callers interact directly with these packages for writing and reading section-specific data.
* All logic for a section (encoding, decoding, buffering, reading) is now fully self-contained inside the section package. Previously, the implementation of each section was spread across three packages (`pkg/dataobj/internal/encoding`, `pkg/dataobj/internal/sections/SECTION`, `pkg/dataobj`).
* Cutting a section is now a decision made by the caller rather than the section implementation. Previously, the logs section builder would create multiple sections.

For the most part, this change is a no-op, with two exceptions:

1. Section cutting is now performed by the caller; however, this shouldn't result in any issues.
2. Removing the high-level `dataobj.Stream` and `dataobj.Record` types will temporarily reduce the allocation gains from #16988. I will address this after this PR is merged.
What this PR does / why we need it:
Updates the dataobj readers to re-use memory where possible.
Benchmark Results:
The majority of remaining allocs come from downloading the pages themselves. I'll look at them separately to see if we can improve that, but I wanted to get this PR reviewed first:
(benchmark results screenshot)
Which issue(s) this PR fixes:
Fixes https://github.com/grafana/loki-private/issues/1471