chore: Add new pointers section type for dataobj #18243


Merged
benclive merged 7 commits into main from benclive/add-new-pointers-section-to-dataobj on Jun 27, 2025

Conversation

benclive (Contributor)

What this PR does / why we need it:
Adds a new type of section ("Pointers") for dataobjs, storing indexing information at an object-section granularity.

  • I am open to changing the name from Pointers if anyone has any suggestions.
  • I opted for a wide-section approach instead of two separate section types: this section stores both stream indexing information (ID, start, end, size, etc.) and column indexing information (blooms for column values), and the two groups are mutually exclusive per row (see the sketch after this list).
  • The exact fields may change over time; I've tested indexing E2E and I'm happy with this as a first pass.
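
For concreteness, here is a minimal sketch of what a single row in the wide Pointers section might carry. The Go field names are hypothetical (the authoritative schema is the section's dataset metadata, not this struct); only UncompressedSize and Column appear verbatim in the diff below.

```go
package pointers

import "time"

// SectionPointer is an illustrative row in the wide Pointers section.
// Path and Section are always set; exactly one of the two metadata
// groups below is populated per row, never both.
type SectionPointer struct {
	// Mandatory: uniquely identifies the referenced section.
	Path    string
	Section int64

	// Stream indexing metadata (stream-pointer rows only).
	StreamID         int64
	StartTs          time.Time
	EndTs            time.Time
	UncompressedSize int64

	// Column indexing metadata (column-pointer rows only).
	Column     string
	BloomBytes []byte
}
```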

Which issue(s) this PR fixes:
Part of https://github.com/grafana/loki-private/issues/1725

Checklist

  • Reviewed the CONTRIBUTING.md guide (required)
  • Documentation added
  • Tests updated
  • Title matches the required conventional commits format, see here
    • Note that Promtail is considered to be feature complete, and future development for logs collection will be in Grafana Alloy. As such, feat PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
  • If the change is deprecating or removing a configuration option, update the deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR
UncompressedSize int64

// Column indexing metadata
Column string
benclive (Contributor, Author)

I chose to have ColumnName as a separate column, instead of having one column per referenced column & type. I might revisit this decision in the future if it hurts read performance by forcing us to scan through blooms for columns we don't care about.
e.g.

path | section |  column   | bloom bytes
-----|---------|-----------|------------
abcd |    1    | namespace | []byte{..}
abcd |    1    | cluster   | []byte{..}

instead of

path | section | namespace_bloom | cluster_bloom
-----|---------|-----------------|--------------
abcd |    1    |   []byte{..}    |
abcd |    1    |                 |  []byte{..}
PageSizeHint: b.pageSize,
Value: datasetmd.VALUE_TYPE_BYTE_ARRAY,
Encoding: datasetmd.ENCODING_TYPE_PLAIN,
Compression: datasetmd.COMPRESSION_TYPE_NONE, // TODO: is there a sensible compression algorithm for bloom filters?
benclive (Contributor, Author)

Any suggestions for this TODO?
I'm tempted to leave it as NONE for two reasons: bloom filter bytes are close to random, so they won't compress well, and skipping compression means we don't burn CPU decompressing the blooms at read time, so they should be very fast to scan.
At some point I would like to try roaring bitmaps here, though.
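
As a quick, self-contained illustration of that intuition (not from the PR): near-random bytes, like a well-filled bloom filter's bitset, don't shrink under a general-purpose compressor.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"crypto/rand"
	"fmt"
)

func main() {
	raw := make([]byte, 1<<16)
	rand.Read(raw) // stand-in for a dense bloom filter bitset

	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write(raw)
	zw.Close()

	// Typically prints a compressed size slightly LARGER than the input:
	// on incompressible data the compressor only adds framing overhead.
	fmt.Printf("raw=%d compressed=%d\n", len(raw), buf.Len())
}
```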

@benclive benclive marked this pull request as ready for review June 26, 2025 10:23
@benclive benclive requested a review from a team as a code owner June 26, 2025 10:23
//
// The stream indexing metadata is used to look up which streams are present in the referenced section, and their IDs within that section.
// The column indexing metadata is used to look up which column values are present in the referenced section.
// Path & Section are mandatory fields and uniquely identify the section within the referenced object.
Contributor

Did you verify that sections are returned in deterministic order?

benclive (Contributor, Author)

Not yet. @rfratto If I pass a section number back to the query engine, will it then be able to read exactly the same section later based on object path + section number?
The index number is extracted from the iteration order of the Sections() call, which looks deterministic to me.

Member

I've written it to be deterministic, though I don't think the API contract promises this. It could be valuable to add that guarantee to the comments in the dataobj package to solidify it.
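
A minimal sketch of the lookup being discussed, which is only safe if Sections() iterates in a stable order across reads; the type and method names here are illustrative stand-ins, not the exact dataobj API.

```go
package example

import "fmt"

// Section is a stand-in for dataobj's section handle.
type Section struct{ /* ... */ }

// Object is a stand-in exposing its sections in a fixed order.
type Object struct{ sections []*Section }

func (o *Object) Sections() []*Section { return o.sections }

// sectionByIndex re-resolves a section from (path, index) purely by
// position in the iteration order of Sections().
func sectionByIndex(obj *Object, idx int) (*Section, error) {
	secs := obj.Sections()
	if idx < 0 || idx >= len(secs) {
		return nil, fmt.Errorf("section %d out of range (have %d)", idx, len(secs))
	}
	return secs[idx], nil
}
```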

@@ -0,0 +1,169 @@
package pointers
Contributor

We have a lot of duplicate decoder code for each section. The implementations of Columns(), Pages(), and ReadPages() are identical except for the imports of ColumnDesc and PageDesc.

benclive (Contributor, Author)

I think Robert addressed this below. There's quite a bit of duplicate code between the different section types and I think we can clean it up soon-ish. I might take a stab at refactoring at some point.
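
For what that refactor might look like, here is a hypothetical sketch using Go generics to fold the per-section decoders into one implementation; the interface and names are assumptions, not the current dataobj code.

```go
package dataset

import "context"

// metadataCodec captures the only parts that differ between the
// per-section decoders: how the ColumnDesc- and PageDesc-like types
// are decoded from raw section metadata.
type metadataCodec[C, P any] interface {
	DecodeColumns(meta []byte) ([]C, error)
	DecodePages(meta []byte, col C) ([]P, error)
}

// genericDecoder holds the logic that is currently copy-pasted into
// the streams, logs, and pointers decoders.
type genericDecoder[C, P any] struct {
	codec metadataCodec[C, P]
	meta  []byte // raw section metadata bytes
}

func (d *genericDecoder[C, P]) Columns(_ context.Context) ([]C, error) {
	return d.codec.DecodeColumns(d.meta)
}

func (d *genericDecoder[C, P]) Pages(_ context.Context, col C) ([]P, error) {
	return d.codec.DecodePages(d.meta, col)
}
```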

Member
@rfratto rfratto left a comment

Overall LGTM for us to start experimenting with. I'm a little concerned about the cognitive cost of having a pointer be used for indexing two distinct things, but I think we can make that easier to reason about by tweaking the columns or even the API a little (see my comment below).

Obviously I know where to look when reviewing a new dataset-based section, so I was able to easily ignore all the boilerplate. If we think we're going to be adding many more sections, it might be time to start thinking about how to deduplicate the logic between them. I don't think that needs to block this PR, though; the duplicated code is less of a maintenance burden and more of an implementation burden.

COLUMN_TYPE_UNCOMPRESSED_SIZE = 8;

// COLUMN_TYPE_COLUMN_NAME is the name of a column this object is storing indexes about
COLUMN_TYPE_COLUMN_NAME = 9;
Member

We might want to include column index here as well; not all columns have names, and column names aren't guaranteed to be unique (which will start coming into play in the future once we store parsed fields).

Comment on lines +22 to +23
// 1. Stream indexing metadata
// 2. Column indexing metadata
Member

Hm, I do find it a little confusing that a pointer is to either a stream or a column. I think there are a few options to make that clearer:

  • Store a column indicating the type of row (STREAM_POINTER vs COLUMN_POINTER), or
  • Expose a method on ObjPointer to compute the type of row (StreamPointer vs ColumnPointer), or
  • Make it clearer in the comment here that the type of index information is either/or per row and never both, or
  • Split out into two sections: one for stream pointers and another for column pointers.
benclive (Contributor, Author)

I opted for "Store a column indicating the type of row (STREAM_POINTER vs COLUMN_POINTER)".
I'm not sure how much clearer it made things: a follow-up PR will actually use this section in a builder, so I might revisit it there if I think of a better way.
The only real reason to have a single section is to reduce duplicate code and boilerplate. If extracting all the duplicated code achieves that anyway, I doubt there is much harm in having different sections for different kinds of index.
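
A minimal sketch of that discriminator approach, building on the hypothetical SectionPointer struct from the earlier sketch; PointerKind, the constants, and the method are illustrative names, not the merged code.

```go
// PointerKind discriminates the two mutually exclusive row types.
type PointerKind int32

const (
	PointerKindInvalid PointerKind = iota
	PointerKindStream              // row carries stream indexing metadata
	PointerKindColumn              // row carries column indexing metadata (blooms)
)

// Kind reports which group of fields a row populates; readers can
// filter on this before touching the rest of the row.
func (p *SectionPointer) Kind() PointerKind {
	if p.Column != "" {
		return PointerKindColumn
	}
	return PointerKindStream
}
```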

Member

Do we need this file? I'm hoping to phase out the iter.go files in the other sections in favour of using a reader.

benclive (Contributor, Author)

It's not really needed; I just copied it from the existing streams implementation, which may also have changed in the meantime. I want to use the column reader at some point, so I'll clean it up then.

)

// RowReader reads the set of pointers from an [Object].
type RowReader struct {
Member

nit: if you don't suspect we'd ever need columnar readers for the pointers section, we can just call this Reader.

I plan to eventually drop RowReader from the streams/logs sections (but that's not to say other sections can't have row-based readers).

benclive (Contributor, Author)

I do plan to use the column reader so I can avoid reading log lines and make building a bit faster. It's currently ~40 seconds to process a full object in my local testing.
That said, there will likely come a time in the near future when we'll want to parse log lines for things like UUIDs, so maybe I'll end up using the row reader then, since I'll be looking at all the columns. TBD!
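
A hedged sketch of why the columnar path helps here: only the columns feeding the index are materialized, and the (large) log line column is never read. The columnReader interface below is a hypothetical stand-in for whatever the dataobj column reader ends up exposing.

```go
package pointers

import "context"

// columnReader is an assumed interface: it returns every value of one
// named column, in row order, without touching any other column.
type columnReader interface {
	ReadColumn(ctx context.Context, name string) ([][]byte, error)
}

// bloomInputs gathers the values of a single column to feed its bloom
// filter; the log line column is never materialized.
func bloomInputs(ctx context.Context, r columnReader, column string) ([][]byte, error) {
	return r.ReadColumn(ctx, column)
}
```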

@benclive benclive requested review from chaudum and rfratto June 26, 2025 17:16
@@ -13,12 +13,21 @@ import (
"github.com/grafana/loki/v3/pkg/dataobj"
)

<<<<<<< Updated upstream
Member

@benclive bad merge conflict resolution here

Member
@rfratto rfratto left a comment

Still LGTM 😛

@benclive benclive enabled auto-merge (squash) June 27, 2025 12:58
@benclive benclive merged commit 9642af3 into main Jun 27, 2025
65 checks passed
@benclive benclive deleted the benclive/add-new-pointers-section-to-dataobj branch June 27, 2025 13:00