Disk usage don't include synthetic _id postings #138745
Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing)
Hi @burqen, I've created a changelog YAML for you.
I didn't add any tests for it but it will be tested through
        continue;
    }
    if (SyntheticIdField.hasSyntheticIdAttributes(field.attributes())) {
        // Synthetic _id field doesn't have an inverted index stored on disk,
Maybe we can assert that the terms class is not any of the ones expected in getBlockTermState?
Yes, some assertion here is a good idea.
I would rather assert that the Terms / TermsEnum / TermState is of the class we expect, but I cannot do that on this branch: that will change in later PRs, and we do not even reach this if-branch on this git branch (#overloaded). Perhaps we can add that assertion in later PRs?
I'll at least assert that the field is _id as expected.
…ndex-disk-size-analysis-for-synthetic-id
Synthetic _id fields don't have an inverted index, but pretend to have one on the read path by injecting IndexOptions.DOCS on the _id field. We short-circuit IndexDiskUsageAnalyzer if we see a synthetic _id field.
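
The short circuit described above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: FieldMeta, analyzePostings, and the attribute key are hypothetical stand-ins for Lucene's field metadata, the analyzer's postings pass, and whatever SyntheticIdField.hasSyntheticIdAttributes actually checks. It only shows the idea of skipping the postings accounting for a field flagged as synthetic _id, plus the "field is _id" assertion mentioned in the review thread.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SyntheticIdDiskUsageSketch {

    // Hypothetical attribute key, used only for this illustration.
    static final String SYNTHETIC_ID_ATTRIBUTE = "hypothetical.synthetic_id";

    // Simplified stand-in for field metadata plus the postings size
    // that a naive analysis of the read path would report.
    static final class FieldMeta {
        final String name;
        final Map<String, String> attributes;
        final long reportedPostingsBytes;

        FieldMeta(String name, Map<String, String> attributes, long reportedPostingsBytes) {
            this.name = name;
            this.attributes = attributes;
            this.reportedPostingsBytes = reportedPostingsBytes;
        }
    }

    // Assumed shape of the check: the field carries a marker attribute.
    static boolean hasSyntheticIdAttributes(Map<String, String> attributes) {
        return "true".equals(attributes.get(SYNTHETIC_ID_ATTRIBUTE));
    }

    // Sums postings bytes per field, skipping synthetic _id fields because
    // they have no inverted index stored on disk.
    static Map<String, Long> analyzePostings(List<FieldMeta> fields) {
        Map<String, Long> bytesPerField = new LinkedHashMap<>();
        for (FieldMeta field : fields) {
            if (hasSyntheticIdAttributes(field.attributes)) {
                // Only the _id field is expected to carry the marker.
                assert "_id".equals(field.name) : "unexpected synthetic field: " + field.name;
                continue; // short circuit: nothing on disk to measure
            }
            bytesPerField.merge(field.name, field.reportedPostingsBytes, Long::sum);
        }
        return bytesPerField;
    }

    public static void main(String[] args) {
        List<FieldMeta> fields = List.of(
            new FieldMeta("_id", Map.of(SYNTHETIC_ID_ATTRIBUTE, "true"), 4096L),
            new FieldMeta("message", Map.of(), 1024L)
        );
        // Only "message" contributes to the postings total.
        System.out.println(analyzePostings(fields));
    }
}
```

In this sketch the check happens before any terms are read, so the pretended IndexOptions.DOCS on the read path never contributes to the reported size.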