Provide defaults for index sort settings#135886
Conversation
|
Hi @jordan-powers, I've created a changelog YAML for you. |
| return new FieldSortSpec[0]; | ||
| } | ||
|
|
||
| String indexMode = settings.get(IndexSettings.MODE.getKey()); |
There was a problem hiding this comment.
Can't use IndexSettings.MODE.get(settings) here because the validation logic for IndexSettings.MODE uses the default value of index.sort.*, which causes infinite recursion (since we're already in the default value provider for those settings).
So we need to get the mode while bypassing the validation.
There was a problem hiding this comment.
Maybe add this also as a comment?
| List<String> asList = defaultSettings.getAsList(getKey(), null); | ||
| if (asList == null) { | ||
| builder.putList(getKey(), defaultStringValue.apply(defaultSettings)); | ||
| builder.putList(getKey(), defaultStringValue.apply(source)); |
There was a problem hiding this comment.
I think this is a bug, but I'm not sure why we never saw it before. Maybe we never had string list settings whose default value depended on other settings before?
There was a problem hiding this comment.
Nice find, definitely looks like a bug to me
There was a problem hiding this comment.
Good find, I agree that the current behaviour isn't correct and actual settings need to be pushed down instead of default settings..
|
Pinging @elastic/es-storage-engine (Team:StorageEngine) |
| return Arrays.stream(getDefaultSortSpecs(settings)).map(sortSpec -> sortSpec.field).toList(); | ||
| } | ||
|
|
||
| public static List<String> getDefaultSortOrder(Settings settings) { |
There was a problem hiding this comment.
I believe there are situations where non-default and default settings can be mixed incorrectly. For example, if mode=logsdb, and only a sort field is set, the order, mode, missing settings will use the defaults for timestamp sort.
this test (added to 80_index_sort_defaults.yml) shows the issue:
---
create logsdb data stream with non-default sort field:
- do:
cluster.put_component_template:
name: "logsdb-mappings"
body:
template:
settings:
mode: "logsdb"
index.sort.field: ["some_field"]
mappings:
properties:
some_field:
type: "keyword"
"@timestamp":
type: "date"
- do:
indices.put_index_template:
name: "logsdb-index-template"
body:
index_patterns: ["logsdb"]
data_stream: {}
composed_of: ["logsdb-mappings"]
allowed_warnings:
- "index template [logsdb-index-template] has index patterns [logsdb] matching patterns from existing older templates [global] with patterns (global => [*]); this template [logsdb-index-template] will take precedence during new index creation"
- do:
indices.create_data_stream:
name: "logsdb"
- is_true: acknowledged
- do:
indices.get_data_stream:
name: "logsdb"
expand_wildcards: hidden
- length: { data_streams: 1 }
- set: { data_streams.0.indices.0.index_name: backing_index }
- do:
indices.get_settings:
index: $backing_index
include_defaults: true
- match: { .$backing_index.settings.index.mode: logsdb }
- match: { .$backing_index.settings.index.sort.field: [ "some_field" ] }
- match: { .$backing_index.defaults.index.sort.order: [ "asc" ] }
- match: { .$backing_index.defaults.index.sort.mode: [ "min" ] }
- match: { .$backing_index.defaults.index.sort.missing: [ "_last" ] }
There was a problem hiding this comment.
Good catch. The current logic does ensure the defaults aren't actually applied, but they're still reported via the settings api. I'll update the logic so that if a custom sort is defined, the defaults match index.sort.fields.
martijnvg
left a comment
There was a problem hiding this comment.
This looks good Jordan! I left a few questions.
| List<String> asList = defaultSettings.getAsList(getKey(), null); | ||
| if (asList == null) { | ||
| builder.putList(getKey(), defaultStringValue.apply(defaultSettings)); | ||
| builder.putList(getKey(), defaultStringValue.apply(source)); |
There was a problem hiding this comment.
Good find, I agree that the current behaviour isn't correct and actual settings need to be pushed down instead of default settings..
| return new FieldSortSpec[0]; | ||
| } | ||
|
|
||
| String indexMode = settings.get(IndexSettings.MODE.getKey()); |
There was a problem hiding this comment.
Maybe add this also as a comment?
| - match: { .$backing_index.mappings.properties.@timestamp.type: date } | ||
| - match: { .$backing_index.mappings.properties.host.properties.name.type: keyword } | ||
| - match: { .$backing_index.mappings.properties.host.properties.name.ignore_above: 1024 } | ||
| - match: { .$backing_index.mappings.properties.host: null } |
There was a problem hiding this comment.
Why does this need to be changed? iirc these fields get injected by default if index mode is logsdb?
There was a problem hiding this comment.
The update I made to LogsdbIndexModeSettingsProvider was causing this test to fail.
We don't inject the host.name mapper when there is a custom sort configured. In this test, we configure index.sort.field to be an empty array. Previously, we checked for a custom sort by checking if index.sort.field is non-empty. We can't do that anymore because index.sort.field now has a proper default and will be sometimes non-empty even when there is no custom sort configured. Now, we check for a custom sort by actually checking if index.sort.field is configured.
| - match: { .$backing_index.defaults.index.sort.mode: [ "max" ] } | ||
| - match: { .$backing_index.defaults.index.sort.missing: [ "_last" ] } | ||
| --- | ||
| create logsdb data stream with host as text and name as double: |
There was a problem hiding this comment.
This is a strange setup, but I see how this can work and using host.name as primary sort field.
There was a problem hiding this comment.
Yeah it's definitely a strange setup, but I found it tested in 30_logsdb_default_mapping, and I figured it couldn't hurt to test the edge case.
| "mode": "time_series", | ||
| "sort.field": [], | ||
| "sort.order": [] | ||
| "sort.field": ["_tsid", "@timestamp"], |
There was a problem hiding this comment.
Shouldn't the sort.field / sort.order index settings be derived from the defaults? Or maybe the settings shouldn't be defined in this test?
There was a problem hiding this comment.
Omitting the sort.field and sort.order results in the error updating component template [logs@custom] results in invalid composable template [logs-apm.error@template] after templates are merged.
From what I can tell, the logs-apm.error@template template is composed of logs@custom which we define here, and apm@settings which contains a non-default index sort configuration. Since this this template sets the index mode to time_series, we need to override the index sort from apm@settings with another index sort here.
💔 Backport failed
You can use sqren/backport to manually backport by running |
|
After some discussion, we decided not to backport this. |
This PR optimizes the default providers for the sort settings index.sort.field, index.sort.order, index.sort.mode, and index.sort.missing to fix a performance regression in the many-shards-quantitative benchmark associated with the introduction of these default providers in #135886.
This PR configures default values for the index settings
index.sort.field,index.sort.order,index.sort.mode, andindex.sort.missing.Fixes #129062