Conversation
Adding tests for new zone awareness behavior. Signed-off-by: Michael Montgomery <mmontg1@gmail.com>
Potential pod deadlock when zone-aware NodeSets use different topology keys

Example configuration that triggers this (passes validation):

```yaml
nodeSets:
  - name: hot
    zoneAwareness:
      topologyKey: "custom.io/rack"
  - name: warm
    zoneAwareness:
      topologyKey: "custom.io/room"
```

Because the operator uses a single shared init script and annotation set for the entire cluster, mixed topology keys mean a Pod's init container can wait for a node-label annotation that is never patched onto its node, blocking startup indefinitely.

Possible fixes: either tighten validation to require all zone-aware NodeSets share the same topology key, or make the init script / annotation system per-NodeSet-aware.
nodesets. Move review changes.

I required the same topology key here.

Deal with more ordering issues causing rolling restarts.

Use given context. Add some tests.
Summary
Resolves #3933
This PR introduces `spec.nodeSets[].zoneAwareness` as first-class API support for Elasticsearch zone awareness in ECK. The goal is to make the common zone-aware deployment path concise and safe by having the operator automatically wire:

- a default topology spread constraint (`maxSkew=1`, `whenUnsatisfiable=DoNotSchedule`),
- node affinity rules (when `zones` are specified),
- `ZONE` env var injection,
- `node.attr.zone` and `cluster.routing.allocation.awareness.attributes` Elasticsearch config defaults.

This replaces the error-prone manual wiring across `config`, `podTemplate`, and the `eck.k8s.elastic.co/downward-node-labels` annotation while preserving user override behavior where appropriate.

Minimal example
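The example manifest itself did not survive extraction; a minimal sketch of the shape it likely takes (cluster name and version are assumptions):

```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart        # assumed name
spec:
  version: 8.17.0         # assumed version
  nodeSets:
    - name: default
      count: 3
      zoneAwareness: {}   # all defaults: topologyKey=topology.kubernetes.io/zone, no zone restriction
```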
With explicit zone restriction
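This example is also missing from the extracted text; a sketch assuming three hypothetical zone names:

```yaml
nodeSets:
  - name: default
    count: 3
    zoneAwareness:
      topologyKey: topology.kubernetes.io/zone   # the default, shown for clarity
      zones:                                     # zone names are assumptions
        - europe-west1-b
        - europe-west1-c
        - europe-west1-d
```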
To customize `maxSkew` or `whenUnsatisfiable`, provide a matching topology spread constraint via `podTemplate.spec.topologySpreadConstraints` with the same `topologyKey`. The operator will preserve the user-provided constraint and not inject its own default for that key.

Risks

- Enabling `zoneAwareness` on any NodeSet changes the StatefulSet Pod template (topology spread constraints, `ZONE` env var, node affinity rules, `node.attr.zone`/`cluster.routing.allocation.awareness.attributes` config defaults). StatefulSet template changes trigger a controlled rolling restart.
- When any NodeSet enables `zoneAwareness`, all NodeSets in the cluster receive zone-related Elasticsearch config (`node.attr.zone: ${ZONE}`, `cluster.routing.allocation.awareness.attributes: k8s_node_name,zone`) and the `ZONE` env var. This ensures shard awareness is consistent across the cluster but means non-zoneAware NodeSets also gain these settings on first adoption.

What Changed
API / CRD
- New `nodeSets[].zoneAwareness` field (`ZoneAwareness` struct) with two optional sub-fields:
  - `topologyKey` (defaults to `topology.kubernetes.io/zone`)
  - `zones` (optional list; `+listType=set`, `MinItems=1` when specified)
- New helper methods: `TopologyKeyOrDefault()`, `HasZoneAwareness()`, `ZoneAwarenessTopologyKey()`.
- Extended `DownwardNodeLabels()` and `DownwardNodeLabelsHashInput()` to include zone-awareness-derived topology keys alongside the existing annotation-based labels. The hash input preserves the raw annotation value for backward compatibility to avoid spurious rolling restarts on operator upgrade.

Scheduling
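The default constraint described in this section corresponds roughly to the following Pod-template fragment (the label selector is an assumption about how the operator scopes the constraint to the cluster's Pods):

```yaml
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        elasticsearch.k8s.elastic.co/cluster-name: quickstart   # assumed selector and name
```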
- The operator injects a default `TopologySpreadConstraint` with `maxSkew=1`, `whenUnsatisfiable=DoNotSchedule` for the configured topology key. User-provided constraints for the same key are preserved (the operator skips its default).
- A required node affinity `In` expression is added when `zones` are specified, restricting placement to the listed zone values.
- A required node affinity `Exists` guard is added on the cluster topology key, ensuring pods land only on nodes that carry the label (prevents init-container deadlock from missing annotations).

Elasticsearch config
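For reference, the merged defaults in this section amount to this `elasticsearch.yml` fragment (values taken verbatim from the description above):

```yaml
node.attr.zone: ${ZONE}
cluster.routing.allocation.awareness.attributes: k8s_node_name,zone
```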
When `clusterHasZoneAwareness` is true, the operator merges zone awareness defaults (`node.attr.zone: ${ZONE}`, `cluster.routing.allocation.awareness.attributes: k8s_node_name,zone`) into the ES config for all NodeSets, before user config is applied. Explicit user config takes precedence.

Init container / scripts ConfigMap
`DownwardNodeLabels()` now includes the zone-awareness topology key. The shared init script waits for all expected annotations (including the zone topology key annotation) before the Elasticsearch container starts. The Downward API volume and operator-side annotation patching provide the values.

PodTemplateBuilder additions

- `WithTopologySpreadConstraints(...)`: appends constraints only when no existing constraint shares the same `topologyKey`. Fills in a missing `LabelSelector` on an existing constraint when the provided constraint has one.
- `WithRequiredNodeAffinityMatchExpressions(...)`: appends required node selector requirements by key without duplicating. For the `Exists` operator, skips only when an existing expression on the same key already guarantees label existence (`In`, `Exists`, `Gt`, `Lt`).
- `WithPreferredNodeAffinityMatchExpressions(...)`: appends preferred scheduling terms by key.

Validation
- All zone-aware NodeSets must share the same topology key; the `validZoneAwarenessTopologyKeys` validation enforces this.
- Zone-awareness topology keys are allowed independently of `--exposed-node-labels` when the flag is empty (the default). When `--exposed-node-labels` is explicitly configured, zone-awareness topology keys must also be allowed by that policy. This allows zone awareness to work out of the box without requiring flag configuration.

Sample manifest
Updated `config/samples/elasticsearch/elasticsearch.yaml` with a commented-out `zoneAwareness` example.

Assumptions and Edge-Case Policy
Non-zoneAware NodeSets when others are zoneAware
This is allowed. When at least one NodeSet enables zone awareness and another does not:
- Non-zoneAware NodeSets still receive the zone-related settings (`ZONE` env var, `node.attr.zone`, allocation awareness attributes) for cluster-wide consistency.
- They also get the `Exists` guard on the cluster topology key, ensuring pods only schedule on nodes that carry the label.

Mixed topology keys across NodeSets
Not allowed. All zone-aware NodeSets must share the same topology key. This is enforced by the validating webhook. The restriction exists because the operator uses a single shared init script and annotation set for the entire cluster.
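For example, a spec like the following (different keys across zone-aware NodeSets) is rejected by the validating webhook:

```yaml
nodeSets:
  - name: hot
    zoneAwareness:
      topologyKey: topology.kubernetes.io/zone
  - name: warm
    zoneAwareness:
      topologyKey: custom.io/rack   # differs from the other NodeSet's key: rejected
```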
User overrides
- A user-defined `ZONE` env var in `podTemplate` is preserved.
- Explicit user Elasticsearch config (`node.attr.zone`, `cluster.routing.allocation.awareness.attributes`) takes precedence over operator defaults.

Backward Compatibility
- Adding `zoneAwareness` is incremental and optional.
- Enabling `zoneAwareness` on any NodeSet triggers a one-time rolling restart of all NodeSets in the cluster due to the addition of zone-related config and env vars.
- The new `DownwardNodeLabelsHashInput()` method preserves the raw annotation value for hash computation, avoiding spurious rolling restarts for existing clusters that already use the `eck.k8s.elastic.co/downward-node-labels` annotation without `zoneAwareness`.

AI Usage
Cursor/gpt-5.3 codex used during:
Significant manual cleanup was required after AI-generated code and tests.