Add settings for health indicator shard_capacity thresholds#136141

Merged
samxbr merged 30 commits into elastic:main from samxbr:feature/health-shard-capacity-settings
Oct 30, 2025

Contributor

@samxbr samxbr commented Oct 7, 2025

Add dynamic settings that let users configure the unhealthy thresholds (yellow/red) of the shard capacity health indicator. They replace the current hard-coded values:

  • health.shard_capacity.unhealthy_threshold.yellow (default 10)
  • health.shard_capacity.unhealthy_threshold.red (default 5)

Behavior:

  • both thresholds must be positive integers
  • the RED threshold must be smaller than the YELLOW threshold, so the health indicator progresses in order from GREEN -> YELLOW -> RED
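
The two rules above can be sketched as a small validation step. This is only an illustration under stated assumptions: the class name `ShardCapacityThresholds` and its accessors are hypothetical and are not taken from the PR, which registers the values as Elasticsearch cluster settings rather than as a standalone class.

```java
// Hypothetical holder for the two thresholds; not the PR's actual class.
final class ShardCapacityThresholds {
    private final int yellow;
    private final int red;

    ShardCapacityThresholds(int yellow, int red) {
        // Rule 1: both thresholds must be positive integers.
        if (yellow <= 0 || red <= 0) {
            throw new IllegalArgumentException(
                "thresholds must be positive, got yellow=" + yellow + ", red=" + red);
        }
        // Rule 2: RED must be strictly smaller than YELLOW.
        if (red >= yellow) {
            throw new IllegalArgumentException(
                "red threshold [" + red + "] must be smaller than yellow threshold [" + yellow + "]");
        }
        this.yellow = yellow;
        this.red = red;
    }

    int yellow() {
        return yellow;
    }

    int red() {
        return red;
    }
}
```

With the defaults from the description, `new ShardCapacityThresholds(10, 5)` is accepted, while `new ShardCapacityThresholds(5, 5)` and `new ShardCapacityThresholds(10, 0)` are rejected.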

Closes #116697

@samxbr samxbr added the :Distributed/Health Issues for the health report API label Oct 7, 2025
github-actions bot commented Oct 8, 2025

ℹ️ Important: Docs version tagging

👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version.

We use applies_to tags to mark version-specific features and changes.

When to use applies_to tags:

✅ At the page level to indicate which products/deployments the content applies to (mandatory)
✅ When features change state (e.g. preview, ga) in a specific version
✅ When availability differs across deployments and environments

What NOT to do:

❌ Don't remove or replace information that applies to an older version
❌ Don't add new information that applies to a specific version without an applies_to tag
❌ Don't forget that applies_to tags can be used at the page, section, and inline level

@samxbr samxbr requested a review from Copilot October 8, 2025 23:43
Contributor

Copilot AI left a comment

Pull Request Overview

This PR introduces configurable thresholds for the shard capacity health indicator, allowing administrators to customize when the indicator reports YELLOW and RED statuses. Previously, these thresholds were hardcoded to 10 and 5 remaining shards respectively.

  • Adds two new dynamic cluster settings for configuring YELLOW and RED thresholds
  • Updates the ShardsCapacityHealthIndicatorService to use configurable thresholds instead of hardcoded values
  • Includes comprehensive test coverage and documentation for the new settings
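
As a rough illustration of how configurable thresholds could map remaining shard capacity to a status, here is a minimal sketch. The class `ShardCapacityStatusSketch`, the method `statusFor`, and the strict `<` comparisons at the boundaries are all assumptions made for this example; the actual `ShardsCapacityHealthIndicatorService` logic may differ.

```java
// Hypothetical sketch; names and boundary semantics are assumptions, not the PR's code.
final class ShardCapacityStatusSketch {
    enum Status { GREEN, YELLOW, RED }

    // remainingShards: how many more shards the cluster can still accept.
    static Status statusFor(int remainingShards, int yellowThreshold, int redThreshold) {
        // Check the stricter (smaller) RED threshold first.
        if (remainingShards < redThreshold) {
            return Status.RED;
        }
        if (remainingShards < yellowThreshold) {
            return Status.YELLOW;
        }
        return Status.GREEN;
    }
}
```

Because the RED threshold is required to be smaller than the YELLOW threshold, checking RED first means shrinking capacity moves the status from GREEN through YELLOW to RED, never the other way around.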

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| ShardsCapacityHealthIndicatorService.java | Adds configurable threshold settings with validation and dynamic updates |
| ShardsCapacityHealthIndicatorServiceTests.java | Updates all tests to use the new constructor and adds threshold validation tests |
| NodeConstruction.java | Updates service instantiation to pass settings parameter |
| ClusterSettings.java | Registers the new settings with cluster configuration |
| HealthFeatures.java | New feature specification for the settings |
| module-info.java | Exports the new HealthFeatures class |
| META-INF/services/org.elasticsearch.features.FeatureSpecification | Registers the HealthFeatures service |
| health-diagnostic-settings.md | Documents the new configuration settings |
| 30_feature.yml | Adds integration tests for the new settings functionality |

samxbr and others added 3 commits October 8, 2025 20:08
…ityHealthIndicatorService.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@elasticsearchmachine
Collaborator

Hi @samxbr, I've created a changelog YAML for you.

@samxbr samxbr marked this pull request as ready for review October 9, 2025 15:02
@samxbr samxbr requested a review from a team as a code owner October 9, 2025 15:03
@samxbr samxbr requested a review from a team October 9, 2025 15:15
@elasticsearchmachine elasticsearchmachine added the Team:Data Management (obsolete) DO NOT USE. This team no longer exists. label Oct 9, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

Contributor

@seanzatzdev seanzatzdev left a comment

LGTM!

Contributor

@nielsbauman nielsbauman left a comment

I left one more small cleanup comment, but other than that LGTM.
Thanks a lot for working and iterating on this!

}

- static HealthIndicatorDetails buildDetails(List<ShardLimitValidator.Result> results) {
+ static HealthIndicatorDetails buildDetails(List<ShardLimitValidator.Result> results, HealthMetadata healthMetadata) {
Contributor

We don't need this healthMetadata anymore.

@samxbr samxbr merged commit b3ebfd1 into elastic:main Oct 30, 2025
34 checks passed
@samxbr samxbr deleted the feature/health-shard-capacity-settings branch October 30, 2025 13:30
chrisparrinello pushed a commit to chrisparrinello/elasticsearch that referenced this pull request Nov 3, 2025
Labels

  • :Distributed/Health (Issues for the health report API)
  • >enhancement
  • Team:Data Management (obsolete) (DO NOT USE. This team no longer exists.)
  • v9.3.0

5 participants