Allow missing shard stats for restarted nodes for _snapshot/_status#128399
Allow missing shard stats for restarted nodes for _snapshot/_status#128399JeremyDahlgren merged 19 commits intoelastic:mainfrom
_snapshot/_status#128399Conversation
Returns an empty shard stats for shard entries where stats were unavailable in the case where a node has been restarted or left the cluster. The change adds a 'description' field to the SnapshotIndexShardStatus class that is used to include a message indicating why the stats are empty. This change was motivated by a desire to reduce latency for getting the stats for currently running snapshots. The stats can still be loaded from the repository via a _snapshot/<repository>/snapshot/_status call. Closes ES-10982
...elasticsearch/action/admin/cluster/snapshots/status/TransportSnapshotsStatusActionTests.java
Outdated
Show resolved
Hide resolved
...elasticsearch/action/admin/cluster/snapshots/status/TransportSnapshotsStatusActionTests.java
Outdated
Show resolved
Hide resolved
server/src/internalClusterTest/java/org/elasticsearch/snapshots/SnapshotStatusApisIT.java
Outdated
Show resolved
Hide resolved
server/src/internalClusterTest/java/org/elasticsearch/snapshots/SnapshotStatusApisIT.java
Outdated
Show resolved
Hide resolved
.../org/elasticsearch/action/admin/cluster/snapshots/status/TransportSnapshotsStatusAction.java
Outdated
Show resolved
Hide resolved
|
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination) |
|
Hi @JeremyDahlgren, I've created a changelog YAML for you. |
There was a problem hiding this comment.
Looks good. My only remaining concern is the to/fromXContent code. I never got that familiar with the logic. I think that it might be OK as is, since it's just reading and writing values. But the testing appears to be in SnapshotStatsTests via the createTestInstance() implementation and extending AbstractXContentTestCase. I wonder what would happen if the random number generation in createTestInstance() were extended to -1? For reference I was looking at the original commit that added the XContent code
...n/java/org/elasticsearch/action/admin/cluster/snapshots/status/SnapshotIndexShardStatus.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/action/admin/cluster/snapshots/status/SnapshotStats.java
Show resolved
Hide resolved
server/src/internalClusterTest/java/org/elasticsearch/snapshots/SnapshotStatusApisIT.java
Outdated
Show resolved
Hide resolved
server/src/internalClusterTest/java/org/elasticsearch/snapshots/SnapshotStatusApisIT.java
Show resolved
Hide resolved
I added an explicit test case for |
DiannaHohensee
left a comment
There was a problem hiding this comment.
I added an explicit test case for XContent round trip serialization for the missing stats, which failed initially and I added a check for in the deserialization. Thanks for raising this. I also added an explicit XContent serialization test for the missing stats static method in SnapshotIndexShardStatus. The random number generation in createTestInstance() seems ok with the zero lower bound for normal values, while we use the -1 values as a special case for "missing" stats.
Clever test extension, I like it. Yeah, your solution makes more sense, to test the particular set of missingStats values, not randomize -1 values with a bunch of other numbers, which wouldn't happen.
My last question is about those zeros you found in the fromXContent method, wondering where they got set (or the absence of being set, really) that way.
...n/java/org/elasticsearch/action/admin/cluster/snapshots/status/SnapshotIndexShardStatus.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/action/admin/cluster/snapshots/status/SnapshotStats.java
Outdated
Show resolved
Hide resolved
...n/java/org/elasticsearch/action/admin/cluster/snapshots/status/SnapshotIndexShardStatus.java
Outdated
Show resolved
Hide resolved
...n/java/org/elasticsearch/action/admin/cluster/snapshots/status/SnapshotIndexShardStatus.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/action/admin/cluster/snapshots/status/SnapshotStats.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/action/admin/cluster/snapshots/status/SnapshotStats.java
Outdated
Show resolved
Hide resolved
.../org/elasticsearch/action/admin/cluster/snapshots/status/TransportSnapshotsStatusAction.java
Outdated
Show resolved
Hide resolved
…apshots/status/SnapshotStats.java Adjust forMissingStats() javadoc Co-authored-by: Dianna Hohensee <artemisapple@gmail.com>
DiannaHohensee
left a comment
There was a problem hiding this comment.
LGTM, without the pre-existing behavior fix in b1fc83c
|
still lgtm 👍 |
…lastic#128399) Returns an empty shard stats for shard entries where stats were unavailable in the case where a node has been restarted or left the cluster. The change adds a 'description' field to the SnapshotIndexShardStatus class that is used to include a message indicating why the stats are empty. This change was motivated by a desire to reduce latency for getting the stats for currently running snapshots. The stats can still be loaded from the repository via a _snapshot/<repository>/snapshot/_status call. Closes ES-10982 Co-authored-by: Dianna Hohensee <artemisapple@gmail.com>
…lastic#128399) Returns an empty shard stats for shard entries where stats were unavailable in the case where a node has been restarted or left the cluster. The change adds a 'description' field to the SnapshotIndexShardStatus class that is used to include a message indicating why the stats are empty. This change was motivated by a desire to reduce latency for getting the stats for currently running snapshots. The stats can still be loaded from the repository via a _snapshot/<repository>/snapshot/_status call. Closes ES-10982 Co-authored-by: Dianna Hohensee <artemisapple@gmail.com>
Returns an empty shard stats for shard entries where stats were unavailable in the case where a node has been restarted or left the cluster. The change adds a
descriptionfield to theSnapshotIndexShardStatusclass that is used to include a message indicating why the stats are empty. To reduce confusion when stats are empty the stats values will all be-1except for the time values which will be zero. This is in addition to the newdescriptionfield.This change was motivated by a desire to reduce latency for getting the stats for currently running snapshots. The stats can still be loaded later from the repository via a
_snapshot/<repository>/snapshot/_statuscall when the snapshot has completed.An API docs update is in elasticsearch-specification PR #4410.
Closes ES-10982