Memberlist: Options to control handling of dead nodes and name reuse #2131
Merged 6 commits on Feb 17, 2020
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -31,6 +31,7 @@
* `-server.grpc.keepalive.timeout`
* [ENHANCEMENT] PostgreSQL: Bump up `github.com/lib/pq` from `v1.0.0` to `v1.3.0` to support PostgreSQL SCRAM-SHA-256 authentication. #2097
* [ENHANCEMENT] Cassandra: Users no longer need `CREATE` privilege on `<all keyspaces>` if given keyspace exists. #2032
* [ENHANCEMENT] Experimental Memberlist KV: expose `-memberlist.gossip-to-dead-nodes-time` and `-memberlist.dead-node-reclaim-time` options to control how memberlist library handles dead nodes and name reuse. #2131
* [BUGFIX] Alertmanager: fixed panic upon applying a new config, caused by duplicate metrics registration in the `NewPipelineBuilder` function. #211
* [BUGFIX] Experimental TSDB: fixed `/all_user_stats` and `/api/prom/user_stats` endpoints when using the experimental TSDB blocks storage. #2042
* [BUGFIX] Experimental TSDB: fixed ruler to correctly work with the experimental TSDB blocks storage. #2101
7 changes: 6 additions & 1 deletion docs/configuration/arguments.md
@@ -182,7 +182,12 @@ Flags for configuring KV store based on memberlist library. This feature is expe
Timeout for writing 'packet' data.
- `memberlist.transport-debug`
Log debug transport messages. Note: global log.level must be at debug level as well.

- `memberlist.gossip-to-dead-nodes-time`
How long to keep gossiping to nodes that appear to be dead. After this time, the dead node is removed from the list of nodes. If the "dead" node appears again, it simply rejoins the cluster, provided its name has not been reused by another node in the meantime. If the name has been reused, the reanimated node is ignored by the other members.
- `memberlist.dead-node-reclaim-time`
How soon a dead node's name can be reused by a new node (using a different IP address). Disabled by default, in which case name reclaim is not allowed until `gossip-to-dead-nodes-time` expires. Setting this to a low value can be useful when node names are reused, e.g. in stateful sets.
If the memberlist library detects that a new node is trying to reuse the name of a previous node, it logs a message like this: `Conflicting address for ingester-6. Mine: 10.44.12.251:7946 Theirs: 10.44.12.54:7946 Old state: 2`. Node states are: "alive" = 0, "suspect" = 1 (not responding; will be marked as dead if it keeps failing to respond), "dead" = 2.
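
The name-reuse rule described above can be sketched as a small decision function. This is a simplified model written for illustration, not the memberlist library's code: `nodeState`, `canReclaimName`, and its parameters are hypothetical names, and the sketch only covers the case where the old entry is still in the member list (that is, before `gossip-to-dead-nodes-time` has expired).

```go
package main

import "time"

// nodeState mirrors the numeric states quoted in the log message above.
type nodeState int

const (
	stateAlive   nodeState = 0
	stateSuspect nodeState = 1
	stateDead    nodeState = 2
)

// canReclaimName is a hypothetical helper (not a memberlist API). It reports
// whether a node announcing an already-known name from a different address
// would be accepted under the rule described in the docs:
//   - reclaimTime <= 0 means reclaim is disabled (the default);
//   - otherwise the old entry must be dead for longer than reclaimTime.
func canReclaimName(old nodeState, sinceStateChange, reclaimTime time.Duration) bool {
	if reclaimTime <= 0 {
		return false
	}
	return old == stateDead && sinceStateChange > reclaimTime
}
```

When this function returns false for a node with a conflicting address, the situation corresponds to the `Conflicting address for ...` log message shown above.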

#### Multi KV

This is a special key-value implementation that uses two different KV stores (e.g. consul, etcd or memberlist). One of them is always marked as primary, and all reads and writes go to the primary store. The other one, the secondary, is only used for writes. The idea is that an operator can use the multi KV store to migrate from the primary to the secondary store at runtime.
10 changes: 10 additions & 0 deletions docs/configuration/config-file-reference.md
@@ -1756,6 +1756,16 @@ The `memberlist_config` configures the Gossip memberlist.
# CLI flag: -memberlist.gossip-nodes
[gossip_nodes: <int> | default = 0]

# How long to keep gossiping to dead nodes, to give them chance to refute their
# death. Uses memberlist LAN defaults if 0.
# CLI flag: -memberlist.gossip-to-dead-nodes-time
[gossip_to_dead_nodes_time: <duration> | default = 0s]

# How soon can dead node's name be reclaimed with new address. Defaults to 0,
# which is disabled.
# CLI flag: -memberlist.dead-node-reclaim-time
[dead_node_reclaim_time: <duration> | default = 0s]

# Other cluster members to join. Can be specified multiple times. Memberlist
# store is EXPERIMENTAL.
# CLI flag: -memberlist.join
22 changes: 16 additions & 6 deletions pkg/ring/kv/memberlist/memberlist_client.go
@@ -70,12 +70,14 @@ func (c *Client) WatchPrefix(ctx context.Context, prefix string, f func(string,
// KVConfig is a config for memberlist.KV
type KVConfig struct {
// Memberlist options.
NodeName string `yaml:"node_name"`
StreamTimeout time.Duration `yaml:"stream_timeout"`
RetransmitMult int `yaml:"retransmit_factor"`
PushPullInterval time.Duration `yaml:"pull_push_interval"`
GossipInterval time.Duration `yaml:"gossip_interval"`
GossipNodes int `yaml:"gossip_nodes"`
GossipToTheDeadTime time.Duration `yaml:"gossip_to_dead_nodes_time"`
DeadNodeReclaimTime time.Duration `yaml:"dead_node_reclaim_time"`

// List of members to join
JoinMembers flagext.StringSlice `yaml:"join_members"`
@@ -110,6 +112,8 @@ func (cfg *KVConfig) RegisterFlags(f *flag.FlagSet, prefix string) {
f.DurationVar(&cfg.GossipInterval, prefix+"memberlist.gossip-interval", 0, "How often to gossip. Uses memberlist LAN defaults if 0.")
f.IntVar(&cfg.GossipNodes, prefix+"memberlist.gossip-nodes", 0, "How many nodes to gossip to. Uses memberlist LAN defaults if 0.")
f.DurationVar(&cfg.PushPullInterval, prefix+"memberlist.pullpush-interval", 0, "How often to use pull/push sync. Uses memberlist LAN defaults if 0.")
f.DurationVar(&cfg.GossipToTheDeadTime, prefix+"memberlist.gossip-to-dead-nodes-time", 0, "How long to keep gossiping to dead nodes, to give them chance to refute their death. Uses memberlist LAN defaults if 0.")
f.DurationVar(&cfg.DeadNodeReclaimTime, prefix+"memberlist.dead-node-reclaim-time", 0, "How soon can dead node's name be reclaimed with new address. Defaults to 0, which is disabled.")

cfg.TCPTransport.RegisterFlags(f, prefix)
}
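
As a rough illustration of how the two new duration flags behave when parsed, here is a minimal, self-contained sketch using only the standard `flag` package. The `deadNodeOptions` type and `parseDeadNodeOptions` function are hypothetical and exist only for this example; the flag names and help strings are taken from the hunk above, with an empty prefix assumed.

```go
package main

import (
	"flag"
	"time"
)

// deadNodeOptions is a hypothetical, trimmed-down stand-in for KVConfig that
// holds only the two options added in this change.
type deadNodeOptions struct {
	GossipToTheDeadTime time.Duration
	DeadNodeReclaimTime time.Duration
}

// parseDeadNodeOptions registers both flags on a fresh FlagSet and parses
// args. Both flags default to 0, matching the registration in the diff.
func parseDeadNodeOptions(args []string) (deadNodeOptions, error) {
	var opts deadNodeOptions
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	fs.DurationVar(&opts.GossipToTheDeadTime, "memberlist.gossip-to-dead-nodes-time", 0,
		"How long to keep gossiping to dead nodes, to give them chance to refute their death. Uses memberlist LAN defaults if 0.")
	fs.DurationVar(&opts.DeadNodeReclaimTime, "memberlist.dead-node-reclaim-time", 0,
		"How soon can dead node's name be reclaimed with new address. Defaults to 0, which is disabled.")
	err := fs.Parse(args)
	return opts, err
}
```

Calling `parseDeadNodeOptions([]string{"-memberlist.dead-node-reclaim-time=5s"})` would leave `GossipToTheDeadTime` at zero and set `DeadNodeReclaimTime` to five seconds; values use Go's duration syntax (`30s`, `1m`, and so on).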
@@ -217,6 +221,12 @@ func NewKV(cfg KVConfig) (*KV, error) {
if cfg.GossipNodes != 0 {
mlCfg.GossipNodes = cfg.GossipNodes
}
if cfg.GossipToTheDeadTime > 0 {
mlCfg.GossipToTheDeadTime = cfg.GossipToTheDeadTime
}
if cfg.DeadNodeReclaimTime > 0 {
mlCfg.DeadNodeReclaimTime = cfg.DeadNodeReclaimTime
}
if cfg.NodeName != "" {
mlCfg.Name = cfg.NodeName
}
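
The hunk above follows an "override only when set" pattern: the library's defaults stay in place unless the user supplies a positive value. A minimal sketch of that pattern, with stand-in types so it compiles without the memberlist dependency, might look like the following. `libraryConfig`, `userConfig`, `defaultLibraryConfig`, and `applyOverrides` are hypothetical names, and the 30-second default is an assumption made for illustration rather than a value confirmed by this change.

```go
package main

import "time"

// libraryConfig is a hypothetical stand-in for the memberlist library config.
type libraryConfig struct {
	GossipToTheDeadTime time.Duration
	DeadNodeReclaimTime time.Duration
}

// userConfig is a hypothetical stand-in for the two new KVConfig fields.
type userConfig struct {
	GossipToTheDeadTime time.Duration
	DeadNodeReclaimTime time.Duration
}

// defaultLibraryConfig returns illustrative defaults. The 30s value is an
// assumption for this sketch; reclaim is left at 0 (disabled).
func defaultLibraryConfig() libraryConfig {
	return libraryConfig{GossipToTheDeadTime: 30 * time.Second}
}

// applyOverrides starts from the defaults and copies a user value only when
// it is strictly positive, mirroring the `> 0` checks in the hunk above.
func applyOverrides(user userConfig) libraryConfig {
	cfg := defaultLibraryConfig()
	if user.GossipToTheDeadTime > 0 {
		cfg.GossipToTheDeadTime = user.GossipToTheDeadTime
	}
	if user.DeadNodeReclaimTime > 0 {
		cfg.DeadNodeReclaimTime = user.DeadNodeReclaimTime
	}
	return cfg
}
```

One consequence of the `> 0` check is that zero and negative durations are both treated as "not set", so a user cannot use these options to force a zero value over a non-zero library default.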