Replica stuck in relocation, high CPU usage

We experienced a master re-election, seemingly due to a timeout (or underlying throttling of the ec2 discovery mechanism in AWS; we're working through that with Amazon). Regardless, the previous master got into a bad state similar to https://github.com/elasticsearch/elasticsearch/issues/3017 (excised from the cluster, still reporting green health, needed a restart to rejoin).

As a result of (or related to) this, I've had one replica in a relocating state for about 18 hours. I let it run to see if it would complete, but I'm wondering if it will at this point.

{
  "cluster_name" : "es_cayovaprod_0001",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 19,
  "number_of_data_nodes" : 6,
  "active_primary_shards" : 101,
  "active_shards" : 292,
  "relocating_shards" : 1,
  "initializing_shards" : 1,
  "unassigned_shards" : 0
}
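
For reference, the output above is from the cluster health endpoint. Drilling down to shard level looks roughly like this, assuming the level parameter behaves the same on 0.20.6:

curl -s 'http://localhost:9200/_cluster/health?pretty=true'
curl -s 'http://localhost:9200/_cluster/health?level=shards&pretty=true'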

According to bigdesk, the state for the replica/shard is -

ip-10-239-110-94: replica, relocating
ip-10-65-49-148: replica, recovering
ip-10-36-174-153: primary, green
ip-10-36-201-76: replica, green

The index in question is configured to have 2 replicas.

The first two nodes in transition show high CPU relative to the other nodes (https://pbs.twimg.com/media/BKS1BC-CIAE5h__.png:large); according to htop they're sitting at about 33%, which doesn't seem unreasonable in itself (the length of time at that level is the concern).
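
To see what the busy nodes are actually spinning on, I've been pulling hot threads, roughly like this (assuming the hot_threads endpoint behaves the same on 0.20.6):

curl -s 'http://localhost:9200/_nodes/hot_threads'

It can also be scoped to a single node with /_nodes/<node>/hot_threads.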

The log for ip-10-239-110-94 around the time of the start of the CPU burst -

"""
[2013-05-14 15:18:18,219][WARN ][indices.cluster ] [ip-10-239-110-94] [objects_0001][22] master [[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false, master=true, zone=eu-west-1a}] marked shard as started, but shard have not been created, mark shard as failed
[2013-05-14 15:18:18,219][WARN ][cluster.action.shard ] [ip-10-239-110-94] sending failed shard for [objects_0001][22], node[ybnrwqQ8RPCa69L4SqE7bA], [P], s[STARTED], reason [master [ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false, master=true, zone=eu-west-1a} marked shard as started, but shard have not been created, mark shard as failed]
[2013-05-14 15:54:03,981][DEBUG][discovery.zen.fd ] [ip-10-239-110-94] [master] restarting fault detection against master [[ip-10-34-144-149][IsP0kjtRS6KJ-9R3hZehwQ][inet[/10.34.144.149:9300]]{data=false, master=true, zone=eu-west-1c}], reason [new cluster stare received and we monitor the wrong master [[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false, master=true, zone=eu-west-1a}]]
[2013-05-14 16:25:38,339][DEBUG][discovery.zen.fd ] [ip-10-239-110-94] [master] restarting fault detection against master [[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false, master=true, zone=eu-west-1a}], reason [new cluster stare received and we monitor the wrong master [[ip-10-34-144-149][IsP0kjtRS6KJ-9R3hZehwQ][inet[/10.34.144.149:9300]]{data=false, master=true, zone=eu-west-1c}]]
[2013-05-14 16:25:45,943][INFO ][cluster.service ] [ip-10-239-110-94] added {[ip-10-36-139-232][COIAyFiiRumWsc1h_c1uhQ][inet[/10.36.139.232:9300]]{client=true, data=false},}, reason: zen-disco-receive(from master [[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false, master=true, zone=eu-west-1a}])
[2013-05-14 16:25:45,978][INFO ][cluster.service ] [ip-10-239-110-94] added {[ip-10-36-129-154][an52EiHdTaiFBvaxzq-KOQ][inet[/10.36.129.154:9302]]{client=true, data=false},}, reason: zen-disco-receive(from master [[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false, master=true, zone=eu-west-1a}])
[2013-05-14 16:25:45,998][INFO ][cluster.service ] [ip-10-239-110-94] added {[ip-10-36-129-154][VZzB17qkTneJNJLZizFtcA][inet[/10.36.129.154:9300]]{client=true, data=false},}, reason: zen-disco-receive(from master [[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false, master=true, zone=eu-west-1a}])
[2013-05-14 16:26:02,795][INFO ][cluster.service ] [ip-10-239-110-94] added {[ip-10-34-144-149][6EbGNFdOQv6_ny2_jdPQyg][inet[/10.34.144.149:9300]]{data=false, master=true, zone=eu-west-1c},}, reason: zen-disco-receive(from master [[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false, master=true, zone=eu-west-1a}])
[2013-05-14 16:29:09,448][INFO ][cluster.service ] [ip-10-239-110-94] removed {[ip-10-34-144-149][6EbGNFdOQv6_ny2_jdPQyg][inet[/10.34.144.149:9300]]{data=false, master=true, zone=eu-west-1c},}, reason: zen-disco-receive(from master [[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false, master=true, zone=eu-west-1a}])
[2013-05-14 16:29:46,612][INFO ][cluster.service ] [ip-10-239-110-94] added {[ip-10-34-144-149][UM03BjRfS3OmTu9QLcWneQ][inet[/10.34.144.149:9300]]{data=false, master=true, zone=eu-west-1c},}, reason: zen-disco-receive(from master [[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false, master=true, zone=eu-west-1a}])

"""

with no entries since then. The log for ip-10-65-49-148 around the time of the start of the CPU burst -

"""
[2013-05-14 15:18:03,773][DEBUG][discovery.zen.fd ] [ip-10-65-49-148] [master] restarting fault detection against master [[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false, master=true, zone=eu-west-1a}], reason [new cluster stare received and we monitor the wrong master [[ip-10-34-144-149][IsP0kjtRS6KJ-9R3hZehwQ][inet[/10.34.144.149:9300]]{data=false, master=true, zone=eu-west-1c}]]
[2013-05-14 15:18:03,773][INFO ][cluster.service ] [ip-10-65-49-148] master {new [ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false, master=true, zone=eu-west-1a}, previous [ip-10-34-144-149][IsP0kjtRS6KJ-9R3hZehwQ][inet[/10.34.144.149:9300]]{data=false, master=true, zone=eu-west-1c}}, removed {[ip-10-34-144-149][IsP0kjtRS6KJ-9R3hZehwQ][inet[/10.34.144.149:9300]]{data=false, master=true, zone=eu-west-1c},}, reason: zen-disco-receive(from master [[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false, master=true, zone=eu-west-1a}])
[2013-05-14 15:54:03,982][DEBUG][discovery.zen.fd ] [ip-10-65-49-148] [master] restarting fault detection against master [[ip-10-34-144-149][IsP0kjtRS6KJ-9R3hZehwQ][inet[/10.34.144.149:9300]]{data=false, master=true, zone=eu-west-1c}], reason [new cluster stare received and we monitor the wrong master [[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false, master=true, zone=eu-west-1a}]]
[2013-05-14 16:25:38,340][DEBUG][discovery.zen.fd ] [ip-10-65-49-148] [master] restarting fault detection against master [[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false, master=true, zone=eu-west-1a}], reason [new cluster stare received and we monitor the wrong master [[ip-10-34-144-149][IsP0kjtRS6KJ-9R3hZehwQ][inet[/10.34.144.149:9300]]{data=false, master=true, zone=eu-west-1c}]]
[2013-05-14 16:25:45,944][INFO ][cluster.service ] [ip-10-65-49-148] added {[ip-10-36-139-232][COIAyFiiRumWsc1h_c1uhQ][inet[/10.36.139.232:9300]]{client=true, data=false},}, reason: zen-disco-receive(from master [[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false, master=true, zone=eu-west-1a}])
[2013-05-14 16:25:45,979][INFO ][cluster.service ] [ip-10-65-49-148] added {[ip-10-36-129-154][an52EiHdTaiFBvaxzq-KOQ][inet[/10.36.129.154:9302]]{client=true, data=false},}, reason: zen-disco-receive(from master [[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false, master=true, zone=eu-west-1a}])
[2013-05-14 16:25:45,998][INFO ][cluster.service ] [ip-10-65-49-148] added {[ip-10-36-129-154][VZzB17qkTneJNJLZizFtcA][inet[/10.36.129.154:9300]]{client=true, data=false},}, reason: zen-disco-receive(from master [[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false, master=true, zone=eu-west-1a}])
[2013-05-14 16:26:02,797][INFO ][cluster.service ] [ip-10-65-49-148] added {[ip-10-34-144-149][6EbGNFdOQv6_ny2_jdPQyg][inet[/10.34.144.149:9300]]{data=false, master=true, zone=eu-west-1c},}, reason: zen-disco-receive(from master [[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false, master=true, zone=eu-west-1a}])
[2013-05-14 16:29:09,452][INFO ][cluster.service ] [ip-10-65-49-148] removed {[ip-10-34-144-149][6EbGNFdOQv6_ny2_jdPQyg][inet[/10.34.144.149:9300]]{data=false, master=true, zone=eu-west-1c},}, reason: zen-disco-receive(from master [[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false, master=true, zone=eu-west-1a}])
[2013-05-14 16:29:46,613][INFO ][cluster.service ] [ip-10-65-49-148] added {[ip-10-34-144-149][UM03BjRfS3OmTu9QLcWneQ][inet[/10.34.144.149:9300]]{data=false, master=true, zone=eu-west-1c},}, reason: zen-disco-receive(from master [[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false, master=true, zone=eu-west-1a}])

"""

again with no entries since then.

I'm not sure how to interpret the logs, but is it possible these nodes thought they were working against two masters (or the wrong master) for a period of time?

I'm wondering -

  • is there anything I can or should do to get this relocation to complete and have the two nodes quiesce their CPU usage?

  • is it expected for relocation/recovery jobs to take a very long time, or get stuck?

  • is it expected that a cluster can have two masters?

  • should I file a bug? :wink:

I'd appreciate any guidance on this, including pointers into ES code. Details (a rough sketch of the corresponding config is below the list) -

  • 0.20.6

  • 6 data, non-master-eligible nodes, m1.xlarge, Xms/Xmx 8G, Xmn 400M

  • 3 master-eligible, non-data nodes, m1.medium, 2G heaps

  • minimum master nodes: 2

  • ec2 discovery
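
Roughly, the config behind those details looks like the sketch below. This is paraphrased from memory rather than copied off the boxes, so treat the exact keys as my assumption:

# data nodes (m1.xlarge, 8G heaps)
node.master: false
node.data: true

# the master-eligible nodes (m1.medium, 2G heaps) instead have
# node.master: true and node.data: false

discovery.type: ec2
discovery.zen.minimum_master_nodes: 2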

Bill

relocating state for about 18 hours

Which recently quiesced - https://pbs.twimg.com/media/BKTSpBfCQAAkV0V.png:large. The relocating node went green in bigdesk, and the other CPU-heavy node has started showing 2 replicas relocating, but with more relaxed CPU usage. The overall cluster state is green.

Again, I'm wondering if sustained CPU utilisation and very long recovery times are normal/expected behaviour in 0.20.6? (If nothing else, it complicates alarming :)

Bill

Just out of curiosity, did you see a big merge happening just before that? (You might be able to identify quick index growth on disk and then a rapid drop.)

simon

Just out of curiosity did you see a big merge happening just before that?

Unsure about a merge, but I can see a disk write spike a few hours before the CPU levelled out on all data nodes.

What happened since then is interesting -

ip-10-239-110-94: replica, relocating

  • This node became empty according to bigdesk, leaving 5 of 6 carrying all the data.

  • At that point I decided a shutdown/restart was in order for ip-10-239-110-94 (the call I used is sketched after this list).

  • The cluster, it seems, was starting to move data back to ip-10-239-110-94 according to bigdesk (I had executed the shutdown at that point but not the restart, so I went ahead and performed the restart).

  • Post restart, the ip-10-239-110-94 node has just 2 replicas (out of over 300 files).

  • The ip-10-239-70-202 node is still relocating two replicas. Those are targeted for ip-10-239-110-94 (they are different to the ones on the ip-10-239-110-94 node).
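
For clarity, the shutdown above was done with the nodes shutdown API, along these lines (I'm not certain whether 0.20.6 resolves node names in that path or only node ids, so take the exact form as an assumption):

curl -XPOST 'http://localhost:9200/_cluster/nodes/ip-10-239-110-94/_shutdown'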

I'm wondering if the cluster.routing.allocation.* settings have something to do with the limit on how many relocations are allowed, but regardless, it's taking a long time for them to finish. Some of these files are just kB in size (it's a young index).
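
For what it's worth, the kind of tweak I have in mind would go through the cluster update settings API, something like the sketch below. I'm assuming both settings are dynamically updatable on 0.20.6, which I haven't confirmed, so I haven't applied it yet:

curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient" : {
    "cluster.routing.allocation.node_concurrent_recoveries" : 4,
    "cluster.routing.allocation.cluster_concurrent_rebalance" : 4
  }
}'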

Bill

Hi Bill

Is there anything happening in the logs?

I've seen instances on EC2 with bad disks / degraded hardware which result in IO errors preventing replication.
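
Worth a quick look on both boxes, something along these lines (adjust the log path to wherever your ES logs actually live):

dmesg | grep -i error | tail -20
grep -i ioexception /var/log/elasticsearch/*.log | tail -20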

clint

On 15 May 2013 18:47, Bill de hÓra bill@dehora.net wrote:

Just out of curiosity did you see a big merge happening just before that?

Unsure about merge, but I can see a disk write spike a few hours before
the cpu levelled out on all data nodes.

What happened since then is interesting -

ip-10-239-110-94: replica, relocating

  • This node became empty according to bigdesk, leaving 5 of 6 carrying
    all the data.

  • At that point I decided a shutdown/restart was in order for
    ip-10-239-110-94

  • The cluster it seems was starting to move data back to ip-10-239-110-94
    according to bigdesk (I had executed shutdown at that point, but not
    restart, went ahead and performed the restart)

  • Post restart the ip-10-239-110-94 node has just 2 replicas (out of over
    300 files)

  • The ip-10-239-70-202 node is still relocating two replicas. Those are
    targeted for ip-10-239-110-94 (they are different to the ones on the
    ip-10-239-110-94 node)

I'm wondering if cluster.routing.allocation.* settings has something to do
with the limit on relocations that are allowed, but regardless, it's taking
a long time for them to finish. Some of these files are just kB in size
(it's a young index).

Bill

On Wednesday 15 May 2013 at 15:11, simonw wrote:

Just out of curiosity did you see a big merge happening just before
that? (you might be able to identify quick index growth on disk and then
rapid drop)

simon

On Wednesday, May 15, 2013 1:04:32 PM UTC+2, Bill de hÓra wrote:

relocating state for about 18 hours

Which recently quiesced -
https://pbs.twimg.com/media/BKTSpBfCQAAkV0V.png:large. The relocating
node went green in big desk and the other cpu heavy node started shows 2
replicas relocating, but with more relaxed cpu usage. The overall cluster
state is green.

Again I'm wondering if sustained utilisation of the cpu and very long
recovery times are normal/expected behaviour in 0.20.6? (if nothing else,
it complicates alarming :).

Bill

On Wednesday 15 May 2013 at 10:28, Bill de hÓra wrote:

We experienced a master re-election, seemingly due to a timeout (or
underlying throttling of the ec2 discovery mechanism in AWS, we're working
through that with Amazon) . Regardless, the previous master got into a bad
state similar to
Invalid internal transport message format between 0.20.6 master and client caused re-election · Issue #3017 · elastic/elasticsearch · GitHub (excised from
cluster, still reporting green health, needed a restart to join).

As a result or related to this, I'd has one replica in a relocating
state for about 18 hours. I let it run to see if it would complete, but I'm
wondering if it will at this point.

{
"cluster_name" : "es_cayovaprod_0001",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 19,
"number_of_data_nodes" : 6,
"active_primary_shards" : 101,
"active_shards" : 292,
"relocating_shards" : 1,
"initializing_shards" : 1,
"unassigned_shards" : 0
}

According to big desk, the cluster state for the replica/shard is -

ip-10-239-110-94: replica, relocating
ip-10-65-49-148: replica, recovering

ip-10-36-174-153: primary, green

ip-10-36-201-76: replica, green

The index in question is configured to have 2 replicas.

The first two nodes in transition show a high cpu relative to the
other nodes (https://pbs.twimg.com/media/BKS1BC-CIAE5h__.png:large),
according to htop they're sitting at about 33% which doesn't seem
unreasonable in itself (the length of time at that level is the concern) .

The log for ip-10-239-110-94 around the time of the start of the CPU
burst -

"""
[2013-05-14 15:18:18,219][WARN ][indices.cluster ]
[ip-10-239-110-94] [objects_0001][22] master
[[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false,
master=true, zone=eu-west-1a}] marked shard as started, but shard have not
been created, mark shard as failed
[2013-05-14 15:18:18,219][WARN ][cluster.action.shard ]
[ip-10-239-110-94] sending failed shard for [objects_0001][22],
node[ybnrwqQ8RPCa69L4SqE7bA], [P], s[STARTED], reason [master
[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false,
master=true, zone=eu-west-1a} marked shard as started, but shard have not
been created, mark shard as failed]
[2013-05-14 15:54:03,981][DEBUG][discovery.zen.fd ]
[ip-10-239-110-94] [master] restarting fault detection against master
[[ip-10-34-144-149][IsP0kjtRS6KJ-9R3hZehwQ][inet[/10.34.144.149:9300]]{data=false,
master=true, zone=eu-west-1c}], reason [new cluster stare received and we
monitor the wrong master
[[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false,
master=true, zone=eu-west-1a}]]
[2013-05-14 16:25:38,339][DEBUG][discovery.zen.fd ]
[ip-10-239-110-94] [master] restarting fault detection against master
[[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false,
master=true, zone=eu-west-1a}], reason [new cluster stare received and we
monitor the wrong master
[[ip-10-34-144-149][IsP0kjtRS6KJ-9R3hZehwQ][inet[/10.34.144.149:9300]]{data=false,
master=true, zone=eu-west-1c}]]
[2013-05-14 16:25:45,943][INFO ][cluster.service ]
[ip-10-239-110-94] added
{[ip-10-36-139-232][COIAyFiiRumWsc1h_c1uhQ][inet[/10.36.139.232:9300]]{client=true,
data=false},}, reason: zen-disco-receive(from master
[[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false,
master=true, zone=eu-west-1a}])
[2013-05-14 16:25:45,978][INFO ][cluster.service ]
[ip-10-239-110-94] added
{[ip-10-36-129-154][an52EiHdTaiFBvaxzq-KOQ][inet[/10.36.129.154:9302]]{client=true,
data=false},}, reason: zen-disco-receive(from master
[[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false,
master=true, zone=eu-west-1a}])
[2013-05-14 16:25:45,998][INFO ][cluster.service ]
[ip-10-239-110-94] added
{[ip-10-36-129-154][VZzB17qkTneJNJLZizFtcA][inet[/10.36.129.154:9300]]{client=true,
data=false},}, reason: zen-disco-receive(from master
[[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false,
master=true, zone=eu-west-1a}])
[2013-05-14 16:26:02,795][INFO ][cluster.service ]
[ip-10-239-110-94] added
{[ip-10-34-144-149][6EbGNFdOQv6_ny2_jdPQyg][inet[/10.34.144.149:9300]]{data=false,
master=true, zone=eu-west-1c},}, reason: zen-disco-receive(from master
[[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false,
master=true, zone=eu-west-1a}])
[2013-05-14 16:29:09,448][INFO ][cluster.service ]
[ip-10-239-110-94] removed
{[ip-10-34-144-149][6EbGNFdOQv6_ny2_jdPQyg][inet[/10.34.144.149:9300]]{data=false,
master=true, zone=eu-west-1c},}, reason: zen-disco-receive(from master
[[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false,
master=true, zone=eu-west-1a}])
[2013-05-14 16:29:46,612][INFO ][cluster.service ]
[ip-10-239-110-94] added
{[ip-10-34-144-149][UM03BjRfS3OmTu9QLcWneQ][inet[/10.34.144.149:9300]]{data=false,
master=true, zone=eu-west-1c},}, reason: zen-disco-receive(from master
[[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false,
master=true, zone=eu-west-1a}])

"""

with no entries since then. The log for ip-10-65-49-148 around the
time of the start of the CPU burst -

"""
[2013-05-14 15:18:03,773][DEBUG][discovery.zen.fd ]
[ip-10-65-49-148] [master] restarting fault detection against master
[[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false,
master=true, zone=eu-west-1a}], reason [new cluster stare received and we
monitor the wrong master
[[ip-10-34-144-149][IsP0kjtRS6KJ-9R3hZehwQ][inet[/10.34.144.149:9300]]{data=false,
master=true, zone=eu-west-1c}]]
[2013-05-14 15:18:03,773][INFO ][cluster.service ] [ip-10-65-49-148]
master {new
[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false,
master=true, zone=eu-west-1a}, previous
[ip-10-34-144-149][IsP0kjtRS6KJ-9R3hZehwQ][inet[/10.34.144.149:9300]]{data=false,
master=true, zone=eu-west-1c}}, removed
{[ip-10-34-144-149][IsP0kjtRS6KJ-9R3hZehwQ][inet[/10.34.144.149:9300]]{data=false,
master=true, zone=eu-west-1c},}, reason: zen-disco-receive(from master
[[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false,
master=true, zone=eu-west-1a}])
[2013-05-14 15:54:03,982][DEBUG][discovery.zen.fd ]
[ip-10-65-49-148] [master] restarting fault detection against master
[[ip-10-34-144-149][IsP0kjtRS6KJ-9R3hZehwQ][inet[/10.34.144.149:9300]]{data=false,
master=true, zone=eu-west-1c}], reason [new cluster stare received and we
monitor the wrong master
[[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false,
master=true, zone=eu-west-1a}]]
[2013-05-14 16:25:38,340][DEBUG][discovery.zen.fd ]
[ip-10-65-49-148] [master] restarting fault detection against master
[[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false,
master=true, zone=eu-west-1a}], reason [new cluster stare received and we
monitor the wrong master
[[ip-10-34-144-149][IsP0kjtRS6KJ-9R3hZehwQ][inet[/10.34.144.149:9300]]{data=false,
master=true, zone=eu-west-1c}]]
[2013-05-14 16:25:45,944][INFO ][cluster.service ] [ip-10-65-49-148]
added {[ip-10-36-139-232][COIAyFiiRumWsc1h_c1uhQ][inet[/10.36.139.232:9300]]{client=true,
data=false},}, reason: zen-disco-receive(from master
[[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false,
master=true, zone=eu-west-1a}])
[2013-05-14 16:25:45,979][INFO ][cluster.service ] [ip-10-65-49-148]
added {[ip-10-36-129-154][an52EiHdTaiFBvaxzq-KOQ][inet[/10.36.129.154:9302]]{client=true,
data=false},}, reason: zen-disco-receive(from master
[[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false,
master=true, zone=eu-west-1a}])
[2013-05-14 16:25:45,998][INFO ][cluster.service ] [ip-10-65-49-148]
added {[ip-10-36-129-154][VZzB17qkTneJNJLZizFtcA][inet[/10.36.129.154:9300]]{client=true,
data=false},}, reason: zen-disco-receive(from master
[[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false,
master=true, zone=eu-west-1a}])
[2013-05-14 16:26:02,797][INFO ][cluster.service ] [ip-10-65-49-148]
added {[ip-10-34-144-149][6EbGNFdOQv6_ny2_jdPQyg][inet[/10.34.144.149:9300]]{data=false,
master=true, zone=eu-west-1c},}, reason: zen-disco-receive(from master
[[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false,
master=true, zone=eu-west-1a}])
[2013-05-14 16:29:09,452][INFO ][cluster.service ] [ip-10-65-49-148]
removed {[ip-10-34-144-149][6EbGNFdOQv6_ny2_jdPQyg][inet[/10.34.144.149:9300]]{data=false,
master=true, zone=eu-west-1c},}, reason: zen-disco-receive(from master
[[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false,
master=true, zone=eu-west-1a}])
[2013-05-14 16:29:46,613][INFO ][cluster.service ] [ip-10-65-49-148]
added {[ip-10-34-144-149][UM03BjRfS3OmTu9QLcWneQ][inet[/10.34.144.149:9300]]{data=false,
master=true, zone=eu-west-1c},}, reason: zen-disco-receive(from master
[[ip-10-36-171-62][j-Gbp_DDSHqYk3e25MPf4Q][inet[/10.36.171.62:9300]]{data=false,
master=true, zone=eu-west-1a}])

"""

again with no entries since then.

I'm not sure how to interpret the logs, but it is possible these
nodes thought they were working against two masters (or the wrong master)
for a period of time?

I'm wondering -

  • is there anything I can or should do to get this relocation to
    complete and have the two nodes quiesce their cpu usage?

  • is it expected for relocation/recovery jobs to take a very long
    time, or get stuck?

  • is it expected a cluster can have two masters?

  • should I file a bug? :wink:

I'd appreciate any guidance on this, including pointers into ES code. Details (sketched as elasticsearch.yml after this list) -

  • 0.20.6

  • 6 data, non-master-eligible nodes, m1.xlarge, Xms/Xmx 8G, Xmn 400M

  • 3 master-eligible, non-data nodes, m1.medium, 2G heaps

  • minimum master nodes: 2

  • ec2 discovery
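For context, this layout maps onto elasticsearch.yml roughly as follows - a sketch only, using standard 0.20.x setting names and the values from the list above -

"""
# data nodes (m1.xlarge)
node.master: false
node.data: true

# the 3 master-eligible nodes (m1.medium) use the inverse:
#   node.master: true
#   node.data: false

# quorum of the 3 master-eligible nodes
discovery.zen.minimum_master_nodes: 2

# ec2 discovery via the cloud-aws plugin
discovery.type: ec2
"""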

Bill


Hi Clinton,

On the node with 2 pending relocations, the last log line is

[2013-05-15 14:06:45,688][DEBUG][discovery.ec2 ] [ip-10-239-70-202] got a new state from master node, though we are already trying to rejoin the cluster

Otherwise it seems fine: the disk is ok and there is nothing obvious in the logs. It's unclear whether I should restart this node, or do anything at all, given the cluster is green but allocation is uneven. The ip-10-239-110-94 node still has only 2 replicas (compared to 87 on this one).
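For anyone following along, a rough way to tally shards per node without bigdesk - a sketch only, assuming the 0.20.x cluster state response carries a routing_nodes section and supports the filter_* parameters the way neighbouring releases do -

"""
# dump the cluster state (metadata and blocks filtered out to keep it small)
# and count the shard routings assigned to each node
curl -s 'localhost:9200/_cluster/state?filter_metadata=true&filter_blocks=true' |
python -c '
import json, sys
state = json.load(sys.stdin)
names = {nid: n["name"] for nid, n in state["nodes"].items()}
for nid, shards in state["routing_nodes"]["nodes"].items():
    print("%s: %d" % (names.get(nid, nid), len(shards)))
'
"""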

Bill

On Thursday 16 May 2013 at 10:00, Clinton Gormley wrote:

Hi Bill

Is there anything happening in the logs?

I've seen instances on EC2 with bad disks / degraded hardware, which result in IO errors that prevent replication
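A quick way to check the suspect instance for that kind of problem - generic commands, and the log path is only a guess at where the node writes its logs -

"""
# kernel-level disk complaints on the EC2 instance
dmesg | grep -iE 'i/o error|read error|write error'

# IO exceptions surfacing in the node's own logs (path is an assumption)
grep -i 'IOException' /var/log/elasticsearch/*.log | tail -n 20
"""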

clint

On 15 May 2013 18:47, Bill de hÓra <bill@dehora.net (mailto:bill@dehora.net)> wrote:

Just out of curiosity did you see a big merge happening just before that?

Unsure about a merge, but I can see a disk write spike a few hours before the cpu levelled out on all data nodes.

What happened since then is interesting -

ip-10-239-110-94: replica, relocating

  • This node became empty according to bigdesk, leaving 5 of the 6 data nodes carrying all the data.

  • At that point I decided a shutdown/restart was in order for ip-10-239-110-94.

  • The cluster, it seems, was starting to move data back to ip-10-239-110-94 according to bigdesk (I had executed the shutdown at that point but not the restart, so I went ahead and performed the restart).

  • Post-restart, the ip-10-239-110-94 node has just 2 replicas (out of over 300 files).

  • The ip-10-239-70-202 node is still relocating two replicas. Those are targeted for ip-10-239-110-94 (they are different from the ones already on the ip-10-239-110-94 node).

I'm wondering if the cluster.routing.allocation.* settings have something to do with a limit on how many relocations are allowed, but regardless, it's taking a long time for them to finish. Some of these files are just kB in size (it's a young index).
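If it is the built-in throttling, the relevant knobs in this era are the concurrent recovery/rebalance settings; a sketch of inspecting and transiently raising them via the cluster settings API - setting names as documented for 0.20.x, values purely illustrative -

"""
# current dynamic settings (an empty block means the defaults are in effect)
curl -s 'localhost:9200/_cluster/settings?pretty=true'

# transiently raise the per-node and cluster-wide concurrency limits
# (both are believed to default to 2 in this era)
curl -s -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient" : {
    "cluster.routing.allocation.node_concurrent_recoveries" : 4,
    "cluster.routing.allocation.cluster_concurrent_rebalance" : 4
  }
}'
"""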

Bill

On Wednesday 15 May 2013 at 15:11, simonw wrote:

Just out of curiosity did you see a big merge happening just before that? (you might be able to identify quick index growth on disk and then rapid drop)

simon

On Wednesday, May 15, 2013 1:04:32 PM UTC+2, Bill de hÓra wrote:

relocating state for about 18 hours

Which has recently quiesced - https://pbs.twimg.com/media/BKTSpBfCQAAkV0V.png:large. The relocating node went green in big desk, and the other cpu-heavy node now shows 2 replicas relocating, but with more relaxed cpu usage. The overall cluster state is green.

Again, I'm wondering whether sustained cpu utilisation and very long recovery times are normal/expected behaviour in 0.20.6 (if nothing else, it complicates alarming :).

Bill


On 16 May 2013 15:15, Bill de hÓra bill@dehora.net wrote:

Otherwise it seems fine, the disk is ok, nothing obvious in the logs.
Unclear whether I should restart this node, or do anything at all, given
the cluster is green, but allocation is uneven. The ip-10-239-110-94 still
only has 2 replicas (compared to 87 on this one)

How many shards in total on each node (ie primaries and replicas)?

clint


How many shards in total on each node (ie primaries and replicas)?

293 in total; they are organised into zones -

zone a
ip-10-36-174-153: 48
ip-10-36-201-76: 49

zone b
ip-10-251-55-42: 48
ip-10-65-49-148: 52

zone c
ip-10-239-110-94: 2
ip-10-239-70-202: 94

Bill


Hmmm, something definitely looks wrong. The cluster.routing.allocation
settings may be the cause. Could you provide details of those?

Also, I'd consider dropping that instance and creating a new one, just in
case. See if that fixes the issue

clint


The cluster.routing.allocation settings may be the cause. Could you provide details of those?

Here you go, as passed in via -Des.config -

"""
node.zone: eu-west-1a # eu-west-1b, eu-west-1c
cluster.routing.allocation.awareness.attributes: zone

"""

Otherwise, using defaults from 0.20.6. Running 'curl -XGET localhost:9200/_cluster/settings' returns an empty block

{"persistent":{},"transient":{}}

Bill

On Friday 17 May 2013 at 11:49, Clinton Gormley wrote:

Hmmm, something definitely looks wrong. The cluster.routing.allocation settings may be the cause. Could you provide details of those?

Also, I'd consider dropping that instance and creating a new one, just in case. See if that fixes the issue

clint

On 16 May 2013 21:52, Bill de hÓra <bill@dehora.net (mailto:bill@dehora.net)> wrote:

How many shards in total on each node (ie primaries and replicas)

293 in total, they are organised into zones

zone a
ip-10-36-174-153: 48
ip-10-36-201-76: 49

zone b
ip-10-251-55-42: 48

ip-10-65-49-148: 52

zone c
ip-10-239-110-94: 2
ip-10-239-70-202: 94

Bill

On Thursday 16 May 2013 at 14:23, Clinton Gormley wrote:

On 16 May 2013 15:15, Bill de hÓra <bill@dehora.net (mailto:bill@dehora.net) (mailto:bill@dehora.net)> wrote:

Otherwise it seems fine, the disk is ok, nothing obvious in the logs. Unclear whether I should restart this node, or do anything at all, given the cluster is green, but allocation is uneven. The ip-10-239-110-94 still only has 2 replicas (compared to 87 on this one)

How many shards in total on each node (ie primaries and replicas)

clint

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com (mailto:elasticsearch%2Bunsubscribe@googlegroups.com) (mailto:elasticsearch+unsubscribe@googlegroups.com (mailto:elasticsearch%2Bunsubscribe@googlegroups.com)).
For more options, visit https://groups.google.com/groups/opt_out.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com (mailto:elasticsearch%2Bunsubscribe@googlegroups.com).
For more options, visit https://groups.google.com/groups/opt_out.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com (mailto:elasticsearch+unsubscribe@googlegroups.com).
For more options, visit https://groups.google.com/groups/opt_out.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Hmm... If it is trying to relocate two shards, and they are stuck, then something is wrong.

You want to look in the logs of the source and destination nodes and find what the problem is.

Also, as per my prev email, I'd suggest blowing away that instance and creating a new one, to see if that fixes it.

Make sure that the problem node actually has a 'zone' attribute set - if it doesn't then it won't have shards assigned to it.
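A quick way to confirm that - a sketch only, assuming the nodes info endpoint in 0.20.6 reports node attributes the way neighbouring releases do -

"""
# each data node should show "zone" under its attributes; a node without it
# will be invisible to zone-aware allocation
curl -s 'localhost:9200/_cluster/nodes?pretty=true' | grep -A 3 '"attributes"'
"""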

clint
