Elasticsearch replica shard distribution


We recently ran into an issue with users reporting slow queries. The cluster has 5 nodes, the index has 10 shards, and the replication factor is 1 (so 20 shard copies in total). On investigation I found that each node has 4 shards allocated to it, but on one node all 4 shards are replicas, while the other nodes hold only 1 or 2 replica shards each.

As far as I understand, search queries are executed against replica shards, which would explain why this node suffers the most (judging by the slow query log and the kopf cluster health info).

Is there a way to adjust the cluster settings so that primary and replica shards are allocated evenly? Or is there another approach we could use in our case?
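For context, the kind of manual rebalancing I had in mind looks something like this (a sketch only; the index name, shard number, and node names are placeholders for our setup):

```shell
# Sketch: manually move one shard copy off the overloaded node
# using the cluster reroute API (names below are placeholders).
curl -XPOST 'http://localhost:9200/_cluster/reroute' -d '{
  "commands": [
    {
      "move": {
        "index": "index",
        "shard": 3,
        "from_node": "node-2",
        "to_node": "node-3"
      }
    }
  ]
}'
```

Though as far as I can tell, unless the allocation settings are also changed, the balancer may simply move shards around again on the next rebalance.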

Thanks in advance for your help!

I don't think that is true; according to the documentation, replicas and primaries do the same amount of work when serving searches.

The cause of the slow queries must be something else. Is it only one node that shows the performance issue?

Hi Leandro,

You're right, my statement about query execution against replica shards seems to be false. I checked the Elasticsearch source code and found that, if no preference is specified, shard copies appear to be picked in random order (IndexShardRoutingTable#activeInitializingShardsRandomIt()).
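To illustrate what that means for load distribution, here is a minimal Python sketch (not the actual Elasticsearch code; the node/copy labels are made up): for each shard, the coordinating node picks one active copy, primary or replica, effectively at random, so both kinds of copy should serve an equal share of searches on average.

```python
import random

def pick_copies(shard_copies, rng=random):
    """For each shard, pick one active copy at random.

    shard_copies maps shard id -> list of copy labels (primary + replicas).
    This mimics the behaviour when no ?preference is given.
    """
    return {shard: rng.choice(copies) for shard, copies in shard_copies.items()}

# Hypothetical 2-shard index with one replica per shard:
copies = {
    0: ["node1/primary", "node3/replica"],
    1: ["node4/primary", "node2/replica"],
}
chosen = pick_copies(copies)  # one randomly chosen copy per shard
```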

Here's what I have in slow logs for the query in question:

node 1:

[2017-08-01 14:01:06,286][WARN ][index.search.slowlog.query] [Snowfall] [index][0] took[5.2s], took_millis[5242] - primary
[2017-08-01 14:01:07,351][WARN ][index.search.slowlog.query] [Snowfall] [index][2] took[6.2s], took_millis[6293] - primary
[2017-08-01 14:01:07,374][WARN ][index.search.slowlog.query] [Snowfall] [index][1] took[6s], took_millis[6085] - primary

node 2:

[2017-08-01 14:02:23,111][WARN ][index.search.slowlog.query] [Luchino Nefaria] [index][7] took[43.2s], took_millis[43201] - replica
[2017-08-01 14:02:49,445][WARN ][index.search.slowlog.query] [Luchino Nefaria] [index][9] took[1.7m], took_millis[106090] - replica
[2017-08-01 14:02:54,152][WARN ][index.search.slowlog.query] [Luchino Nefaria] [index][3] took[1.8m], took_millis[113128] - replica
[2017-08-01 14:03:27,645][WARN ][index.search.slowlog.query] [Luchino Nefaria] [index][5] took[2.4m], took_millis[144610] - replica

node 3: nothing

node 4:

[2017-08-01 14:01:17,866][WARN ][index.search.slowlog.query] [Xemnu the Titan] [index][4] took[16.7s], took_millis[16747] - primary
[2017-08-01 14:01:21,942][WARN ][index.search.slowlog.query] [Xemnu the Titan] [index][6] took[20.8s], took_millis[20837] - primary

node 5:

[2017-08-01 14:01:05,856][WARN ][index.search.slowlog.query] [Gideon] [index][8] took[4.8s], took_millis[4815] - replica

As you can see, node 2 had the most work to do. Node 1 also had a lot of work, but managed to complete it much faster.
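For reference, this is how I tallied the per-node totals from those log lines (a quick throwaway script; the lines are abbreviated to the `took_millis` part, and the numbers are taken from the slow logs above):

```python
import re

# Abbreviated slow-log lines from above, grouped by node.
logs = {
    "node 1": [
        "... took[5.2s], took_millis[5242] - primary",
        "... took[6.2s], took_millis[6293] - primary",
        "... took[6s], took_millis[6085] - primary",
    ],
    "node 2": [
        "... took[43.2s], took_millis[43201] - replica",
        "... took[1.7m], took_millis[106090] - replica",
        "... took[1.8m], took_millis[113128] - replica",
        "... took[2.4m], took_millis[144610] - replica",
    ],
    "node 4": [
        "... took[16.7s], took_millis[16747] - primary",
        "... took[20.8s], took_millis[20837] - primary",
    ],
    "node 5": [
        "... took[4.8s], took_millis[4815] - replica",
    ],
}

# Pull took_millis out of each line and total per node.
pattern = re.compile(r"took_millis\[(\d+)\]")
totals = {node: sum(int(pattern.search(line).group(1)) for line in lines)
          for node, lines in logs.items()}
```

By this count node 2 spent about 407 s total on the query, versus roughly 17.6 s for node 1 and 37.6 s for node 4.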

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.