All load is being concentrated on one node?

stuayre · September 30, 2018, 9:20pm

I have a problem with Elasticsearch: all the load is concentrated on just one node, and if I add new nodes they just sit there doing nothing.
How can I make Elasticsearch distribute the load across all the nodes in the cluster?

I'm running a 3 node cluster

The cluster is just serving search requests; it's not ingesting or indexing data.
The biggest index is 100 million records.
I'm searching product titles for keywords, nothing complex.

here's the output from:

curl localhost:9200/_cat/nodes?v

ip             heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
xxx.xxx.xxx.48            9          99  21   39.54   37.08    31.97 mdi       -      ES2
xxx.xxx.xxx.77           11          99  40    9.82   11.21    11.24 mdi       *      ES1
xxx.xxx.xxx.223          11          99  16    2.75    4.05     4.04 di        -      ES3

As you can see, most of the load is on ES2 while ES3 is doing almost nothing.
I'm sending all the search requests to ES3 in the hope that it would take over some of the work.
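To put numbers on the imbalance, here is a minimal Python sketch that parses the `_cat/nodes?v` table above and divides each node's 1-minute load by the core count. The helper names `parse_cat_nodes` and `load_per_core` are made up for illustration, and the 24-core figure is the one reported in this post.

```python
def parse_cat_nodes(text):
    """Parse the whitespace-separated table printed by _cat/nodes?v into dicts."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


def load_per_core(rows, cores):
    """Return {node name: load_1m / cores}.

    A value above 1.0 roughly means more runnable (or I/O-blocked) tasks
    than cores on that machine.
    """
    return {r["name"]: float(r["load_1m"]) / cores for r in rows}


# Output copied from the post above.
CAT_NODES = """
ip             heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
xxx.xxx.xxx.48            9          99  21   39.54   37.08    31.97 mdi       -      ES2
xxx.xxx.xxx.77           11          99  40    9.82   11.21    11.24 mdi       *      ES1
xxx.xxx.xxx.223          11          99  16    2.75    4.05     4.04 di        -      ES3
"""

rows = parse_cat_nodes(CAT_NODES)
ratios = load_per_core(rows, cores=24)
for name, ratio in sorted(ratios.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {ratio:.2f} load per core")
```

With these figures ES2 is the only node above one unit of load per core, while ES1 and ES3 are well below it.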

All these servers have: 24 cpus, 65gb ram

running: ES 6.4.1

I basically installed Elasticsearch from scratch,
changed /etc/elasticsearch/jvm.options to -Xms24g -Xmx24g,
and set discovery.zen.minimum_master_nodes: 2.
All nodes are data nodes, and two of them are master-eligible.
I hooked up the nodes to the cluster as normal.

I've tried changing the number of primary shards from 5 to 10 with 1 replica but it doesn't make any difference.
I've tried restarting the servers / Elastic Search multiple times.
The cluster status is green.

I'm lost as to what to try next!

Christian_Dahlqvist · October 1, 2018, 6:10am

Are the shards evenly distributed across the nodes? Are all of the nodes sharing the same specification and configuration?
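One way to answer the first question is to count started shards per node and index from the `_cat/shards` output. The sketch below assumes the text came from `_cat/shards?h=index,shard,prirep,state,node` (no header row). The index name `products` and the sample rows are invented for illustration and are not taken from this cluster.

```python
from collections import Counter


def shards_per_node(cat_shards_text):
    """Count STARTED shards per (node, index).

    Expects lines of: index shard prirep state node
    Lines with fewer than five columns (for example unassigned shards,
    which have no node) are skipped.
    """
    counts = Counter()
    for line in cat_shards_text.strip().splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue
        index, _shard, _prirep, state, node = parts[:5]
        if state == "STARTED":
            counts[(node, index)] += 1
    return counts


# Hypothetical sample: 5 primaries with 1 replica each on three nodes.
SAMPLE = """
products 0 p STARTED ES1
products 0 r STARTED ES2
products 1 p STARTED ES2
products 1 r STARTED ES3
products 2 p STARTED ES3
products 2 r STARTED ES1
products 3 p STARTED ES1
products 3 r STARTED ES2
products 4 p STARTED ES2
products 4 r STARTED ES3
"""

counts = shards_per_node(SAMPLE)
for (node, index), n in sorted(counts.items()):
    print(f"{node} {index}: {n} shards")
```

Grouping by index as well as by node matters here, because a cluster can look balanced in total while the shards of the busiest index sit mostly on one node.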

stuayre · October 1, 2018, 6:19am

Yes all shards are evenly distributed

All nodes are exactly the same, with the same config, OS, memory, CPU, etc.

Christian_Dahlqvist · October 1, 2018, 6:22am

Are you hitting all indices evenly? Are you using some feature that could cause an imbalanced load, e.g. custom routing or parent-child?

stuayre · October 1, 2018, 6:26am

I expect some indices are receiving a lot more search requests than others.

I haven't set up any custom routing or parent child features

Christian_Dahlqvist · October 1, 2018, 6:29am

Are you querying using preference, which would cause the same shards to be queried? Are the shards for most frequently queried indices evenly distributed?

stuayre · October 1, 2018, 6:37am

This is the search query I'm using:

_search?q=title:$keyword&from=$start&size=50
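For reference, a rough Python sketch of building that request with the keyword URL-encoded, while rotating the target node so that no single node receives (and coordinates) every request. The host names in `NODES` and the index name `products` are placeholders, not values from this cluster. Whether rotating the entry node changes the imbalance seen here is untested.

```python
from itertools import cycle
from urllib.parse import urlencode

# Hypothetical node addresses; replace with the real ones.
NODES = ["http://es1:9200", "http://es2:9200", "http://es3:9200"]
_next_node = cycle(NODES)


def search_url(index, keyword, start=0, size=50):
    """Build a URI search on the title field against the next node in rotation."""
    base = next(_next_node)
    query = urlencode({"q": f"title:{keyword}", "from": start, "size": size})
    return f"{base}/{index}/_search?{query}"


urls = [search_url("products", "red shoes") for _ in range(4)]
for url in urls:
    print(url)
```

No `preference` parameter is set, matching the query above.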

this is the shard distribution on the busy index..

Christian_Dahlqvist · October 1, 2018, 6:38am

What is the output of the hot threads API on the busy node?

Although it is not related, I would also recommend making the third node master-eligible as well. You always want a minimum of 3 master eligible nodes in a cluster.
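The reasoning behind the three-node recommendation can be checked with the usual quorum rule for `discovery.zen.minimum_master_nodes` in 6.x, (master-eligible nodes / 2) + 1. A small Python sketch, with function names chosen only for illustration:

```python
def minimum_master_nodes(master_eligible):
    """Quorum of master-eligible nodes: floor(n / 2) + 1."""
    return master_eligible // 2 + 1


def tolerated_master_failures(master_eligible):
    """How many master-eligible nodes can be lost while a quorum remains."""
    return master_eligible - minimum_master_nodes(master_eligible)


for n in (2, 3):
    print(
        f"{n} master-eligible: quorum {minimum_master_nodes(n)}, "
        f"tolerates {tolerated_master_failures(n)} failure(s)"
    )
```

With two master-eligible nodes the quorum is 2, so losing either one leaves the cluster unable to elect a master. With three the quorum is still 2, so one can fail.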

stuayre · October 1, 2018, 6:48am

Ok thanks will do.. btw thank you for your help with this!

I ran this command...

_nodes/ES2/hot_threads

it came back with...

https://pastebin.com/8yCfeNsY

(too much to paste here)

stuayre · October 1, 2018, 6:58am

here's an example of the load..

blue = es2
green = es1
purple = es3

[Image: load graph]

Christian_Dahlqvist · October 1, 2018, 7:04am

I cannot see any reason for this unless one of the nodes is misconfigured, there is a hardware issue, or you are using one of the features I mentioned. What does disk I/O look like on the different nodes? Is there anything in the Elasticsearch logs?

stuayre · October 1, 2018, 7:10am

here's the disk stats...

black = es2
purple = es1
blue = es3

I'll check the logs..

Christian_Dahlqvist · October 1, 2018, 7:22am

Also check if you are having any problems with the disks.

stuayre · October 1, 2018, 8:21am

I can't see anything in the log file

stuayre · October 1, 2018, 8:56am

I found a similar issue here. Could it be the version of Java? I'm using OpenJDK:

openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)

stuayre · October 1, 2018, 9:29pm

I switched to the Oracle Java SDK on all 3 servers and restarted Elasticsearch, but it made no difference...

ip             heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
xxx.xxx.xxx.223          12          99  10    2.91    3.28     2.19 mdi       -      ES3
xxx.xxx.xxx.48           11          99  16   39.08   35.10    20.84 mdi       -      ES2
xxx.xxx.xxx.77           13          99  15    9.01    9.31     5.75 mdi       *      ES1

So I shut down ES2 to see what would happen...

ip             heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
xxx.xxx.xxx.223          12          99  28    9.80    8.25     5.25 mdi       -      ES3
xxx.xxx.xxx.77           13          99  62   39.81   31.41    17.69 mdi       *      ES1

All the load jumped to ES1!

So I shut down ES1...

ip             heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
xxx.xxx.xxx.223          11          99  28   10.14   11.29     9.00 mdi       *      ES3

Now everything is running fine and stable just on a one node cluster (ES3)!

Why does adding another node put most of the load on that extra node?

Any ideas?

siddaram_kj · October 5, 2018, 9:43am

Can you check in which node the shards of the index to which you are indexing the data are allocated?

system · November 2, 2018, 9:43am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
