Data node high CPU


(Itay Bittan) #1

Hi Guys (@Christian_Dahlqvist / @Igor_Motov),

Setup:
Elasticsearch 1.4.7 (legacy)
3 master nodes - 2GB RAM, 1 vCPU.
4 data nodes - 30GB RAM, 8 vCPUs, 500GB EBS SSD, spread across two availability zones.
2 client nodes - 15GB RAM, 8 vCPUs, spread across two availability zones.

We have ~750 indices. Each one contains between 10K and 3M documents.
One shard per index, replicated 3 times, so each data node holds one copy of every shard (roughly 750 indices x 4 copies = ~3,000 shard copies, i.e. ~750 shards per data node).
The cluster should handle ~500 requests per second.

We are having CPU load issues on our data node machines.
Any idea why this happens?
Any idea how to better utilize this setup?

P.S. - we found the following legacy configuration in elasticsearch.yml:

indices.fielddata.cache.size: 10%
indices.memory.index_buffer_size: 15%
indices.cache.filter.size: 10% 

Also, on all instances ES_HEAP_SIZE is set to 8gb.
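
To double-check the effective heap and the settings each node actually loaded, something like this should work (a sketch, assuming the HTTP API is reachable on port 9200):

curl -s 'localhost:9200/_nodes/jvm?pretty'      # shows heap_max per node
curl -s 'localhost:9200/_nodes/settings?pretty' # shows the settings each node loaded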

Thanks,
Itay


(Christian Dahlqvist) #2

You have a lot of shards per node considering the amount of heap you have allocated. Any indication through logs or monitoring that you are suffering from heap pressure and extensive GC? What do disk I/O and iowait look like? What does the hot threads API show?
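
For example, something along these lines should surface heap/GC pressure, the busy threads, and iowait (a sketch, assuming the HTTP API is reachable on port 9200 and sysstat is installed on the data node hosts):

# heap usage and GC counts/times per node
curl -s 'localhost:9200/_nodes/stats/jvm?pretty'
# what the busiest threads are actually doing
curl -s 'localhost:9200/_nodes/hot_threads'
# disk throughput and iowait on each data node host
iostat -x 5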


(Itay Bittan) #3

Thanks,
I'll check and come back with more information.

Just to clarify, these machines are dedicated only to that purpose.
What would you recommend for ES_HEAP_SIZE? 31gb?
And what about the cache? I saw 75% mentioned here.

Another question: this cluster is currently serving traffic. Can I change the configuration while it is serving, by stopping one data node at a time, increasing its heap size and starting it again?
Or should I restart the whole cluster at the same time?
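
For the first option I imagine something like this per node (a sketch, assuming the 1.x cluster settings API; how ES_HEAP_SIZE is set depends on the install):

# 1. disable shard allocation so shards are not shuffled around while the node is down
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient": { "cluster.routing.allocation.enable": "none" }
}'
# 2. stop the node, raise ES_HEAP_SIZE, start it again
# 3. re-enable allocation and wait for the cluster to go green before the next node
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient": { "cluster.routing.allocation.enable": "all" }
}'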


(Itay Bittan) #4

::: [data-srv-rcom-es-node-29bbe67][PSFf4qIRSGq9nGvwb16qEQ][data-srv-rcom-es-node-29bbe67.use.dy.com][inet[/172.30.0.95:9300]]{availability_zone=us-east-1a, aws_availability_zone=us-east-1a, max_local_storage_nodes=1, zone=SERVER, master=false}

60.7% (303.7ms out of 500ms) cpu usage by thread 'elasticsearch[data-srv-rcom-es-node-29bbe67][search][T#7]'
10/10 snapshots sharing following 2 elements
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
java.lang.Thread.run(Thread.java:748)

31.0% (155.1ms out of 500ms) cpu usage by thread 'elasticsearch[data-srv-rcom-es-node-29bbe67][search][T#13]'
10/10 snapshots sharing following 10 elements
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:735)
java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:644)
java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1137)
org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:162)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1075)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1137)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
java.lang.Thread.run(Thread.java:748)

7.0% (35.2ms out of 500ms) cpu usage by thread 'elasticsearch[data-srv-rcom-es-node-29bbe67][search][T#4]'
 10/10 snapshots sharing following 10 elements
   sun.misc.Unsafe.park(Native Method)
   java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
   java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:735)
   java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:644)
   java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1137)
   org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:162)
   java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1075)
   java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1137)
   java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
   java.lang.Thread.run(Thread.java:748)

::: [master-rcom-es-node-9ce2daf][Rvt86XPWTQazt4C0e6kOlg][master-rcom-es-node-9ce2daf.use.dy.com][inet[/172.30.1.123:9300]]{aws_availability_zone=us-east-1b, data=false, max_local_storage_nodes=1, master=true}

0.1% (296.1micros out of 500ms) cpu usage by thread 'elasticsearch[master-rcom-es-node-9ce2daf][[transport_server_worker.default]][T#2]{New I/O worker #5}'
 10/10 snapshots sharing following 15 elements
   sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
   sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
   sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
   sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
   sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
   org.elasticsearch.common.netty.channel.socket.nio.SelectorUtil.select(SelectorUtil.java:68)
   org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.select(AbstractNioSelector.java:415)
   org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:212)
   org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
   org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
   org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
   org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
   java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
   java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
   java.lang.Thread.run(Thread.java:748)

0.0% (109.8micros out of 500ms) cpu usage by thread 'elasticsearch[master-rcom-es-node-9ce2daf][transport_client_timer][T#1]{Hashed wheel timer #1}'
 10/10 snapshots sharing following 5 elements
   java.lang.Thread.sleep(Native Method)
   org.elasticsearch.common.netty.util.HashedWheelTimer$Worker.waitForNextTick(HashedWheelTimer.java:445)
   org.elasticsearch.common.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:364)
   org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
   java.lang.Thread.run(Thread.java:748)

0.0% (89.7micros out of 500ms) cpu usage by thread 'elasticsearch[master-rcom-es-node-9ce2daf][[timer]]'
 10/10 snapshots sharing following 2 elements
   java.lang.Thread.sleep(Native Method)
   org.elasticsearch.threadpool.ThreadPool$EstimatedTimeThread.run(ThreadPool.java:568)

::: [data-wrk-rcom-es-node-36448bf][xyTt6RnTQwWXyxKZedZDmg][data-wrk-rcom-es-node-36448bf.use.dy.com][inet[/172.30.0.27:9300]]{availability_zone=us-east-1a, aws_availability_zone=us-east-1a, max_local_storage_nodes=1, zone=WORKER, master=false}

37.4% (186.8ms out of 500ms) cpu usage by thread 'elasticsearch[data-wrk-rcom-es-node-36448bf][bulk][T#2]'
10/10 snapshots sharing following 10 elements
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:735)
java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:644)
java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1137)
org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:162)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1075)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1137)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
java.lang.Thread.run(Thread.java:748)

32.3% (161.4ms out of 500ms) cpu usage by thread 'elasticsearch[data-wrk-rcom-es-node-36448bf][bulk][T#3]'
10/10 snapshots sharing following 10 elements
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:735)
java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:644)
java.util.concurrent.


(Itay Bittan) #5

Adding a screenshot from BigDesk:


(Itay Bittan) #6


(Christian Dahlqvist) #7

It looks like it is primarily busy with search and indexing load. Has the CPU load changed lately, or is this something that has grown over time? What type of queries are you running? How many shards do these queries typically address? Has the request and/or indexing volume increased over time? Has the total data volume in the cluster changed?


(Itay Bittan) #8

We had a cluster of 6 nodes, with no role separation; each one of them was client & data & master.
We separated into roles after reading some "Elasticsearch in production" posts. We also split our cluster into two zones (I described only the serving zone), so we also have a working zone (where we bake our indices). Nothing was changed from the serving perspective: same load, same queries, only the node role separation.
Other things that changed: the number of shards per index, which was 4 and is now 1, and the replication factor, which was 5 and is now 3.

There is an indexing operation in the serving zone: we reallocate an index once it's done (from the working zone to the serving zone) and flip an alias to the new index.
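
The alias flip itself is a single atomic _aliases call, roughly like this (a sketch; the index and alias names here are made up):

curl -XPOST 'localhost:9200/_aliases' -d '{
  "actions": [
    { "remove": { "index": "products_v1", "alias": "products" } },
    { "add":    { "index": "products_v2", "alias": "products" } }
  ]
}'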


(Christian Dahlqvist) #9

Did you perform indexing into the serving nodes before as well? If not, indexing can use a good amount of CPU, so it is possible that you may need to scale out to accommodate this extra load.


(Itay Bittan) #10

No.
We had two different clusters before and used snapshots to move indices between them. We took your advice and started working with one cluster, separated into two zones: worker & server. Each zone has its own client nodes and its own data nodes; only the master nodes are shared between the zones. Thus, I'm sure that only nodes in the worker zone are indexing.


(Christian Dahlqvist) #11

Is there any difference in query and/or indexing load on the data nodes in the 'serving' zone compared to when there were separate clusters? Has the indexing or query volume increased? Has data volume increased?


(Itay Bittan) #12

Same queries.
Same data volume per data node, except for the shard partitioning (currently 1 shard per index, versus 4 shards per index before).
In the new cluster indices arrive via the reallocation API (from the worker zone); in the old cluster indices arrived via the snapshot API (the entire index each time).
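
If that move is done with shard allocation filtering on the zone attribute (which the zone=SERVER / zone=WORKER node attributes in the hot threads output suggest), it looks roughly like this (a sketch; the index name is made up):

curl -XPUT 'localhost:9200/my_index/_settings' -d '{
  "index.routing.allocation.include.zone": "SERVER"
}'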


(Itay Bittan) #13

Another change is that I'm using the optimize API to merge the index segments (down to a single segment) before moving an index to the serving zone. What are the disadvantages of this approach? Do you think it might be related to the problem? We see a 20% improvement in latency, but I'm not sure if it is because of that, the node role separation, the shard count (1 instead of 4), or something else.
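
For reference, the optimize call is roughly this (the index name is made up):

curl -XPOST 'localhost:9200/my_index/_optimize?max_num_segments=1'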


(Itay Bittan) #14

@Christian_Dahlqvist by the way, we don't have any aggregation queries at all.


(Christian Dahlqvist) #15

How much has CPU usage increased?


(Itay Bittan) #16

It reached 100% usage.


(Christian Dahlqvist) #17

How high was it before the changes?


(Itay Bittan) #18

Not sure, but it was stable.


(Christian Dahlqvist) #19

I unfortunately have limited knowledge about the exact state before and after the change, and it is hard for me to tell what could have changed and what potential impact it has had.


(system) #20

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.