Unable to fully utilize resources in Elasticsearch cluster

Srinath_C · July 25, 2015, 2:09am

Hi,

I'm running tests to size a cluster and find the best tuning parameters for my requirements: index documents at a rate of 10k per second (each document being ~2k in size) across 10 indices, each with 3 shards and 1 replica using async replication.

There is a 4-node cluster of c3.xlarge instances running ES 1.6.0, with data stored on 2 instance-store SSD drives. I'm generating bulk indexing requests from a Storm cluster of 3 nodes using NodeClient. So far I have been able to reach up to 2.5k per second, but I seem to have hit a point where I can't figure out the bottleneck.
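For reference, here is a rough back-of-the-envelope sketch of what those numbers imply. It assumes "~2k" means 2 KiB per document and that shard copies are spread evenly over the 4 nodes; the variable names are only illustrative, and it ignores indexing overhead such as merges and the translog.

```python
# Back-of-the-envelope sizing from the figures in the post.
# Assumptions (not from the post): "~2k" = 2 KiB, even shard distribution.
docs_per_sec_target = 10_000
docs_per_sec_achieved = 2_500
doc_size_bytes = 2 * 1024
indices = 10
shards_per_index = 3
replicas = 1
nodes = 4

# Raw JSON arriving at primaries, and the total once replicas are written too.
primary_bytes_per_sec = docs_per_sec_target * doc_size_bytes
total_bytes_per_sec = primary_bytes_per_sec * (1 + replicas)

# Number of shard copies (primaries + replicas) and how many land on each node.
shard_copies = indices * shards_per_index * (1 + replicas)
shard_copies_per_node = shard_copies / nodes

achieved_fraction = docs_per_sec_achieved / docs_per_sec_target

print(f"primary ingest: {primary_bytes_per_sec / 2**20:.1f} MiB/s")
print(f"with replicas:  {total_bytes_per_sec / 2**20:.1f} MiB/s")
print(f"shard copies:   {shard_copies} total, {shard_copies_per_node:.0f} per node")
print(f"achieved:       {achieved_fraction:.0%} of target rate")
```

So the target is on the order of 20 MiB/s of raw JSON into the primaries, with 15 actively written shard copies on each node.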

At a rate of 2.5k per second, CPU utilization on the nodes is only around 25%. But any increase in the indexing rate results in EsRejectedExecutionException (TransportShardReplicationOperationAction$PrimaryPhase).

The logs seem to suggest a lot of GC activity, but I can't seem to get any more out of this cluster at this point. Any help will be appreciated.

Srinath_C · July 25, 2015, 2:31am

Here is some more info to troubleshoot:
iostat of all nodes
hot threads - node 1
hot threads - node 2
hot threads - node 3
hot threads - node 4

Let me know if anything else would help in giving pointers.

Christian_Dahlqvist · July 25, 2015, 7:03am

Can you please share some additional information about your cluster configuration, e.g. heap size and any non-default settings? What bulk size are you using? What type of data are you indexing? In addition to the indexing, how much querying is going on at the same time?

Srinath_C · July 25, 2015, 10:35am

Hi Christian,

Thanks for looking into this. I have uploaded the elasticsearch.yml.

Here are the JVM args used:
-server -Djava.net.preferIPv4Stack=true
-Xms4479m -Xmx4479m -Xss256k -XX:NewRatio=1
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
-XX:+HeapDumpOnOutOfMemoryError
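
To make the heap settings easier to reason about, here is a small sketch of how those flags divide the heap. It relies on the standard HotSpot meaning of the flags: NewRatio is the old-to-young size ratio, and CMSInitiatingOccupancyFraction is a percentage of the old generation. The helper name `heap_layout` is hypothetical, and the real JVM rounds generation sizes, so treat the figures as approximate.

```python
def heap_layout(heap_mb, new_ratio, cms_fraction_pct):
    """Approximate generation sizes (in MB) for a fixed-size heap.

    new_ratio is old:young, so young gets 1 / (new_ratio + 1) of the heap.
    The CMS trigger is the old-generation occupancy at which a concurrent
    collection starts.
    """
    young = heap_mb / (new_ratio + 1)
    old = heap_mb - young
    cms_trigger = old * cms_fraction_pct / 100
    return young, old, cms_trigger


# Values taken from the JVM args above: -Xmx4479m, NewRatio=1, fraction 75.
young_mb, old_mb, cms_trigger_mb = heap_layout(4479, 1, 75)

print(f"young generation: ~{young_mb:.0f} MB")
print(f"old generation:   ~{old_mb:.0f} MB")
print(f"CMS starts at:    ~{cms_trigger_mb:.0f} MB of old generation used")
```

With NewRatio=1, half of the heap is young generation, which leaves roughly 2.2 GB of old generation and a CMS cycle starting at about 1.7 GB of old-generation occupancy.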

There are three NodeClients using BulkRequestBuilder, configured to send a bulk index request every 10000 documents or every 10 seconds, whichever comes first. Data is pushed as JSON (most documents have inner documents too).
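
To illustrate the flush policy, here is a minimal Python sketch of "flush at N documents or after T seconds, whichever comes first". It is not the actual Storm bolt or the Elasticsearch Java API: `BulkBuffer`, `send`, and `tick` are hypothetical names, and the age check is assumed to be driven by a periodic call rather than a background thread.

```python
import time


class BulkBuffer:
    """Collects documents and flushes on a count or age threshold."""

    def __init__(self, send, max_actions=10_000, max_age_seconds=10.0,
                 clock=time.monotonic):
        self._send = send                # callable that receives a list of docs
        self._max_actions = max_actions
        self._max_age = max_age_seconds
        self._clock = clock              # injectable so the policy can be tested
        self._docs = []
        self._first_added_at = None

    def add(self, doc):
        if not self._docs:
            # The age of a batch is measured from its first document.
            self._first_added_at = self._clock()
        self._docs.append(doc)
        self._flush_if_due()

    def tick(self):
        """Call periodically so a partially filled batch still goes out on time."""
        self._flush_if_due()

    def _flush_if_due(self):
        if not self._docs:
            return
        full = len(self._docs) >= self._max_actions
        stale = self._clock() - self._first_added_at >= self._max_age
        if full or stale:
            batch, self._docs = self._docs, []
            self._first_added_at = None
            self._send(batch)


# Small demonstration with a fake clock and a threshold of 3 documents.
sent = []
now = [0.0]
buf = BulkBuffer(sent.append, max_actions=3, max_age_seconds=10.0,
                 clock=lambda: now[0])
for i in range(3):
    buf.add({"id": i})       # third add hits the count threshold
buf.add({"id": 3})           # starts a new, partially filled batch
now[0] = 10.0
buf.tick()                   # age threshold flushes the remaining document
print([len(batch) for batch in sent])
```

With ~2k documents, a full batch of 10000 is roughly 20 MB per bulk request.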

The current tests are mostly checking the indexing rate, hence queries are very minimal (<10s per minute). But we do plan to have around 10 queries per second (mostly aggregations over 1 hour or 1 day) later on.
