Elasticsearch 1.5.2 deployment issue

radap · April 12, 2016, 5:25pm

I have an ES 1.5.2 cluster with the following specs:

3 nodes with RAM: 32GB, CPU cores: 8 each
282 total indices
2,564 total shards
799,505,935 total docs
767.84GB total data
ES_HEAP_SIZE=16g

The problem is that when I use Kibana to run queries (very simple ones), a single query works fine, but if I keep querying, Elasticsearch becomes very slow and eventually gets stuck because the JVM heap usage (as reported by Marvel) reaches 87-95%. It also happens when I try to load a Kibana dashboard, and the only fix is to restart the service on all the nodes.

(This also happens on ES 2.2.0 with Kibana 4.)

What is wrong? What am I missing?
Am I supposed to query less?

EDIT:

I should have mentioned that I have a lot of empty indices (0 documents), but their shards are still counted. This is because I set a TTL of 4w on the documents, and the empty indices are then deleted with Curator.

Also, we have not disabled doc_values in either the 1.5.2 or the 2.2.0 cluster.
The accurate specs are as follows (1.5.2):

3 nodes with RAM: 32GB, CPU cores: 8 each
282 total indices = 227 empty + 31 marvel + 1 kibana + 23 data
2,564 total shards = (1135 empty + 31 marvel + 1 kibana + 115 data)* 1 replica
799,505,935 total docs
767.84GB total data
ES_HEAP_SIZE=16g
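
As a quick sanity check on the breakdown above, the totals can be recomputed from the listed counts. This is only a sketch, and it assumes that "* 1 replica" means one replica copy per primary shard, so the primary count is doubled.

```python
# Index and primary-shard counts as listed in the post (1.5.2 cluster).
indices = {"empty": 227, "marvel": 31, "kibana": 1, "data": 23}
primary_shards = {"empty": 1135, "marvel": 31, "kibana": 1, "data": 115}

# Assumption: "* 1 replica" means one replica copy per primary shard.
replicas = 1

total_indices = sum(indices.values())
total_primaries = sum(primary_shards.values())
total_shards = total_primaries * (1 + replicas)

# Share of all shards that belong to indices holding no documents.
empty_share = primary_shards["empty"] / total_primaries

print(total_indices)  # 282
print(total_shards)   # 2564
print(f"{empty_share:.0%} of shards belong to empty indices")
```

Under that assumption the numbers add up to the reported 282 indices and 2,564 shards, and most of the shards belong to the empty indices.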

curl _cat/fielddata?v result:

2.2.0:

 total os.cpu.usage primaries.indexing.index_total total.fielddata.memory_size_in_bytes jvm.mem.heap_used_percent jvm.gc.collectors.young.collection_time_in_millis primaries.docs.count device.imei fs.total.available_in_bytes os.load_average.1m index.raw @timestamp node.ip_port.raw fs.total.disk_io_op node.name jvm.mem.heap_used_in_bytes jvm.gc.collectors.old.collection_time_in_millis total.merges.total_size_in_bytes jvm.gc.collectors.young.collection_count jvm.gc.collectors.old.collection_count total.search.query_total 
 2.1gb        1.2mb                          3.5mb                                3.4mb                     1.1mb                                                0b                3.5mb       2.1gb                       1.9mb              1.8mb     3.6mb      3.6mb            1.7mb               1.9mb     1.7mb                      1.6mb                                           1.5mb                            3.5mb                                    1.5mb                                  1.5mb                    3.2mb 
 1.9gb        1.2mb                          3.4mb                                3.3mb                     1.1mb                                             1.5mb                3.5mb       1.9gb                       1.9mb              1.8mb     3.5mb      3.6mb            1.7mb               1.9mb     1.7mb                      1.5mb                                           1.5mb                            3.4mb                                       0b                                  1.5mb                    3.2mb 
   2gb           0b                             0b                                   0b                        0b                                                0b                   0b         2gb                          0b                 0b        0b         0b               0b                  0b        0b                         0b                                              0b                               0b                                       0b                                     0b                       0b

1.5.2:

  total index_stats.index node.id node_stats.node_id buildNum endTime location.timestamp userActivity.time startTime   time shard.state shard.node indoorOutdoor.time shard.index dataThroughput.downloadSpeed 
176.2mb                0b      0b                 0b     232b 213.5kb            518.8kb           479.7kb    45.5mb 80.1mb       1.4kb       920b            348.7kb       2.5kb                       49.1mb
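
To make the wide output above easier to read, the per-field sizes can be parsed and the largest field picked out. The sketch below is an illustration only: `to_bytes` and `largest_field` are hypothetical helper names, and the sample is a shortened copy of the first 2.2.0 row (a few columns only).

```python
# Multipliers for the size suffixes that appear in _cat output.
UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}


def to_bytes(size):
    """Convert a _cat-style size such as '2.1gb' or '232b' to bytes."""
    # Check two-letter suffixes first so '2.1gb' is not read as '2.1g' + 'b'.
    for suffix in ("kb", "mb", "gb", "tb", "b"):
        if size.endswith(suffix):
            return float(size[: -len(suffix)]) * UNITS[suffix]
    raise ValueError(f"unrecognised size: {size!r}")


def largest_field(header, row):
    """Return (field name, share of total) for the biggest fielddata entry."""
    names = header.split()
    sizes = [to_bytes(value) for value in row.split()]
    total = sizes[names.index("total")]
    fields = {n: s for n, s in zip(names, sizes) if n != "total"}
    name = max(fields, key=fields.get)
    share = fields[name] / total if total else 0.0
    return name, share


# Shortened sample taken from the first 2.2.0 row above.
HEADER = "total os.cpu.usage device.imei @timestamp node.name"
ROW = "2.1gb 1.2mb 2.1gb 3.6mb 1.7mb"

name, share = largest_field(HEADER, ROW)
print(name)  # device.imei
```

Because the sizes in the output are rounded, the computed share is approximate. Read this way, almost all of the fielddata on each 2.2.0 node is held by the device.imei field.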

curl /_nodes/stats result gist

dadoonet · April 12, 2016, 6:12pm

That's really too many shards per node IMO.

Imagine a shard as if it were a database. Would you start around 1000 database instances on a single node?

So either decrease the number of shards, or increase the number of nodes.

My 2 cents
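
For reference, the per-node load implied by the figures in the original post can be worked out as below. This is a rough sketch that assumes shards are spread evenly across the three nodes and that the averages include the empty shards.

```python
# Figures from the original post (1.5.2 cluster).
total_shards = 2564
nodes = 3
total_data_gb = 767.84

# Assumption: shards are balanced evenly across the nodes.
shards_per_node = total_shards / nodes

# Average over all shards, empty ones included.
data_per_shard_mb = total_data_gb * 1024 / total_shards

print(f"{shards_per_node:.0f} shards per node")
print(f"{data_per_shard_mb:.0f} MB of data per shard on average")
```

That works out to roughly 855 shards per node, each holding only about 307 MB on average.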

radap · April 12, 2016, 11:16pm

Please see my EDIT.

warkolm · April 12, 2016, 11:22pm

Get rid of the empty indices! They are a massive waste.
Why bother with TTL if you are curating the indices? It too is a waste of resources.
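
One way to see which indices are empty before removing them is to read the docs.count column of the `_cat/indices?v` output. The following is a minimal sketch that only parses text already fetched from the cluster; `empty_indices` is a hypothetical helper, and the index names in the sample are made up for illustration.

```python
from typing import List


def empty_indices(cat_indices_output: str) -> List[str]:
    """Return the names of indices whose docs.count column is 0."""
    lines = [ln for ln in cat_indices_output.splitlines() if ln.strip()]
    header = lines[0].split()
    name_col = header.index("index")
    docs_col = header.index("docs.count")

    result = []
    for line in lines[1:]:
        cols = line.split()
        # Skip rows that lack the column (e.g. incomplete lines).
        if len(cols) > docs_col and cols[docs_col] == "0":
            result.append(cols[name_col])
    return result


# Hypothetical sample in the layout printed by `_cat/indices?v`.
SAMPLE = """\
health status index           pri rep docs.count docs.deleted store.size pri.store.size
green  open   logs-2016.03.01   5   1          0            0      1.2kb           650b
green  open   logs-2016.04.10   5   1   34760412            0     33.1gb         16.5gb
green  open   logs-2016.03.02   5   1          0            0      1.2kb           650b
"""

print(empty_indices(SAMPLE))  # ['logs-2016.03.01', 'logs-2016.03.02']
```

The resulting list could then be reviewed and passed to whatever deletion step is already in use, such as Curator.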
