I am currently running Elasticsearch in cluster mode with 3 nodes and approximately 15,000 indices. Each node has 64 GB of memory and a 16-core CPU.
My logs are flooded with the message below whenever I try to perform any operation on an index through the Java client.
Exception: Failed to execute query: ProcessClusterEventTimeoutException[failed to process cluster event (put-mapping [IndexName]) within 30s]
at org.elasticsearch.cluster.service.InternalClusterService$2$1.run(InternalClusterService.java:349) [elasticsearch-2.4.1.jar:2.4.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_72]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_72]
at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_72]
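For reference, the 30s in that message is the master-node timeout for the cluster-state update. A minimal sketch of the equivalent put-mapping call over REST, with the timeout raised via the standard master_timeout parameter (index, type, and field names here are placeholders):

# Placeholder names; master_timeout raises the default 30s cluster-state wait
curl -XPUT 'http://localhost:9200/my_index/_mapping/my_type?master_timeout=60s' -d '
{
  "properties": {
    "message": { "type": "string" }
  }
}'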
Hi @mujtabahussain
All three nodes are master-eligible and data nodes; there is no dedicated master. I am using the Java client to create indices on node1.
Some other log entries are:
[2017-10-10 17:31:59,733][DEBUG][action.admin.indices.create] [vm14005] [indexname] failed to create
ProcessClusterEventTimeoutException[failed to process cluster event (create-index [indexname], cause [api]) within 30s]
at org.elasticsearch.cluster.service.InternalClusterService$2$1.run(InternalClusterService.java:349)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[2017-10-10 17:31:52,084][DEBUG][index.shard ] [vm14002] [index1][0] updating index_buffer_size from [13mb] to [13mb]; IndexWriter now using [0] bytes
[2017-10-10 17:31:52,084][DEBUG][index.translog ] [vm14002] [index2][0] interval [5s], flush_threshold_ops [2147483647], flush_threshold_size [512mb], flush_threshold_period [30m]
[2017-10-10 17:31:52,084][DEBUG][index.shard ] [vm14002] [index3][0] updating index_buffer_size from [13mb] to [13mb]; IndexWriter now using [0] bytes
[2017-10-10 17:31:52,084][DEBUG][index.shard ] [vm14002] [index4][0] updating index_buffer_size from [13mb] to [13mb]; IndexWriter now using [0] bytes
[2017-10-10 17:31:52,084][DEBUG][index.shard ] [vm14002] [index5][0] updating index_buffer_size from [13mb] to [13mb]; IndexWriter now using [0] bytes
[2017-10-10 17:31:52,084][DEBUG][index.shard ] [vm14002] [index6][0] updating index_buffer_size from [13mb] to [13mb]; IndexWriter now using [0] bytes
[2017-10-10 17:31:52,084][DEBUG][index.shard ] [vm14002] [index7][0] updating index_buffer_size from [13mb] to [13mb]; IndexWriter now using [0] bytes
[2017-10-10 17:31:52,084][DEBUG][index.shard ] [vm14002] [index8][0] updating index_buffer_size from [13mb] to [13mb]; IndexWriter now using [0] bytes
[2017-10-10 17:31:52,084][DEBUG][index.shard ] [vm14002] [index9][0] updating index_buffer_size from [13mb] to [13mb]; IndexWriter now using [0] bytes
[2017-10-10 17:31:52,084][DEBUG][index.shard ] [vm14002] [index10][0] updating index_buffer_size from [13mb] to [13mb]; IndexWriter now using [0] bytes
[2017-10-10 17:31:52,084][DEBUG][index.shard ] [vm14002] [index11][0] updating index_buffer_size from [13mb] to [13mb]; IndexWriter now using [0] bytes
[2017-10-10 17:31:52,084][DEBUG][index.shard ] [vm14002] [index12][0] updating index_buffer_size from [13mb] to [13mb]; IndexWriter now using [0] bytes
Can you quickly try creating that index and its associated mappings with the REST API and see if you still get the same error? That should isolate whether the issue is in the Java client or in the setup of the cluster.
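For example, something along these lines (2.x mapping syntax; all names are placeholders):

# Create the index and a mapping in one call, bypassing the Java client
curl -XPUT 'http://localhost:9200/my_index' -d '
{
  "settings": { "number_of_shards": 1, "number_of_replicas": 1 },
  "mappings": {
    "my_type": {
      "properties": {
        "timestamp": { "type": "date" },
        "message":   { "type": "string" }
      }
    }
  }
}'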
Yes, those are time-based indices. I have around 75 different categories, and each customer has around 20 indices. These indices are rolled over when they hit 20 GB.
That is the limit I maintain for each index. I periodically check the size of the index, and if it exceeds that limit I create another index and move the write alias to the new one.
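Roughly, that check-and-swap looks like this over REST (_stats and _aliases are standard APIs; index and alias names here are hypothetical):

# 1. Check the on-disk size of the current write index
curl 'http://localhost:9200/customer1-000001/_stats/store?pretty'

# 2. If it is over 20GB, create the next index...
curl -XPUT 'http://localhost:9200/customer1-000002'

# 3. ...and move the write alias to it atomically
curl -XPOST 'http://localhost:9200/_aliases' -d '
{
  "actions": [
    { "remove": { "index": "customer1-000001", "alias": "customer1-write" } },
    { "add":    { "index": "customer1-000002", "alias": "customer1-write" } }
  ]
}'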
Not sure I follow you here. You only have one primary shard, so how does that distribute across the index?
What do you mean by the last part of that? If you are using time-based indices, why wouldn't the data be available if the index exists?
That's good to hear!
But in fixing one problem you have created another through oversharding. As things stand, you are going to waste a lot of resources just managing shards, and that is likely what is causing this timeout.
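As a quick sanity check, you can count how many shard copies the master has to track with the standard _cat API:

# Each output line is one shard copy (primary or replica)
curl -s 'http://localhost:9200/_cat/shards' | wc -l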
If you double your shard size, you halve the number of shards needed. That's a great start, and it would be interesting to see what that does for your performance.
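Since you are on 2.4.1, one way to consolidate (a sketch, assuming the _reindex API that shipped in 2.3; index names are placeholders) is to reindex several small indices into one larger one:

# Repeat per small index, or pass a list of source indices
curl -XPOST 'http://localhost:9200/_reindex' -d '
{
  "source": { "index": "customer1-000001" },
  "dest":   { "index": "customer1-merged" }
}'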
@warkolm Maybe I am going in the wrong direction, as you suggested. Thank you for your suggestion. One last quick question: how much data can I store in a single index in terms of size? Will it impact queries and aggregations?