Hi all!
I have inherited a simple Elasticsearch + Kibana setup, and I have no prior experience with either.
I am getting the error in the title. In my Elasticsearch log I see lots of entries like this:
[2019-09-02T01:46:35,262][DEBUG][o.e.a.a.i.m.p.TransportPutMappingAction] [0OS_DK6] failed to put mappings on indices [[[production-2019.08.18/CoiTYb_tSGSjit3_coDbzw]]], type [syslog]
org.elasticsearch.cluster.metadata.ProcessClusterEventTimeoutException: failed to process cluster event (put-mapping) within 30s
at org.elasticsearch.cluster.service.ClusterService$ClusterServiceTaskBatcher.lambda$null$0(ClusterService.java:255) ~[elasticsearch-5.6.14.jar:5.6.14]
at java.util.ArrayList.forEach(ArrayList.java:1257) ~[?:1.8.0_222]
at org.elasticsearch.cluster.service.ClusterService$ClusterServiceTaskBatcher.lambda$onTimeout$1(ClusterService.java:254) ~[elasticsearch-5.6.14.jar:5.6.14]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:576) [elasticsearch-5.6.14.jar:5.6.14]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_222]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_222]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_222]
[2019-09-02T01:46:35,263][DEBUG][o.e.a.a.i.m.p.TransportPutMappingAction] [0OS_DK6] failed to put mappings on indices [[[production-2019.08.18/CoiTYb_tSGSjit3_coDbzw]]], type [syslog]
org.elasticsearch.cluster.metadata.ProcessClusterEventTimeoutException: failed to process cluster event (put-mapping) within 30s
at org.elasticsearch.cluster.service.ClusterService$ClusterServiceTaskBatcher.lambda$null$0(ClusterService.java:255) ~[elasticsearch-5.6.14.jar:5.6.14]
at java.util.ArrayList.forEach(ArrayList.java:1257) ~[?:1.8.0_222]
at org.elasticsearch.cluster.service.ClusterService$ClusterServiceTaskBatcher.lambda$onTimeout$1(ClusterService.java:254) ~[elasticsearch-5.6.14.jar:5.6.14]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:576) [elasticsearch-5.6.14.jar:5.6.14]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_222]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_222]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_222]
[2019-09-02T01:46:35,270][DEBUG][o.e.a.b.TransportShardBulkAction] [0OS_DK6] [production-2019.08.18][3] failed to execute bulk item (index) BulkShardRequest [[production-2019.08.18][3]] containing [3] requests
org.elasticsearch.cluster.metadata.ProcessClusterEventTimeoutException: failed to process cluster event (put-mapping) within 30s
at org.elasticsearch.cluster.service.ClusterService$ClusterServiceTaskBatcher.lambda$null$0(ClusterService.java:255) ~[elasticsearch-5.6.14.jar:5.6.14]
at java.util.ArrayList.forEach(ArrayList.java:1257) ~[?:1.8.0_222]
at org.elasticsearch.cluster.service.ClusterService$ClusterServiceTaskBatcher.lambda$onTimeout$1(ClusterService.java:254) ~[elasticsearch-5.6.14.jar:5.6.14]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:576) ~[elasticsearch-5.6.14.jar:5.6.14]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_222]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_222]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_222]
If I run:
curl http://localhost:9200/_cluster/health?pretty
I get:
{
"cluster_name" : "elasticsearch",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 8472,
"active_shards" : 8472,
"relocating_shards" : 0,
"initializing_shards" : 4,
"unassigned_shards" : 8896,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 7,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 3406893,
"active_shards_percent_as_number" : 48.76813262721621
}
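The 7 pending tasks and the task_max_waiting_in_queue_millis of 3406893 (about 57 minutes, if I'm reading that right) seem related to the 30s put-mapping timeouts, so the next thing I was planning to look at is the pending tasks queue, with something like:
curl -s http://localhost:9200/_cat/pending_tasks?v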
Running:
curl -s http://localhost:9200/_settings
Gives me a lot of output, so here's a small part:
{"testing-2018.07.18":{"settings":{"index":{"creation_date":"1531872013861","number_of_shards":"5","number_of_replicas":"1","uuid":"CDlfimO2RlWHDFe26LkFMw","version":{"created":"5060999","upgraded":"5061499"},"provided_name":"testing-2018.07.18"}}},"testing-2019.07.07":{"settings":{"index":{"creation_date":"1562457610057","number_of_shards":"5","number_of_replicas":"1","uuid":"VZpPPsSfRh-mYXAtFBqDWw","version":{"created":"5061499"},"provided_name":"testing-2019.07.07"}}},"testing-2019.06.09":{"settings":{"index":{"creation_date":"1560038403742","number_of_shards":"5","number_of_replicas":"1","uuid":"kSCfg7r5QrOgCzdvth06OQ","version":{"created":"5061499"},"provided_name":"testing-2019.06.09"}}},"staging-2018.09.01":{"settings":{"index":{"creation_date":"1535760006543","number_of_shards":"5","number_of_replicas":"1","uuid":"d4H1HRopTqyWgVwTTPN4pA","version":{"created":"5060999","upgraded":"5061499"},"provided_name":"staging-2018.09.01"}}},"infra-2018.10.29":{"settings":{"index":{"creation_date":"1540771208498","number_of_shards":"5","number_of_replicas":"1","uuid":"vLMqkuuwR_uAgxZeABAQTQ","version":{"created":"5060999","upgraded":"5061499"},"provided_name":"infra-2018.10.29"}}},"infra-2019.07.20":{"settings":{"index":
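One thing I notice in that output is that every index seems to have number_of_replicas set to 1, but there is only one node, and I gather replica shards can never be allocated on the same node as their primary. That would account for most of the 8896 unassigned shards (though the red status presumably means some primaries are unassigned too). If dropping the replicas turns out to be sensible, I think the call would be something like this (untested):
curl -XPUT http://localhost:9200/_all/_settings -H 'Content-Type: application/json' -d '{"index":{"number_of_replicas":0}}'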
And I tried:
curl -s http://localhost:9200/_cat/shards?v
But that ran for a long time and never printed any output.
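If it helps, the next things I was going to try are a per-index shard count:
curl -s 'http://localhost:9200/_cat/indices?v&h=index,pri,rep,health&s=index'
and the allocation explain API to see why shards are unassigned:
curl -s http://localhost:9200/_cluster/allocation/explain?pretty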
After a bunch of web searching, I get the feeling that this many shards for 1 node is far too many (to put it lightly?). Adding up the health output, that's 8472 active + 4 initializing + 8896 unassigned, roughly 17,000 shards in total. So my current thinking is to follow the Shrink Index API instructions to reduce the number of shards per index (roughly the calls sketched below). But I don't want to rush into a solution given this is all new to me, and I am also unsure how many shards to shrink each index down to.
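For reference, here is roughly what I understand the shrink steps to be for one of these indices, going from 5 primary shards down to 1. I have not run this, so the exact calls are my best guess from the Shrink Index docs, and the target index name is just made up. First make the source index read-only:
curl -XPUT http://localhost:9200/production-2019.08.18/_settings -H 'Content-Type: application/json' -d '{"settings":{"index.blocks.write":true}}'
Then shrink it into a new single-shard index:
curl -XPOST http://localhost:9200/production-2019.08.18/_shrink/production-2019.08.18-shrunk -H 'Content-Type: application/json' -d '{"settings":{"index.number_of_shards":1,"index.number_of_replicas":0}}'
(Since there is only one node, I am assuming the requirement that a copy of every shard sits on the same node is already met.)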
Am I on the right track? Would that fix it?
Info (as best as I can describe it):
We have Elasticsearch and Kibana running as a single node on one server, a t3.large AWS EC2 instance. Judging by the jar names in the stack traces, this is Elasticsearch 5.6.14. As far as I know, there are 3 log sources (3 servers sending logs to it).
Let me know if I can provide any more info.