Which the better approach? Configure logstash to send outputs to coordinating node or change data node to be data/master node and send logstash outputs to data/master nodes?
The coordinating node will be receive Kibana requests.
Another option would be to keep the dedicated master nodes and configure Logstash to send data directly to the data/ingest nodes. If dedicated master nodes are not required and you change the layout, you could send data directly to the master/data/ingest nodes.
I have a duplication data always that new index are created. My index are generated daily.
I need to avoid duplication data on index creation. I think that my currently architecture do not support the the currently load of data and the bulk API receive several timeout. Below the error identified.
[2018-06-17T20:00:47,039][DEBUG][o.e.a.a.i.m.p.TransportPutMappingAction] [cdv1prgrafapv03-node-1] failed to put mappings on indices [[[emm_001-2018.06.18/y7YonDWiTo2AeED-mYxKFA]]], type [doc]
org.elasticsearch.cluster.metadata.ProcessClusterEventTimeoutException: failed to process cluster event (put-mapping) within 30s
at org.elasticsearch.cluster.service.MasterService$Batcher.lambda$null$0(MasterService.java:122) ~[elasticsearch-6.1.0.jar:6.1.0]
at java.util.ArrayList.forEach(ArrayList.java:1249) ~[?:1.8.0_65]
at org.elasticsearch.cluster.service.MasterService$Batcher.lambda$onTimeout$1(MasterService.java:121) ~[elasticsearch-6.1.0.jar:6.1.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:568) [elasticsearch-6.1.0.jar:6.1.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_65]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_65]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_65]
That is far too many indices and shards for a cluster that size. Please read this blog post on shards and sharding for some practical guidelines and then change you you shard your data so you reduce this number dramatically.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.