Logstash target on Elasticsearch Architecture

negrote · October 4, 2018, 1:19am

Hi,

I have the following Elasticsearch architecture:

3 dedicated master eligible nodes (8 GB RAM + 2 CPUs)
3 data/ingest nodes (64 GB RAM + 8 CPUs)
1 coordinating node (24 GB RAM + 4 CPUs)

Which is the better approach: configuring Logstash to send its output to the coordinating node, or converting the data nodes into data/master nodes and sending the Logstash output to those?

The coordinating node will receive the Kibana requests.

Christian_Dahlqvist · October 4, 2018, 7:06am

Another option would be to keep the dedicated master nodes and configure Logstash to send data directly to the data/ingest nodes. If dedicated master nodes are not required and you change the layout, you could send data directly to the master/data/ingest nodes.

negrote · October 4, 2018, 2:51pm

Thanks Christian

I think I can maintain the following architecture:

3 dedicated master nodes (cluster management)
3 data/ingest nodes (Logstash sends directly to them)
1 coordinating-only node (client consumers)

With this architecture I think I can avoid a current problem with duplicated data when new indices are created.

Do you have any tips for avoiding duplicated data on index creation?

Thanks

Christian_Dahlqvist · October 4, 2018, 3:00pm

I am not sure I understand what you are referring to. Can you please clarify?

negrote · October 4, 2018, 4:08pm

Sure!!

I get duplicated data whenever a new index is created. My indices are generated daily.
I need to avoid duplicated data on index creation. I think my current architecture cannot handle the current data load, and the bulk API receives several timeouts. The error I identified is below.

[2018-06-17T20:00:47,039][DEBUG][o.e.a.a.i.m.p.TransportPutMappingAction] [cdv1prgrafapv03-node-1] failed to put mappings on indices [[[emm_001-2018.06.18/y7YonDWiTo2AeED-mYxKFA]]], type [doc]
org.elasticsearch.cluster.metadata.ProcessClusterEventTimeoutException: failed to process cluster event (put-mapping) within 30s
at org.elasticsearch.cluster.service.MasterService$Batcher.lambda$null$0(MasterService.java:122) ~[elasticsearch-6.1.0.jar:6.1.0]
at java.util.ArrayList.forEach(ArrayList.java:1249) ~[?:1.8.0_65]
at org.elasticsearch.cluster.service.MasterService$Batcher.lambda$onTimeout$1(MasterService.java:121) ~[elasticsearch-6.1.0.jar:6.1.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:568) [elasticsearch-6.1.0.jar:6.1.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_65]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_65]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_65]
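
One common way to make retried bulk requests harmless is to give every event a deterministic document ID, so a resend overwrites the earlier copy instead of creating a second one. The sketch below is only an illustration in Python, not Logstash configuration: `event_id` and `bulk_index` are hypothetical helpers, and the `index` dict is a toy stand-in for an Elasticsearch index. It also assumes the duplicates come from bulk requests being retried after the timeout, which the thread does not confirm.

```python
import hashlib
import json


def event_id(event: dict) -> str:
    """Derive a stable ID from the event content (hypothetical helper)."""
    # Sorting keys makes the ID independent of field order.
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def bulk_index(index: dict, events: list, use_stable_ids: bool) -> None:
    """Toy stand-in for a bulk request; `index` maps document ID to event."""
    for event in events:
        if use_stable_ids:
            doc_id = event_id(event)
        else:
            # Mimics auto-generated IDs: every write gets a fresh ID.
            doc_id = f"auto-{len(index)}"
        index[doc_id] = event


events = [{"host": "a", "msg": "x"}, {"host": "b", "msg": "y"}]

auto_index = {}
bulk_index(auto_index, events, use_stable_ids=False)
bulk_index(auto_index, events, use_stable_ids=False)  # retry after a timeout

stable_index = {}
bulk_index(stable_index, events, use_stable_ids=True)
bulk_index(stable_index, events, use_stable_ids=True)  # retry after a timeout

print(len(auto_index), len(stable_index))
```

In Logstash the same idea is usually expressed with the fingerprint filter feeding the `document_id` option of the elasticsearch output. Note that two genuinely distinct events with identical content would collapse into one document under this scheme.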

Christian_Dahlqvist · October 4, 2018, 4:44pm

It seems like cluster updates are taking a really long time, which is causing problems. How many indices and shards do you have in your cluster?

negrote · October 4, 2018, 6:30pm

I have 24,572 shards and 2,458 indices.

Christian_Dahlqvist · October 4, 2018, 6:36pm

That is far too many indices and shards for a cluster that size. Please read this blog post on shards and sharding for some practical guidelines and then change how you shard your data so that you reduce this number dramatically.
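
To put the numbers from this thread in perspective, here is a rough back-of-the-envelope calculation. The shard, index, and data node counts come from the posts above. The heap size and the budget of about 20 shards per GB of heap are assumptions: the thread never states the heap, and the budget is a rule of thumb often quoted for Elasticsearch of that era, not a hard limit.

```python
def shards_per_node(total_shards: int, data_nodes: int) -> float:
    """Average number of shards each data node has to hold."""
    return total_shards / data_nodes


def shard_budget(heap_gb: float, shards_per_gb_heap: int = 20) -> int:
    """Rule-of-thumb shard budget for one node (assumed guideline)."""
    return int(heap_gb * shards_per_gb_heap)


total_shards = 24_572   # from the thread
total_indices = 2_458   # from the thread
data_nodes = 3          # from the thread
assumed_heap_gb = 30    # assumption: 64 GB nodes with heap kept below ~32 GB

per_node = shards_per_node(total_shards, data_nodes)
budget = shard_budget(assumed_heap_gb)

print(f"shards per index: {total_shards / total_indices:.1f}")
print(f"shards per data node: {per_node:.0f}")
print(f"rule-of-thumb budget per node: {budget}")
print(f"over budget by a factor of {per_node / budget:.1f}")
```

Under these assumptions each data node holds roughly 8,190 shards against a budget of about 600, so cutting the shard count per index and moving from daily to longer-lived indices would both help.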

system · November 1, 2018, 6:36pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
