My team and I are trying to improve a cluster that we inherited. I apologize in advance for some vagueness; due to the location of the cluster I can't share everything, but I will give as much detail as possible. We are currently working with a 7-node cluster. All data nodes are Dell R640s, and we are on Elasticsearch 7.6. We also have three Logstash servers feeding data into the cluster. There is a small amount of pre- and post-processing happening on the Logstash servers, but the data nodes are all still set to the defaults, so they are all still dilm nodes (data, ingest, machine learning, and master-eligible). There is a huge ingest pipeline on the cluster that is doing most of the processing and mapping of the incoming data.
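For reference, this is how I've been confirming what roles each node currently holds; it's just the standard `_cat/nodes` API, nothing specific to our cluster:

```
GET _cat/nodes?v&h=name,node.role,master
```

On 7.6 the `node.role` column shows the letter codes (d = data, i = ingest, l = machine learning, m = master-eligible), which is where the "dilm" above comes from.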
I have been doing research and have been told that it's not ideal to be using both Logstash and ingest pipelines. That may be true, but offloading all of the ingest processing out of the cluster and onto the Logstash servers seems like a pretty major undertaking to perform on a production cluster. We are currently sustaining about 20,000 indexing operations per second. I was hoping to bring two more nodes online and make them dedicated ingest-only nodes. While I am at it, I would like to make only three of the data nodes master-eligible, and for the time being I am going to disable machine learning on all nodes, as it is not currently in use.
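In case it helps to see the plan concretely, this is the shape of the change I have in mind. On 7.6 node roles are still individual boolean settings in `elasticsearch.yml` (the `node.roles` list syntax only arrives in 7.9), so a sketch would be:

```yaml
# elasticsearch.yml on the two new dedicated ingest-only nodes (7.6 syntax)
node.master: false
node.data: false
node.ingest: true
node.ml: false

# elasticsearch.yml on the three data nodes that stay master-eligible
node.master: true
node.data: true
node.ingest: false
node.ml: false
```

The remaining four data nodes would get `node.master: false` as well, and each node needs a restart to pick up the role change, so I'd plan to roll through them one at a time.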
I was wondering if you guys think that this is a decent or horrible idea. I am also open to any other suggestions that you believe would make this cluster function more efficiently.
I appreciate the response. I have spent the day talking to one of the developers for the system and am now confused about the setup of our system. I am going to hold off asking any questions until I get a little more clarification. Thanks very much for your time.
OK, so I have some more information on the cluster, and a clarifying question. I have been discussing the design of our cluster with one of the devs, and most of the unstructured data that comes into our cluster is processed by Filebeat, which then pushes the documents into the cluster. That is fine, but by default all of our nodes are still ingest nodes, and there is a roughly 12,000-line ingest pipeline in our cluster state.
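If I understand the dev correctly, the Filebeat side looks roughly like this (the names here are hypothetical; this is just the standard way Filebeat's Elasticsearch output points events at a named ingest pipeline):

```yaml
# filebeat.yml (hypothetical sketch)
output.elasticsearch:
  hosts: ["https://es-node:9200"]
  pipeline: "our-giant-pipeline"   # hypothetical name for the 12,000-line pipeline
```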
Would that mean that all of the data is getting processed multiple times? I know there are bulk indexing operations running frequently on our cluster. From what I have read, if you have an ingest pipeline on the ingest nodes in your cluster, it will intercept the bulk index request, run the data through the pipeline, and then index it. Am I way off here?
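My current understanding (please correct me if this is wrong) is that a pipeline only runs when a request actually names it, either per request or via an index-level default, so a document should pass through it once. Again with hypothetical names:

```
# Per request: these bulk docs go through the named pipeline once
POST _bulk?pipeline=our-giant-pipeline
{ "index": { "_index": "logs-2020.03" } }
{ "message": "example event" }

# Per index: anything indexed without an explicit pipeline uses this default
PUT logs-2020.03/_settings
{ "index.default_pipeline": "our-giant-pipeline" }
```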