I have been trying to load data into Elasticsearch in real time using Python. I am new to Elasticsearch. I used a single node on a single server. Elasticsearch was not able to keep up with the real-time data coming into the server, and a pretty large backlog built up because it couldn't match the incoming throughput.
I went to an Elastic event and was told I need to create a cluster. I began researching clusters and discovered there are different types of nodes:
Master nodes
Data nodes
Client nodes
Ingest nodes
Which type(s) of node should I add to help with the throughput problem? Any ideas on how to begin designing my cluster?
I assume the amount of memory is a factor. To what degree, and how do I calculate how much I need?
Thank you in advance. Any help would be appreciated.
Are you sure your client was pushing Elasticsearch hard enough? It's very common for throughput problems to be on the client side, nothing to do with Elasticsearch config. Make sure you're sending lots of large bulk requests in parallel, for example with the bulk helpers in the Python client (see the sketch below).
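Here is a minimal sketch of what that looks like with the official elasticsearch-py client. The host URL, index name, and generate_docs() source are placeholders you'd swap for your own setup; the point is that parallel_bulk batches documents and sends them from several threads instead of indexing one document per request.

```python
from elasticsearch import Elasticsearch
from elasticsearch.helpers import parallel_bulk

# Assumed local single node; replace with your cluster address.
es = Elasticsearch("http://localhost:9200")

def generate_docs():
    # Placeholder: replace with your real-time source.
    # Yield one bulk action per document.
    for i in range(100_000):
        yield {"_index": "my-index", "_source": {"event_id": i, "msg": "example"}}

# parallel_bulk chunks the stream and indexes the chunks from multiple
# threads; the returned generator must be consumed for requests to run.
for ok, info in parallel_bulk(es, generate_docs(), thread_count=4, chunk_size=1000):
    if not ok:
        print("Failed:", info)
```

Tuning thread_count and chunk_size to your document size and hardware usually matters far more than any cluster-topology change at this stage.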
If you've confirmed that Elasticsearch really is the bottleneck, then the limiting factor is probably either I/O bandwidth or CPU count, and you can address both by scaling your one node up to have more power (to some extent) or by adding nodes (almost arbitrarily far). I wouldn't worry about different node types; just use the default, which is for every node to do everything. You can refine that later if you want, but the default is going to be the simplest way to increase performance.