Is the input speed of the elasticsearch cluster linear?

(moyang) #1

Hello guys, I am a newbie in elasticsearch. Recently I set up a elasticsearch cluster to save my logs.The cluster has 3 nodes, all of them serves as both master node and data node. I used logstash to foward data to the AWS load balancer, which will then seperate load to the 3 nodes.

The problem is, the throughput is really low, besides, I realized that it is not linear. With one node in the cluster, the throughput is 3666 events /sec when inputing. With two nodes, the throughput is 2750 per node per sec. When I am using all three nodes, the throughput is only 1666 per node per second.

Is the performance supposed to be like this? Or should the throughput linearly increase when adding nodes into the cluster? Hope you guys can give me some help. Thanks in advance.

(Mark Walkom) #2

It should increase by adding more nodes.

What does your LS output config look like? What instance size are you using?

(moyang) #3

Thanks for answering workolm.
My current situation is that I need to process at lease 34000 events per second, and I am trying to estimate how many notes do I need to process in this speed. I noticed that the processing speed is not linear as the number of node increases, so I wonder how to measure the exact node number.

Here is my LS ouput:

output {
elasticsearch {
bind_host => "localhost"
cluster => "elasticsearch-prodII"
workers => 8
protocol => "http"
flush_size => 200

(Christian Dahlqvist) #4

If you have the standard 1 replica configured, I would expect 2 nodes to perform about the same as a single node as both nodes would hold and index all data. Once you add nodes from there on, you should see improved throughput assuming you do not change the number of replicas.

This also assumes that you are sending traffic to all nodes and that the nodes are not deployed on a single node where they fight for the same resources.

Your output configuration seems to suggest that you are sending all load to a single node. Is that the case? What type of hardware configuration are you using?

(moyang) #5

Hello Christian, thanks so much for answering.

My cluster structure is that I have three nodes, each running a redis server, a logstash and an elasticsearch, and the three nodes are in the same elasticsearch cluster. These three nodes all act as both data node and master node. All the data input comes from an AWS load balancer, which seperates the load equally to the three nodes' redis server. The logstash reads the data from the redis server, and then send to its local elasticsearch. This is why my LS output is sending to localhost.

To wrap it up, I'm using 3 nodes as a cluster, each running logstasn and elasticsearch. The logstash send the data to the local elasticsearch.

I am trully a newbie, I should learn more about this. Do you think this structure works?

As for the hardware configuration, I am using AWS m4.xlarge node. Which has 4 core, 16G RAM. They claim the m4 nodes are using intel Xenon processor, but after all, they are VMs, so don't raise your expectaton :smile:

Thanks for answering !

(Christian Dahlqvist) #6

I would recommend monitoring the servers during indexing and trying to identify what is the bottleneck and limiting performance, e.g. CPU, disk I/O, memory pressure. As both Logstash and indexing in Elasticsearch can be CPU intensive, that would be what I check first.

(moyang) #7

Great advice. Thanks so much Christian.
I noticed that the cpu is the bound. It is nearly fully occupied the whole time.

(Nik Everett) #8

The hot_threads will be useful there, I think.

Usually the right way to get optimal performance is to have one copy of the shard per server - so if you have 6 servers you want your index in 3 shards with 1 replica for 6 ( (1+1)*3 ) total indexes. There are lots of parameters you can tweak to sacrifice query performance for indexing performance. Have a look at "update index settings". hot_threads will probably lead you somewhere though.

(moyang) #9

Thanks so much Nik! You pointed out a new direction for me. I'll go check it out.

(system) #10