Easiest way to load test ELK, especially querying ES?


We are running a single instance of the ELK stack; everything runs on the same server.
Processing in Logstash seems fast enough, because inserted data shows up in Kibana without delay.

But we have performance issues when querying. We often get timeouts, or exceptions saying that Elasticsearch rejected the request because its queue was full.

I want to try optimizing the shard size and the relation between shards and types.
But how can I load test querying Elasticsearch?

Is there a log where ES or Kibana records the POST data, which I could fetch and reuse in a JMeter test case, for example?

First I want to replay our production queries against Elasticsearch and find a configuration where it runs without errors. After that, I would like to stress the system, find the maximum load it can handle, and tune the configuration further.
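One idea I had for capturing the queries (I have not verified this is the intended approach): Elasticsearch has a search slow log, and setting its warn threshold to 0s should make it log every query, including the query source, which I could then replay. Something like this, where `logstash-*` is a placeholder for our index pattern:

```
PUT logstash-*/_settings
{
  "index.search.slowlog.threshold.query.warn": "0s",
  "index.search.slowlog.threshold.fetch.warn": "0s"
}
```

The logged entries would then end up in the node's slow log file, from which the query bodies could be extracted for a JMeter test plan.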

Thanks, Andreas

Install X-Pack; the monitoring feature (Marvel) is free and will give you the statistics you need.

Having more shards on one instance of Elasticsearch is not going to help much, unless you're running multiple data nodes on different partitions. Your greatest issues will be CPU contention and I/O bottlenecking.

If I may ask, what kind of hardware do you have, and how is it laid out?

Hi @asp,

at Elastic we use Rally for benchmarking Elasticsearch (disclaimer: I am the main author of Rally, so I may be a bit biased ;)). You can set up the necessary experiments, but I think it will take a bit of time. A description of such an experiment is called a "track" in Rally, and the docs describe how you can write a custom track.

Your use case also sounds very similar to a track that my colleague Christian Dahlqvist has created, the eventdata track. Maybe you can borrow some ideas from it.

Christian and I also gave a talk about cluster sizing with Rally at Elastic{ON} 2017, but unfortunately there were AV problems, so there is no video online. However, I have uploaded the slides.

An alternative may be to use JMeter for executing the queries, but then you still need to set up the cluster yourself, populate the index, and save the results in a format that allows you to compare multiple experiments. Rally, on the other hand, can set up the cluster for you (admittedly still somewhat limited), capture all results in a dedicated Elasticsearch metrics store, and let you analyze the results of your experiments in Kibana afterwards (you need to create your own visualizations based on the metrics records documentation).
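To give you a rough idea, a minimal custom track that replays one of your queries might look something like the sketch below. The index name and the query body are placeholders, and you should double-check the exact field names against the custom track documentation:

```json
{
  "description": "Replay production queries against an existing index",
  "indices": [
    { "name": "logstash-2017" }
  ],
  "schedule": [
    {
      "operation": {
        "operation-type": "search",
        "index": "logstash-2017",
        "body": {
          "query": {
            "match_all": {}
          }
        }
      },
      "clients": 2,
      "warmup-iterations": 100,
      "iterations": 1000
    }
  ]
}
```

You can then point Rally at the directory containing this file with `--track-path`, and vary `clients` between experiments to increase the query load.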

We also have a dedicated forum for Rally here on Discuss where you can ask your Rally-related questions.



We are running a single virtual machine on RHEL 7.3 with 8 CPUs and 28 GB RAM.

Since splitting the virtual hardware into multiple VMs costs more than running everything on one machine, we would like to stay on a single host.

Storage is provided via NAS, which is quite a black box for us.

Right, virtualization is not always the solution :slight_smile:

Ok, some issues I have experienced: you're going to get a lot of CPU contention and I/O bottlenecking.

Elasticsearch likes to take all the CPU of a system, so it will collide with Logstash when running on the same machine.

Elasticsearch works well as lots of small instances. I recommend setting the `processors` value in elasticsearch.yml to something like 4.

And in Logstash (I don't know the new way of configuring it yet), limit the number of worker threads to something like 2.

Both of these settings adjust the internal thread queues, which will improve the behavior of both products.
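A sketch of what that could look like on your 8-CPU box (assuming Logstash 5.x, where the worker count moved into logstash.yml; older versions used the `-w` command-line flag instead):

```yaml
# elasticsearch.yml: size Elasticsearch's thread pools for 4 of the 8 cores,
# leaving headroom for Logstash and Kibana on the same host
processors: 4

# logstash.yml (Logstash 5.x; older versions took -w on the command line):
# limit the number of pipeline worker threads
pipeline.workers: 2
```

Treat the exact numbers as starting points and adjust them based on what your monitoring shows.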

Also, if you're running multiple Elasticsearch instances on the same server, I recommend creating multiple mount points. Even with one LUN there are kernel limitations/inefficiencies with concurrent reads and writes; having multiple LUNs has been shown to relieve I/O wait.
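If you go that route, each node can also spread its data across several mount points via `path.data` in elasticsearch.yml. A sketch, with placeholder paths, each ideally backed by its own LUN:

```yaml
# elasticsearch.yml: distribute shard data across multiple mount points
# to spread concurrent reads/writes over several LUNs
path.data:
  - /mnt/es-data-1
  - /mnt/es-data-2
```

Note that Elasticsearch allocates whole shards to one path or the other; it does not stripe individual files, so the benefit shows up once you have several shards per node.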

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.