Need advice on a Kafka-ELK cluster and Logstash settings to improve the indexing rate

Dear all,

I am trying to build an ELK cluster together with Kafka to get the best indexing rate in Elasticsearch. The description of my cluster hardware and configuration follows below. My questions are:

  • Is the architecture correct for my purpose?
  • If not, how can I improve it?
  • How can I improve the indexing rate while reducing the Logstash batch size, so as to reduce the memory required by Logstash?

Here is the story:
I have four nodes, each with 32 GB of memory, two 7200 rpm 2 TB HDDs and a Core i7-7820X at 3.6 GHz. I want to build an ELK cluster over 3 nodes and keep one node and its resources for source messaging, with Kafka as a message bus over the 4 nodes, consumed by the Logstash instances.

The data path for Kafka and Elasticsearch contains two directories, one per HDD, so as to spread the I/O over both disks.
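For reference, this is roughly how the two data directories are declared (mount points are placeholders); note that both Elasticsearch and Kafka place whole shards/partitions in one directory or the other rather than striping files across them:

```
# elasticsearch.yml -- hypothetical mount points
path.data: ["/data1/elasticsearch", "/data2/elasticsearch"]

# Kafka server.properties -- hypothetical mount points
log.dirs=/data1/kafka,/data2/kafka
```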

Detailed architecture:
Node 1: message producer
Nodes 1, 2, 3: ZooKeeper
Nodes 1, 2, 3, 4: Kafka
Nodes 2, 3, 4: Logstash, Elasticsearch
Node 4: Kibana

Each message to index, once in JSON, is around 400 bytes. Kafka topics have 10 partitions each.
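For reference, a topic with 10 partitions can be created along these lines (topic name and replication factor are placeholders; older Kafka releases use --zookeeper instead of --bootstrap-server):

```
bin/kafka-topics.sh --create \
  --bootstrap-server localhost:9092 \
  --topic app-events \
  --partitions 10 \
  --replication-factor 2
```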

The Logstash pipeline input forms one Kafka consumer group per topic. The Logstash Kafka bootstrap_servers setting points to localhost so that each instance consumes from the Kafka broker running on the same host, to avoid useless network traffic between Kafka nodes. However, I am not sure whether Kafka forces a consumer to read partitions from another host regardless of the bootstrap server setting.
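For reference, the consumer side of the pipeline looks roughly like this (topic and group names are placeholders):

```
input {
  kafka {
    bootstrap_servers => "localhost:9092"      # only used for the initial metadata fetch
    topics            => ["app-events"]
    group_id          => "logstash-app-events" # one consumer group per topic
    consumer_threads  => 5                     # ideally <= partitions assigned to this instance
    codec             => "json"
  }
}
```

From what I read, bootstrap_servers is only the entry point for cluster metadata, and the consumer then fetches directly from whichever broker leads each assigned partition, so I doubt that pointing it at localhost alone keeps the reads local.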

The Elasticsearch index has a template to convert each field of the message to the correct type. It also has an ingest pipeline that automatically generates a monthly index based on the message timestamp and discards duplicate messages. Each index is configured with 36 shards. The measured shard size does not go over 2 GB for a year of messaging.
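Roughly, the template and pipeline look like this (names, fields and shard count are illustrative; the exact template API and mapping layout vary with the Elasticsearch version, and the duplicate handling, typically done by deriving the document _id from a fingerprint of the message, is not shown):

```
PUT _template/app-events
{
  "index_patterns": ["app-events-*"],
  "settings": {
    "number_of_shards": 6,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "@timestamp": { "type": "date" },
      "value":      { "type": "float" }
    }
  }
}

PUT _ingest/pipeline/monthly-index
{
  "description": "Route each document to a monthly index based on its timestamp",
  "processors": [
    {
      "date_index_name": {
        "field": "@timestamp",
        "index_name_prefix": "app-events-",
        "date_rounding": "M",
        "index_name_format": "yyyy-MM"
      }
    }
  ]
}
```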

Memory settings:

  • Kafka has its JVM heap (-Xms/-Xmx) set to 1 GB.
  • Logstash has its JVM heap (-Xms/-Xmx) set to 12 GB.
  • Elasticsearch has its JVM heap (-Xms/-Xmx) set to 15 GB.
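For reference, these heaps are set roughly as follows (the Kafka variable is the one read by the stock start scripts):

```
# Elasticsearch jvm.options
-Xms15g
-Xmx15g

# Logstash jvm.options
-Xms12g
-Xmx12g

# Kafka (environment variable read by kafka-server-start.sh)
export KAFKA_HEAP_OPTS="-Xms1g -Xmx1g"
```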

Measured indexing rate: 15,000 messages/s with a Logstash batch size of 40,000.

At some point Elasticsearch hangs with bulk exceptions; hot threads show the time being spent on flush operations. This is because the batch size setting is too large. If I reduce the batch size to save some memory for OS file caching, the indexing rate drops significantly, but Elasticsearch remains stable over time.
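For reference, the batch size is the pipeline.batch.size setting in logstash.yml (values below are illustrative); the number of in-flight events, and hence the heap required, grows roughly with workers × batch size:

```
# logstash.yml -- illustrative values, not a recommendation
pipeline.workers: 8        # defaults to the number of CPU cores
pipeline.batch.size: 4000  # events each worker collects before running filters/outputs
pipeline.batch.delay: 50   # ms to wait before flushing an undersized batch
```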

Just a few notes/questions on the described setup:

  • Rotating HDDs are probably your biggest bottleneck, doubly so when the same machines host both Kafka nodes and ES nodes.
  • Relevant to the above, 36 shards per index seem a bit much (especially when shard sizes are expected to be small as you mentioned). Many small shards can put a strain on your cluster. You can have a look at https://www.elastic.co/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster and https://www.elastic.co/guide/en/elasticsearch/guide/current/inside-a-shard.html if you haven't already.
  • I would probably keep ES nodes and Kafka nodes separate, since they are both I/O-intensive. Logstash is less so.
  • What is your expected or target throughput?
  • Did you spread Kafka/Logstash over 4/3 nodes purely for high-availability purposes, or for performance purposes as well (load-balancing, node saturation)?
  • I would suggest you perform some benchmarks on isolated components, so you can get an idea about performance before scaling out and mixing services, e.g.:
    • A single-node Logstash max throughput with just the filters and a null output (see the sketch after this list).
    • A single-node Elasticsearch indexing rate (have a look at https://github.com/elastic/rally).
    • A single-node Kafka throughput rate.
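For the first bullet, a throwaway pipeline along these lines would do (the sample message is a placeholder; drop your real filters into the filter block):

```
input {
  generator {
    message => '{"field1":"value1","field2":42}'  # substitute a representative ~400-byte JSON document
    count   => 1000000
    codec   => "json"
  }
}
filter {
  # ... your production filters ...
}
output {
  null {}
}
```

You can then read the event rate from the Logstash monitoring API on port 9600 (e.g. /_node/stats/pipelines) or simply time the run.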

Thanks @paz for your quick answer,

Indeed, the more HDDs the better, but for now I can't go beyond 2 HDDs...
I am currently trying different numbers of shards to measure the impact on indexing performance. On one machine with 2 disks I measured 10K messages/s; going to 2 shards it dropped to 8,700 msg/s, and to 3,400 msg/s with 10...

Indeed, I had read the document on the number of shards. I computed my number of shards as follows: (number of disks × number of nodes + 3 replicas) × 4 for future growth, i.e. (2 × 3 + 3) × 4 = 36. But as I am spreading the data over both disks, I guess I can go down to 24 and probably fewer, as you said. Nevertheless, I am benchmarking this setting right now.

Is going from 10,000 msg/s with one node to 15,000 msg/s with 3 Elasticsearch nodes a good improvement?

Yes, I spread Kafka/Logstash and ES for high availability. And unfortunately I am bound to 4 nodes, so I am trying to find the best balance between high availability and performance...

About the throughput: I am first trying to find the configuration that gives the best indexing rate; it lets me better understand the key parameters and design choices. Then I will load the cluster with queries and will have to tune the configuration again. In the end I also plan to run some Spark services on the nodes to perform some computation, so I am also thinking of restricting some nodes to indexing and others to queries...

What about ES memory? Right now I set the heap (-Xms/-Xmx) to 10 GB and keep at least the same amount for the OS, but that restricts Logstash to 6 GB, which does not seem to be a lot: when I increase the Logstash batch size to improve performance, memory fills up and the system hangs...

If you are using a replica shard, that is expected. When you go from one node to two you get no additional throughput, as both nodes do the same amount of work, indexing into the primary and the replica shard. Once you add a third node you have increased capacity by 50%, which is shown by the 50% increase in throughput you are seeing.

Just to be on the same page, when you say "HDD" you mean rotating disks, correct? If so, moving to SSDs would give you a pretty notable performance boost.

OK, thanks @Christian_Dahlqvist, I totally missed the duplicated indexing into primary and replica! So the result seems consistent at least.

Do you have an idea whether 15,000 msg/s is reasonable given the hardware and configuration?

@paz, I also did some benchmarks a while ago, but on generic messages of 100 bytes average size. On one machine with Kafka only I could reach 1M msg/s. On ES, with messages with one field to parse (using an ES template) I could reach 75,000 msg/s; with ten fields I could reach 25,000 msg/s, but using a Logstash batch size of 40,000. With this value, the system hangs after some time because ES is no longer able to process the bulk requests.

Yes, rotating disks. Unfortunately 2 TB in SSD is not the same price.

For the record, my use case is a bit the opposite of the large cluster configurations with tons of big machines we usually see around ES. The idea is to build a minimal setup that allows high availability, with the best performance/price ratio in hardware, and to get the maximum out of it :slight_smile:

Indexing speed will always depend on the amount of work that needs to be done per document, so the number of events per second is a poor indicator of indexing performance unless put in context.

It is always recommended to run Elasticsearch on dedicated hardware so it does not have to compete with other processes for resources. If you have multiple processes on the same server it can be hard to track down bottlenecks.

Yes, indeed, that is what I already observed by just increasing the number of fields in the messages... And yes, having several resource-intensive processes on the same machine makes bottlenecks difficult to track down.

However, I am still asking myself whether the Logstash pipelines are better off running on each ES node, or whether I should move them to node 1, for example (dedicated to Kafka)? Because the batch size significantly improves performance, but requires a lot of memory...

Also, does Logstash make intensive use of the disk (for the queue, for example), competing with Kafka and ES?

If you have persistent queues enabled it can use a good amount of disk I/O.
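For reference, the relevant settings live in logstash.yml (paths and sizes below are illustrative); if you enable it, putting path.queue on a disk the Elasticsearch and Kafka data paths do not use keeps them from competing:

```
# logstash.yml -- illustrative values
queue.type: persisted
path.queue: /data1/logstash/queue   # ideally on a disk not shared with the ES/Kafka data paths
queue.max_bytes: 4gb
```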

Is that 2 TB a hard requirement? Since you mentioned something about a 2 GB yearly shard size (so ~75 GB total).

My guess is that for a strict 4-node HA setup you could do something like:
Nodes 1-2: Kafka / Logstash / Kibana
Nodes 3-4: Elasticsearch / single-node ZK
since you want those ES nodes to be as uncontested for resources as possible.

I suppose ZK needs a minimum of 3 nodes. But for a small cluster it does not require a lot of resources, from my measurements.

But yes, it may be more balanced, although I would lose one ES node, which means losing the 50% improvement in indexing... I will measure how much I gain by dedicating the HDDs on nodes 3-4; it may be worth it indeed. Thanks for the suggestion!
