Logstash configuration to process 1M events per second

I have around 1M events per second (real time). Logstash is processing 40K with one instance and one pipeline. If I increase the Logstash instances to two, it goes to around 30K each (60K cumulative), but increasing the Logstash instances to three limits throughput to only about 70K cumulative. How will I be able to process 1M events per second with Logstash?
I have tuned the Kafka input plugin as per the Logstash Kafka plugin guide, and the number of partitions on the Kafka topic is 600.
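For reference, a Kafka input tuned along those lines might look like the sketch below; the broker addresses, topic name, and consumer group are placeholders, and the codec is an assumption:

```conf
# Sketch of a tuned Logstash Kafka input -- brokers, topic, and group_id are placeholders
input {
  kafka {
    bootstrap_servers => "kafka1:9092,kafka2:9092"
    topics            => ["events"]
    group_id          => "logstash-consumers"   # shared group so instances split partitions
    consumer_threads  => 24                     # total threads across instances should not exceed 600 partitions
    codec             => "json"                 # assumption: messages are JSON
  }
}
```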

Sounds like you need more Logstash instances with more memory and workers, and possibly some ingest pipelines on your Elasticsearch side, as you may be running into a bottleneck writing the data into Elasticsearch.

I am already using four Logstash instances, with 24 consumer threads, 32 pipeline workers, 24 CPUs, and 32 GB of heap size on each node. The pipeline ID of each Logstash instance is different (tested with the same pipeline ID as well), but the Kafka consumer group of all four instances is the same (as per our requirement). But I observed that the load consumption rate is not growing linearly; rather, the same load is being divided among the 4 nodes with slight differences:

One Logstash node: 112K consume rate
Two Logstash nodes: 65K and 77K
Three Logstash nodes: around 55K each
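The per-instance worker settings described above would typically live in `pipelines.yml`; a sketch, where the pipeline ID and config path are placeholders and the batch size is an assumption, not a value from this thread:

```yaml
# pipelines.yml sketch -- pipeline.id and path.config are placeholders
- pipeline.id: events-pipeline-1
  path.config: "/etc/logstash/conf.d/events.conf"
  pipeline.workers: 32        # matches the 32 workers mentioned above
  pipeline.batch.size: 1000   # assumption: larger batches often help bulk indexing throughput
```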

If that's the case then you may be running into a bottleneck with your Elasticsearch (possibly storage?).
You can look into Rally to benchmark your Elasticsearch instances and see how they handle high volumes. Adding monitoring to your Logstash instances and Elasticsearch will also tell you whether you are hitting any resource constraints (CPU/IO/RAM, etc.).

I have 4 hot nodes (CPU: 48, Java heap: 30 GB) and 5 warm nodes (CPU: 32, Java heap: 30 GB). I guess you are right: according to my observation, a CPU crunch on any one hot node (CPU usage > 90% and JVM heap usage > 60%) also causes backpressure towards Logstash, hence it reads less data. But adding one more hot node still hasn't solved our problem.

Then take a look at your storage subsystem; you may be hitting its limits.

You are using SSD storage on the hot nodes, right?

Also, what is the value of the refresh_interval of your index? Did you create a mapping for your index?

Yes, we are using SSD for the hot nodes and SATA storage for the Elastic warm nodes. I increased the refresh_interval from the default 1 s to 10 s, but there is still no improvement in the indexing rate on the warm nodes...
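For reference, the refresh_interval change described above can be applied with a dynamic index settings update like this (the index name is a placeholder):

```json
PUT events-index/_settings
{
  "index" : {
    "refresh_interval" : "10s"
  }
}
```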

The load is 7 lakh (700K) messages, but the Logstash pipeline EPS from two instances is only 144K, and the same 144K is the indexing rate on the Elasticsearch warm nodes.

We have checked for indexing pressure as well, but there seems to be no such pressure.

I have increased the number of hot nodes from 5 to 10, but the performance only increased from 80K EPS to 144K EPS (it should at least have doubled).

Note: there are 10 primary shards, meaning one shard on each hot node. Making it 20, i.e. two shards per node, increases the CPU utilization to 90%.

Are you doing the ingestion on the hot nodes, the warm nodes or both?

If you have an index with shards on both a hot node and a warm node, the indexing speed will be limited by your warm nodes, since you use SATA storage, which can be pretty slow compared to SSD.

Do you have a hot-warm architecture with index-level shard allocation configured, or do you just have nodes with different specs?

I do not know what 7 Lac is; is it 7K? 70K? 700K events?

Also, Do you have replicas for your index?

Sorry for the late reply!

I have a hot-warm node architecture but without index-level shard allocation. Do I need to set this for a higher indexing rate?

I am doing the ingestion only on the hot nodes; the settings for a hot node are:
node.attr.box_type: hot
node.roles: [data, ingest]

while on a warm node:
node.attr.box_type: warm
node.roles: [data]

7 lakh, i.e. 700K, is the ingested load.

Yes, I have 2 replicas and 10 primary shards.
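As background for the replica question: with 2 replicas, every document is physically indexed three times (once on the primary and once on each replica), so the replica count directly multiplies the indexing work. If replicas turn out to be a factor, the count can be changed dynamically; a sketch, with the index name as a placeholder:

```json
PUT events-index/_settings
{
  "index" : {
    "number_of_replicas" : 1
  }
}
```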

If you do not have index-level shard allocation, then you do not have a true hot-warm architecture: both node types have the data role, so they look the same to the cluster, which can allocate shards on any of the nodes, and the indexing speed will depend on the slowest node.

For example, if you have an index with a shard on a hot node with SSD and another one on a warm node with SATA, your indexing rate will depend on your SATA disk speed.

You need to have something like this in your index templates:

      "index" : {
        "routing" : {
            "allocation" : {
              "require" : {
                "box_type" : "hot"

This will make Elasticsearch index on the hot nodes only; you will then need to use an Index Lifecycle Management (ILM) policy to move the shards to the warm nodes after some time, or do it manually.
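The move to warm nodes can be automated with an ILM policy along these lines; the policy name and the `min_age` timing are placeholders to adjust for your retention needs:

```json
PUT _ilm/policy/hot-warm-policy
{
  "policy": {
    "phases": {
      "warm": {
        "min_age": "1d",
        "actions": {
          "allocate": {
            "require": { "box_type": "warm" }
          }
        }
      }
    }
  }
}
```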

Setting this index-level shard allocation also has not improved our ingestion rate. Does the size of the documents also affect the indexing rate?
Heap size is also significant, and 50% of it is available all the time.