Huge Time Delay between logstash and elasticsearch

ApoorvaLad · March 28, 2016, 7:08am

My current pipeline is:

Rsyslog -> Kafka -> Logstash -> ES (5 nodes)
I see a huge time delay (around 10 hours) between the logs processed by Kafka and the logs in my ES cluster.
The input and output plugins in Logstash look like this:

input {
  kafka {
    zk_connect => 'host:port'
    topic_id => 'abc'
    consumer_threads => 50
    codec => json
  }
}

Kafka currently has 50 partitions.

output {
  elasticsearch {
    template => "/export/logstash_new/elasticsearch-template.json"
    hosts => ["host1", "host2", "host3", "host4", "host5"]
    template_overwrite => true
    manage_template => true
    codec => plain
  }
}

When I run my pipeline for a shorter duration (2 to 3 hours), no time delay is noticed.
However, gradually, as the time increases, the delay increases too.

How can I figure out where the problem currently resides?
Is logstash failing to process data? Or is there a problem in the indexing of Elasticsearch?

Current load of data is 6k messages per minute. (The load fluctuates)

warkolm · March 28, 2016, 7:40pm

Are you monitoring your kafka topic lengths, as well as the rest of the pipeline (CPU etc)?
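
One way to check this is to compare, per partition, the latest offset in the topic with the offset the Logstash consumer group has committed. The difference is the backlog, and dividing it by the consumption rate gives a rough delay. Below is a minimal sketch in Python, assuming the two offset maps have already been fetched with whatever Kafka tooling is at hand. The function names and sample offsets are hypothetical, and the 6000 messages per minute figure is borrowed from the load mentioned above purely for illustration.

```python
def partition_lag(end_offsets, committed_offsets):
    """Return {partition: number of messages not yet consumed}."""
    lag = {}
    for partition, end in end_offsets.items():
        # A partition with no committed offset is treated as fully unread.
        committed = committed_offsets.get(partition, 0)
        lag[partition] = max(end - committed, 0)
    return lag


def estimated_delay_minutes(total_lag, consume_rate_per_minute):
    """Rough time needed to drain the backlog at the given consumption rate."""
    if consume_rate_per_minute <= 0:
        return float("inf")
    return total_lag / consume_rate_per_minute


# Hypothetical sample offsets for three partitions (not taken from the thread).
end_offsets = {0: 120_000, 1: 118_500, 2: 121_000}
committed_offsets = {0: 60_000, 1: 58_500, 2: 61_000}

lag = partition_lag(end_offsets, committed_offsets)
total_lag = sum(lag.values())
print(lag, total_lag, estimated_delay_minutes(total_lag, 6000))
```

If the total lag keeps growing over time, the consumer side (Logstash or Elasticsearch indexing) is not keeping up with what is written to Kafka. If it stays flat, the delay is introduced before Kafka.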

ApoorvaLad · March 30, 2016, 6:03am

Hi. The problem is solved. The issue was in the Elasticsearch indexing. There were 5 nodes in the cluster: 4 were data nodes and 1 was both a master and a data node. In the Logstash output plugin, I had listed all 5 nodes.

The issue was solved when I created a separate client node for the communication between Logstash and Elasticsearch.

So now my ES cluster looks like this:

1 Client node (Role - Handling requests, load balancing)
1 Master node (Role - Cluster health management)
3 Data Nodes

Christian_Dahlqvist · March 30, 2016, 6:40am

Running with a single master eligible node creates a single point of failure. You should look to have 3 master eligible nodes in the cluster (with minimum_master_nodes set to 2) in order to improve resiliency and availability.
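
The value of 2 follows from the usual quorum rule for minimum_master_nodes: (number of master eligible nodes / 2) + 1, using integer division. A small Python sketch of that arithmetic (the function name is only illustrative):

```python
def minimum_master_nodes(master_eligible_nodes):
    """Quorum size: a strict majority of the master eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master eligible node")
    return master_eligible_nodes // 2 + 1


for n in (1, 3, 5):
    print(n, "master eligible ->", minimum_master_nodes(n))
```

With 3 master eligible nodes the quorum is 2, so the cluster can lose one of them and still elect a master, while a partitioned minority cannot elect its own.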

ApoorvaLad · March 30, 2016, 6:52am

Okay, thanks.
