I'm looking for methods to load balance Logstash or Elasticsearch. The situation is that I have fairly large log files with thousands of lines each; while Logstash processes the log files exactly as expected, I want to know whether there's a way to reduce the time it takes to do so.
Is there a way to direct the input data stream to other Logstash instances on the same or another machine to filter/process it?
This is a really complex question which cannot really be answered in a forum like this. Step one is to identify the bottleneck in the ingestion process. Is it Elasticsearch or Logstash? Is the process CPU limited? IO limited? If it is Logstash, is it the input or the filters in the pipeline that are limiting ingestion?
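One place to start looking (a sketch; 9600 is the default monitoring API port) is the Logstash node stats API, which reports per-plugin event counts and timings so you can see where time is being spent:

```
# Query the Logstash monitoring API (enabled by default on port 9600)
# to see how long each input/filter/output plugin spends on events.
curl -XGET 'http://localhost:9600/_node/stats/pipelines?pretty'
```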
If the limit is the input then it might help to use multiple inputs, each processing a subset of the *.gz files, as in the sketch below.
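For example, a minimal sketch splitting the files across several file inputs by glob pattern (the paths are hypothetical placeholders; `mode => "read"` lets the file input read compressed files to completion):

```
input {
  # Each input reads a disjoint subset of the compressed logs.
  # Paths are hypothetical; adjust the globs to your own layout.
  file { path => "/var/log/app/[a-m]*.gz" mode => "read" }
  file { path => "/var/log/app/[n-z]*.gz" mode => "read" }
}
```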
It is certainly possible to configure Logstash to divide traffic between other Logstash instances. You could use something like the Logstash-to-Logstash communication options described in the Elastic documentation.
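For instance, a minimal sketch using an http output on the distributing instance and an http input on a downstream instance (the host name and port are placeholders, not values from this thread):

```
# Upstream instance: forward events to a downstream Logstash over HTTP.
output {
  http {
    url         => "http://downstream-host:8080"  # hypothetical host
    http_method => "post"
    format      => "json"
  }
}
```

```
# Downstream instance: accept events forwarded from upstream.
input {
  http { port => 8080 }
}
```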
There's been no bottleneck yet; I'm in the beginning stages of setting up Logstash to filter a huge *.gz file. I was just looking at methods to trim down the time taken for the input to be filtered.
It's currently not CPU or IO limited; Logstash is working better than expected in the preliminary tests.
Thanks for the document on connecting Logstash to Logstash.