Logstash Performance

Hello,

I would like to know some tips to increase the performance of Logstash.

Currently I have:

OS: RHEL 7.1, 12 CPUs & 24 GB RAM

Filebeat as shipper

Elasticsearch: -Xms/-Xmx: 12g

Logstash (defined in logstash.yml):
Workers: 12
Pipeline batch size: 2000

I haven't modified options like pipeline batch delay, pipeline output workers, Java heap, etc.
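For reference, the two settings above correspond to these keys in logstash.yml (5.x):

pipeline.workers: 12
pipeline.batch.size: 2000
# pipeline.batch.delay is left at its default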

For Logstash, what do you recommend to improve performance?

PS: I have 1.5 million log lines every five minutes.

How have you gone about identifying that it is Logstash that is the bottleneck?

Because Elasticsearch is a good database, so I don't think it is the bottleneck.

Logstash is slow (about 180k lines/minute) at loading lines into Elasticsearch.
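(1.5 million lines every five minutes is 300k lines/minute, or roughly 5,000 events/second; 180k/minute is only about 3,000 events/second, so Logstash is running at about 60% of the rate needed to keep up.)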

What is the specification of your Elasticsearch cluster? Have you performed any benchmark to see what the throughput limit of Elasticsearch is? Have you tried replacing the elasticsearch output in Logstash with a file output to see if this changes the Logstash throughput?
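For example, a sketch of that benchmark (the path here is just an example):

output {
  # elasticsearch { ... }   # temporarily commented out for the test
  file {
    path => "/tmp/logstash-throughput-test.log"
  }
}

If throughput jumps with the file output, Elasticsearch (or the output configuration) is the bottleneck; if it stays around 180k/minute, the limit is in Logstash's inputs or filters.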

I have only one node. Only one server for the different pieces of ELK software.

I don't know how to get the information you're talking about :confused:

Does the Java heap size for Logstash have an impact on performance?

Currently I have:

-Xms256m
-Xmx1g
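(Those values come from Logstash's jvm.options file, e.g. /etc/logstash/jvm.options on a package install; to try a bigger heap you would set both to the same value there, for example:

-Xms2g
-Xmx2g

The 2g here is just an illustration, not a recommendation for your workload; equal -Xms/-Xmx avoids heap resizing at runtime.)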

What does CPU and disk I/O usage for Elasticsearch look like when you are indexing? Do you have monitoring installed?
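For example, something like this on the Elasticsearch host while indexing is running will show per-device utilisation:

# extended device statistics, refreshed every 5 seconds
iostat -x 5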

With the iostat command line I see:

CPU :

avg-cpu:  %user     %nice   %system     %iowait  %steal   %idle
           0.93     1.75      0.87       0.03     0.00    96.41

DISK :

     Device:          tps    kB_read/s    kB_wrtn/s     kB_read     kB_wrtn
     dm             23.86         7.05       277.95    10208977   402426000

Is this from the Elasticsearch node while you are indexing?

Yes, during the indexing of documents.

What is the specification of your Elasticsearch host?

I have used the default installation.

I have just changed the Java -Xms and -Xmx.

How many CPU cores does your Elasticsearch host have? What type of storage?

12 CPUs, local storage (ext4)

One thing: when I parse and index data, I find my load average is low:

Load average: 5.50 2.57 1.04

I don't know if it's important or not.

What does your Logstash configuration look like? Which version of Logstash are you using?

I use version 5.1.

Logstash parsing config:

filter {
  if [type] == "edr" {
    if [message] =~ "\bCTEName\b" {
      drop { }
    } else {
      csv {
        columns => [ "..." ]
      }
    }

    mutate {
      convert => {
        "edr_SrId"       => "integer"
        "edr_Uets"       => "integer"
        "edr_GtalOctets" => "integer"
        "edr_Uimit"      => "integer"
        "edr_Jer"        => "integer"
      }
      remove_field => [ "message", "edr_D", "edr_IlltureFlag", "edr_UatureFlag", "edr_NP", "edr_Tpe", "edr_OSe", "edr_Oum", "edr_Oature", "edr_Monitor", "edr_Faig", "edr_Qbel", "edr_FlowStatus", "edr_Redabel", "edr_NoonID" ]
    }
  }
}

How many fields do you have? What is the average size of your events?

I have 30 fields per line and about 1.2 GB of index per day.

Could you just tell me how to change the memory buffer size? The documentation says the value is static, but I can't find where to configure it.

By default it's 10%, but I think I can grow it to 20%.
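(If this is Elasticsearch's indexing buffer you mean, the setting is indices.memory.index_buffer_size, a static node-level setting configured in elasticsearch.yml that takes effect after a node restart:

# elasticsearch.yml -- static setting, requires a node restart
indices.memory.index_buffer_size: 20%
)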

Hi,
Maybe this link could help you: