Logstash filter and output use only one CPU core

(Mohamed) #1

Hello, I am using Logstash to parse my XML files, but I can't increase the number of workers: when I run top -h, only one worker is busy, at ~98%.

I am using 8 GB for the JVM and 4 workers in the file logstash.yml


input {
  port=> 5044
filter {......}
  if [theXML][measValue][r][p]==[theXML][measType][p]{
        action => "index"
        hosts => [""]
        index => "nokia-%{SDL}-%{+YYYY.MM.dd}"



[Screenshot: the output of top -h]

My question is: why am I getting only one worker running even though I have 4 workers configured in my logstash.yml?

I am using Logstash 6.4.


(Christian Dahlqvist) #2

What does your filter section look like?

(Mohamed) #3

My filter section:

filter { xml { source => "message" store_xml => true target => "theXML" force_array => false }

split { field => "[theXML][measValue][r]" }
split { field => "[theXML][measValue]" }
split { field => "[theXML][measType]" }
kv{ source=>"[theXML][measValue][measObjLdn]"
    field_split=> "=,"
if "access" in [theXML][measValue][measObjLdn]{

if "storage" in [theXML][measValue][measObjLdn]{

if "diag" in [theXML][measValue][measObjLdn]{

if "ops" in [theXML][measValue][measObjLdn]{

if "tele" in [theXML][measValue][measObjLdn]{

if "ntf" in [theXML][measValue][measObjLdn]{

mutate { remove_field => [ "AZ","CoreId","ConfiguredMemory","path","RGN","[beat][hostname]", "[beat][hostname][keyword]", "[beat][name]","[beat][name][keyword]","[beat][version]",
match=>["[theXML][granPeriod][endTime]","M dd yy HH:mm"]}

mutate {rename => { "[theXML][granPeriod][endTime]" => "time"}}
mutate {rename => { "[theXML][measInfoId]" => "measInfoId"}}
mutate {rename => { "[theXML][measType][content]" => "Type"}}
mutate {rename => { "[theXML][measValue][measObjLdn]" => "ObjLdn"}}
mutate {rename => { "[theXML][measValue][r][content]" => "Value"}}}

(Christian Dahlqvist) #4

The fact that 98% of CPU is used does not necessarily mean that only one processing thread is active. Logstash can only process as fast as the downstream systems are able to accept the data, so it is useful to verify that Elasticsearch is not the bottleneck here. What do CPU usage, disk I/O and iowait look like on your Elasticsearch cluster? How many indices and shards are you actively writing into? Do you see any errors or warnings in the Elasticsearch logs? What is the specification of your Elasticsearch cluster and what throughput are you seeing?

If this is not the bottleneck, have you verified that you have enough data coming in to keep all pipeline threads busy?
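
One way to check this is to sample the Logstash monitoring API twice and compare the cumulative event counters. The snippet below is only a rough sketch: it assumes the API is reachable on its default address (localhost:9600) and that the pipeline is called "main"; the function names are made up for illustration.

```python
import json
import time
from urllib.request import urlopen

# Assumption: the Logstash monitoring API listens on its default address.
STATS_URL = "http://localhost:9600/_node/stats/pipelines"


def fetch_event_counts(url=STATS_URL, pipeline="main"):
    """Return the cumulative event counters of one pipeline."""
    with urlopen(url, timeout=5) as response:
        stats = json.load(response)
    return stats["pipelines"][pipeline]["events"]


def events_per_second(before, after, seconds):
    """Compute in/out rates from two cumulative counter snapshots."""
    return {
        key: (after[key] - before[key]) / seconds
        for key in ("in", "out")
    }


def sample_rates(interval=10):
    """Take two snapshots `interval` seconds apart and return the rates."""
    before = fetch_event_counts()
    time.sleep(interval)
    after = fetch_event_counts()
    return events_per_second(before, after, interval)
```

Calling sample_rates() on the Logstash host returns the events per second going in and out of the pipeline. If the "in" rate is low, the inputs are probably not delivering enough data to keep several workers busy.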

(Mohamed) #5

I have 3 VMs, each running one instance of Logstash, Elasticsearch and Kibana.
The first machine: 8 vCPUs, 48 GB memory and 200 GB HDD (I send data to the Logstash running here, which sends it on to the third machine; the Elasticsearch node here is a master)
The second machine: 4 vCPUs, 48 GB memory and 200 GB HDD (another instance of ES runs here as a master)
The third machine: 4 vCPUs, 48 GB memory and 200 GB HDD (another instance of ES runs here with node.master: false and node.data: true)

My Filebeat ships data from 3 servers and sends it to Logstash (VM1).

I used this config with another Filebeat that shipped data from only one server and it worked fine, but now that I have 3 servers I am losing a lot of data in ES.

As you can see, there are only 3 indices, because my Logstash output names each index after the server.

How can I check that Logstash is receiving enough data? In this case I am sending 3 times more data than in my first test.

(Christian Dahlqvist) #6

Having 2 master-eligible nodes in a cluster is bad practice as you should always aim to have at least 3. I would therefore recommend you make all 3 nodes master eligible and also make sure you set minimum_master_nodes to 2.
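
The arithmetic behind this recommendation is that minimum_master_nodes should be a strict majority of the master-eligible nodes. A small sketch of the calculation (the function names are only illustrative):

```python
def minimum_master_nodes(master_eligible):
    """Quorum size: a strict majority of the master-eligible nodes."""
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible):
    """How many master-eligible nodes can be lost while keeping a quorum."""
    return master_eligible - minimum_master_nodes(master_eligible)


for nodes in (2, 3):
    print(nodes, minimum_master_nodes(nodes), tolerated_failures(nodes))
```

With 2 master-eligible nodes the quorum is 2, so losing either node leaves the cluster without a master. With 3 the quorum is still 2, and one node can fail.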

It would be great if you also could answer my other questions, e.g. about disk I/O and seen throughput.

(Mohamed) #7

Is this what you mean?

(Christian Dahlqvist) #8

Is this during indexing? Is this on the node being indexed into?

(Mohamed) #9

I restarted ELK to check the value; I am getting an iowait between 0% and 2% at most.

But I am getting something weird (screenshot: Capture5)

(Christian Dahlqvist) #10

It may be that your VMs are overcommitted with respect to CPU and/or memory, so the VMs do not actually have access to the resources you have assigned. That could certainly explain why you do not get higher utilisation, and is something you should check.
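
One possible way to check for CPU overcommitment from inside a Linux guest is to look at steal time, i.e. time the hypervisor gave to other guests while this VM wanted to run. The sketch below assumes the usual /proc/stat layout (user, nice, system, idle, iowait, irq, softirq, steal, ...); the function names are hypothetical.

```python
def parse_cpu_line(stat_text):
    """Return the aggregate CPU counters from the text of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def steal_percent(before_text, after_text):
    """Share of CPU time stolen by the hypervisor between two samples."""
    before = parse_cpu_line(before_text)
    after = parse_cpu_line(after_text)
    deltas = [a - b for a, b in zip(after, before)]
    # The first eight counters cover all CPU time; guest time is
    # already included in the user counter, so it is not added again.
    total = sum(deltas[:8])
    steal = deltas[7]
    return 100.0 * steal / total if total else 0.0


# Usage on the VM: read /proc/stat twice, a few seconds apart, e.g.
#   first = open("/proc/stat").read()
#   ... wait while Logstash is under load ...
#   second = open("/proc/stat").read()
#   print(steal_percent(first, second))
```

A steal percentage that stays well above zero under load would suggest the host is overcommitted. A value near zero points elsewhere.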

(Mohamed) #11

Problem resolved: I am now using another instance running Ubuntu instead of Debian. This instance runs on OpenStack, so it was not a hardware problem. I don't know why the instance with the Debian image couldn't use more than one CPU while the Ubuntu instance runs very well.

(Mohamed) #12

Hello, just for info, I found the cause of the problem. I was using an old kernel version; after I upgraded it, Logstash was able to use more than one CPU. However, systemd-journal was still using 90-100% of one CPU even after the upgrade. There was another problem in my journalctl.conf; I edited it and now my ELK stack is working very well.