Decreased logstash/elasticsearch performance with load-balanced logstash nodes

I've been running an ELK stack on a single node for the past year to ingest our application server logs. The load on this bare-metal box is consistently quite high. I was recently given a smaller bare-metal server to host an additional Logstash instance, in an attempt to reduce the load on the single node.
I set up Logstash with the exact same config on the new server and configured filebeat on the application servers to send logs only to the new box (which runs just Logstash). After restarting filebeat to use the new instance, I noticed that Elasticsearch was only ingesting about 10% of the messages it had before; I have no idea where the other 90% went. I thought that perhaps the new box was too slow and the messages were being queued, so I then configured filebeat on all the application servers to load-balance across both Logstash instances, roughly as shown below. The result was the same.
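For reference, the load-balanced output section of filebeat.yml looked roughly like this (the hostnames here are placeholders):

    output.logstash:
      hosts: ["logstash-01:5044", "logstash-02:5044"]
      # Without loadbalance: true, filebeat picks one host and only
      # switches on failure instead of spreading events across both.
      loadbalance: true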

Can someone point me in the right direction to accurately diagnose and fix this?

Thanks!

What's the system load situation on the server? Have you looked into using the Logstash monitoring API to find the bottleneck?
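For example:

    # Logstash 5.x; on 6.x the path is /_node/stats/pipelines
    curl -s 'http://localhost:9600/_node/stats/pipeline?pretty'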

The single node it's running on right now is a 14-core system (1.7 GHz, 128 GB of memory), and the load at peak usually hovers around 40 or so when that system runs the only Logstash instance. However, when I enable Logstash on the 'slower' secondary server, the load on the 14-core system goes down to 8 or 9. I haven't used the Logstash API yet (still on 5.6.5), but when I track system network performance I see a steady stream of logs coming into the 14-core system when it is the only Logstash instance running. When the secondary is enabled, I instead see traffic spikes: it looks like a 'spike' of data arrives every ten minutes and then backs off.

Is there a way to monitor/graph the logstash api without x-pack?

Well, you should be able to dump the monitoring API response into ES, e.g. using Logstash itself.
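A rough sketch of such a pipeline (the 30-second interval, index name, and local Elasticsearch address are just placeholders; the endpoint is the pre-6.x singular one):

    input {
        http_poller {
            urls => {
                stats => "http://localhost:9600/_node/stats/pipeline"
            }
            schedule => { every => "30s" }
            codec => json
        }
    }
    output {
        elasticsearch {
            hosts => ["localhost:9200"]
            index => "logstash-stats-%{+YYYY.MM.dd}"
        }
    }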

Good idea. I set up a poller for http://localhost:9600/_node/stats/pipeline. However, when I look at it in Kibana, the metrics I'm interested in (input, filter, output) are not created as fields because they're nested. I tried using a 'mutate/rename' filter to flatten those fields, but no luck. I just want to graph the values of my Logstash filters over the long term with Kibana, so is there an easier way to do this without x-pack, using the polling method above?

Also, how often should I poll the Logstash API so as not to receive duplicate results?

Bump

The problem is that everything is an array. Kibana will let you visualize something like Average pipelines.<name>.plugins.inputs.events.out, but you really want to be able to look at individual filters.

That could be done with something like this:

    # Drop the internal monitoring pipeline so only our own pipelines get flattened.
    mutate { remove_field => [ "[pipelines][.monitoring-logstash]" ] }
    ruby {
        code => '
            # For each pipeline, copy every plugin stats hash to a top-level
            # field named after the plugin id, so Kibana sees plain fields
            # instead of arrays of nested objects.
            event.get("[pipelines]").each { |k1, v1|
                p = v1["plugins"]
                p["filters"].each { |v2|
                    id = v2["id"]
                    event.set("filter-#{id}", v2)
                }
                p["inputs"].each { |v2|
                    id = v2["id"]
                    event.set("input-#{id}", v2)
                }
                p["outputs"].each { |v2|
                    id = v2["id"]
                    event.set("output-#{id}", v2)
                }
            }
        '
    }

But then I end up with nearly 4000 fields in the index, and handling fields like Average filter-0088b8fd258ba718ebf4af6005e661f8883f37cdc133f003965ed07c0f7dc8f2.events.duration_in_millis is a bit awkward.

The approach might work if you only needed to do it for a small number of filters, especially if you set explicit ids on all the filters in your config so that you get names like Average filter-ihs-geoip.events.duration_in_millis; see the example below.
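For example, something like this would produce the filter-ihs-geoip name above (the geoip filter and its clientip source field are just for illustration):

    filter {
        geoip {
            id => "ihs-geoip"
            source => "clientip"
        }
    }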

Great...Thank you. I will try this code shortly.

I tried your code but I keep seeing this error in my logs:

    Ruby exception occurred: undefined method `each' for nil:NilClass

I already modified [pipelines] to [pipeline] to match the output from my Logstash, but I get the same error.

I'm not too worried about the number of fields, since I 'id' all my filters as I write them, so it should be fairly easy to navigate in Kibana.

Then your http_poller probably does not look like mine:

    http_poller {
        urls => {
            stats => {
                url => "http://localhost:9600/_node/stats/pipelines"
                headers => {
                    Accept => "application/json"
                }
            }
        }
        schedule => { cron => "*/30 * * * * *" }
        codec => json
    }

I get an event that looks like this, and the event.get is pulling out that pipelines field.

    {
            "name" => "host.name",
        "@version" => "1",
       "pipelines" => {
                   "somepipeline" => {
             "events" => {
                                           "in" => 0,
                                          "out" => 0,
                           "duration_in_millis" => 0,
                "queue_push_duration_in_millis" => 0,
                                     "filtered" => 0
            },
            "reloads" => {
                              "failures" => 0,
                            "last_error" => nil,
                "last_success_timestamp" => nil,
                             "successes" => 0,
                "last_failure_timestamp" => nil
            },
            "plugins" => {
                 "inputs" => [
                    [0] {
                          "name" => "kafka",
                        "events" => {
                            "queue_push_duration_in_millis" => 0,
                                                      "out" => 0
                        },
                            "id" => "e5b787cfca9bb0be67f41a99d5474794d5cf83ad58240d87322118d2f94e822b"
                    }
                ],

What version of Logstash are you using? I'm on 5.6.5. I get an invalid URL with the one you specified, but when I change it to pipeline I get the same "Ruby exception occurred: undefined method `each' for nil:NilClass" error, so I suspect you're on a different version and the API changed slightly between releases.

Yeah, I am on 6.3.0
