Logstash consuming messages too slowly

Hi all,
I recently set up an ELK stack as follows:
Kibana <-- Elasticsearch <-- Logstash <-- Redis <-- Filebeat
Filebeat is producing log events faster than Logstash can consume them, so events pile up in Redis until Redis uses all the available memory. Can anyone suggest how to speed up Logstash? I started with 4 threads and increased in multiples of 4 up to the current 20, but I am not sure that continuing to add threads will bear any fruit.

input {
    redis {
        host => "some host"
        port => "port"
        type => "redis-input"
        data_type => "list"
        key => "filebeat"
    }
}

filter {
    grok {
        match => [ "message", "^%{TIME:timestamp}\s+%{LOGLEVEL:level}%{GREEDYDATA}\u0001somesource=%{GREEDYDATA:inst}\u0001someText%{GREEDYDATA}" ]
        match => [ "message", "^%{TIME:timestamp}\s+%{LOGLEVEL:level}" ]
        break_on_match => true
    }
}

output {
    elasticsearch {
        hosts => ["some host:port"]
        manage_template => false
        index => "someindex-%{+YYYY.MM.dd}"
        document_type => "%{[@metadata][type]}"
    }
    #stdout { codec => rubydebug }
}
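
One thing that may be worth checking in the filter above: the first pattern contains several %{GREEDYDATA} captures, which can make grok backtrack heavily on lines that do not match. The following is only a sketch, not a verified fix. It assumes the same two patterns are kept and a grok filter version that accepts a list of patterns in the hash form of match. The tag name grok_no_match is made up for this example; it just makes unmatched lines easy to count.

```
filter {
    grok {
        # Patterns are tried in order; break_on_match (default true) stops at the first hit.
        match => {
            "message" => [
                "^%{TIME:timestamp}\s+%{LOGLEVEL:level}%{GREEDYDATA}\u0001somesource=%{GREEDYDATA:inst}\u0001someText%{GREEDYDATA}",
                "^%{TIME:timestamp}\s+%{LOGLEVEL:level}"
            ]
        }
        # Hypothetical tag name, used instead of the default _grokparsefailure.
        tag_on_failure => ["grok_no_match"]
    }
}
```

If many events end up carrying the failure tag, the time spent on failed matches could be part of the slowdown.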

I also have another issue on the Filebeat side. I am getting this error:

ERR Fail to publish event to REDIS: write tcp sourceHost:sourcePort->redisHost:redisPort: i/o timeout

This one is intermittent, though. I have checked the network side and everything looks good. Log events are being sent from Filebeat through to ES without any issues, but every now and then I see this error in the Filebeat logs.

My situation is the same: Logstash, as a central log processor, can't digest the logs from Redis fast enough.
I have just ONE Filebeat sending logs to Redis, and Filebeat is so efficient that Logstash can't keep up with it.

Looking forward to some tips.

I am doing a POC with an Ingest Node to replace Logstash and Redis altogether. Maybe you can try it too, see if it works for you, and if it does, share the results as well.

Could you explain what POC is, and is there a link that describes the setup?

POC stands for proof of concept. I am trying to get ES itself to extract information from the log messages instead of having Logstash do it. So far no success: ingest nodes require ES 5.x, which requires Filebeat 5.x, and we cannot move to Filebeat 5.x because it does not have support for symlinks. So that's where I am for now. I hope someone here can help me make progress.

I'm confused: how could you replace Logstash (or another agent) with ES?
As far as I know, the role of ES is quite different from that of Logstash.
Do you mean this:
log -> es instead of log -> logstash -> es ?

Maybe ES 5.0 supports that collection function?

No, I meant Filebeat --> ES --> Kibana. Here ES would do both parts: consuming the events and processing them to extract information.

So you cut out Redis, and Filebeat sends logs directly to ES.
Have you lost any logs? I ask because I found that Filebeat is very efficient at collecting logs, so I have that concern.

Well, that's what the POC is all about: finding out whether log loss will happen or not. But as I said, I was not able to make any progress on it. It would be worthwhile if you gave it a shot and shared the results in this forum.

OK, then I'll give it a try.

Is there no way to fix this natively? I'm having the same issue. My Logstash is too slow reading messages from Redis, so my graphs sometimes show gaps of several minutes to hours.

This is what my input looks like:

input {
    redis {
        host => "192.168.250.130"
        port => "6379"
        data_type => "list"
        key => "varnish"
        threads => 16
    }
}

I'm also 100% sure this is Logstash's fault: I've just turned off the forwarders that write 1 million entries per hour to my Redis, and now all my graphs look perfectly fine.

Edit: Apparently I may have fixed it with this output configuration when writing to redis:

redis {
    host => "{{ ip }}"
    port => "6379"
    data_type => "list"
    key => "varnish"
    batch => true
    batch_events => 500
    batch_timeout => 60
    workers => 8
}

From what I've been reading, the sweet spot is apparently 500 batch_events in the output and lots of threads on the input.
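
The consuming side can be batched as well. A minimal sketch, assuming a version of the redis input plugin that has the batch_count option and a Redis server recent enough to support it. The value 500 simply mirrors the batch_events figure above and is not a measured optimum.

```
input {
    redis {
        host => "192.168.250.130"
        port => 6379
        data_type => "list"
        key => "varnish"
        # Number of parallel consumers reading from the list.
        threads => 16
        # Events fetched from Redis per request (illustrative value, assumption).
        batch_count => 500
    }
}
```

If the filters rather than the input are the bottleneck, raising the number of workers with the -w command line flag may matter more than adding input threads.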

Hi mate, I found a nice workaround for this: skip Redis altogether. Redis is a pain in the rear and quite heavy as well. Instead, I set up Logstash clusters, so Filebeat sends its messages to a Logstash cluster. It was quite easy to find out how many Logstash nodes were needed: increase the number of nodes gradually until you see no more gaps. I also run ES as a single-node standalone process, but I split the data across multiple indexes. The indexes are divided logically, based on factors specific to my requirements. So the idea is to divide your data across indexes and then configure Logstash to write to specific indexes only. I also did not have to use lots of threads; 4 threads per Logstash node worked smoothly for me. A periodic restart of the whole ELK stack keeps things running smoothly. Do give it a try and see if it helps.

The thing is, I use Logstash itself as my shipper on some of my servers. I might try setting up a TCP or UDP output on the shippers and configuring my main Logstash to listen for those as input.
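
A minimal sketch of the TCP variant of that idea, using the tcp output and tcp input plugins with the json_lines codec. The host name central-logstash.example and port 5000 are placeholders, not values from this thread. The two blocks belong in two different configuration files.

```
# On each shipper: send events to the central Logstash (placeholder host and port).
output {
    tcp {
        host => "central-logstash.example"
        port => 5000
        codec => json_lines
    }
}

# On the central Logstash: listen for the shippers on the same port.
input {
    tcp {
        port => 5000
        codec => json_lines
    }
}
```

With no Redis in between there is no buffer, so a slow central Logstash would push back on the shippers instead of filling up a queue.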

Use Filebeat to ship logs from the source, and have Filebeat send those log messages to a Logstash cluster. Please note that each source should have its own Logstash cluster. The Logstash clusters should then send the messages to ES; also make sure to send them to different indexes.
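
A rough sketch of what one Logstash node in such a setup might look like, assuming the beats input plugin is installed and that Filebeat sets a type field on each event. The port 5044, the type value, and the index names are illustrative assumptions only.

```
input {
    beats {
        # Port that Filebeat is configured to send to (assumption).
        port => 5044
    }
}

output {
    # Route each kind of log to its own index.
    if [type] == "varnish" {
        elasticsearch {
            hosts => ["some host:port"]
            index => "varnish-%{+YYYY.MM.dd}"
        }
    } else {
        elasticsearch {
            hosts => ["some host:port"]
            index => "other-%{+YYYY.MM.dd}"
        }
    }
}
```

On the Filebeat side, the Logstash output can list several hosts, so the nodes of one cluster can share the load.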