Logstash: abysmal performance after upgrading from 2.2.0 to 2.3.1

Hi guys,

We are running Logstash 2.2.0 (against Elasticsearch 2.3.1) and it can index >1.2 million docs/minute.
If we upgrade to Logstash 2.3.1, it peaks at ~200k docs/minute: 1/6 of the performance.

There's nothing in the error log of either Elasticsearch or Logstash.

Any ideas as to what I could try to remedy this are very welcome. (Should I test with the 5.0 alpha?)

Config is this:
input {
  redis {
    host => "127.0.0.1"
    # Remember that type does NOT overwrite the type from the shipper!
    type => "redis-input"
    # these settings should match the output of the agent
    data_type => "list"
    key => "logstash"
    codec => json
    threads => 8
  }
  redis {
    host => "127.0.0.1"
    type => "netflow"
    data_type => "list"
    key => "pet1year"
    codec => json
    threads => 2
  }
  redis {
    host => "ytes02.example.org"
    # Remember that type does NOT overwrite the type from the shipper!
    type => "redis-input"
    # these settings should match the output of the agent
    data_type => "list"
    key => "logstash"
    codec => json
    threads => 8
  }
  redis {
    host => "ytes02.example.org"
    type => "netflow"
    data_type => "list"
    key => "pet1year"
    codec => json
    threads => 2
  }
}

filter {
  # choose index
  # 1.5.0-only feature
  if [type] == "cnrdhcp" {
    mutate { add_field => { "[index]" => "cnrdhcp-%{+YYYY.MM.dd}" } }
  } else if [type] == "netflow" {
    mutate { add_field => { "[index]" => "netflow-%{+YYYY.MM.dd}" } }
  } else if [type] == "akamai_access_logs" {
    mutate { add_field => { "[index]" => "cdn_access_logs-%{+YYYY.MM.dd}" } }
  } else if [type] == "cdn_access_logs" {
    mutate { add_field => { "[index]" => "cdn_access_logs-%{+YYYY.MM.dd}" } }
  } else if [type] == "cdn_content_logs" {
    mutate { add_field => { "[index]" => "cdn_content_logs-%{+YYYY.MM.dd}" } }
  } else if [type] == "mpf_arkiv" {
    mutate { add_field => { "[index]" => "mpf_arkiv-%{+YYYY.MM}" } }
  } else {
    mutate { add_field => { "[index]" => "logstash-%{+YYYY.MM.dd}" } }
  }
  # generate message_id if it's not present
  if [message_id] {
    mutate { add_tag => "hasmessage_id" }
  } else if [message] {
    ruby {
      init => "require 'digest/sha1'"
      code => "event['message_id'] = Digest::SHA1.base64digest(event['message'])"
    }
  } else {
    # really broken input: use timestamp as id for now - we should never land here
    mutate { add_field => { "[@metadata][id]" => "%{@timestamp}" } }
  }
}

output {
  elasticsearch {
    codec => plain {
      charset => 'UTF-8'
    }
    hosts => "127.0.0.1:9200"
    index => "%{[index]}"
    manage_template => false
    document_id => "%{[message_id]}"
  }
  statsd {
    host => "localhost"
    port => 8125
    sender => "ytes01"
    namespace => "servers"
    increment => "logstash.processing"
  }
}

In version 2.0.5 of the Redis input we enabled batching by default (size 125). I think the multiple threads are stomping on each other.

You can try one of:

  1. Removing the extra threads - this allows up to 125 events to be fetched in one go by a single thread per input. 125 was chosen because it works well with the pipeline batch size.
  2. Reverting to the previous behaviour by setting the batch count to 1: batch_count => 1 (see the sketch after this list).

I also suspect that with batch_count > 1 and threads > 1, one of the 8 input threads (threads => 8) may be crashing and getting restarted.

This can be seen quite clearly when running with --debug.
Look for:
A plugin had an unrecoverable error. Will restart this plugin. with Plugin: <LogStash::Inputs::Redis...

We tried tweaking the redis input settings with no change. We then tried configuring the output to not send to ES, and everything ran quickly - which led us to conclude the input wasn't the issue; it was the output.

We added workers => 8 to the elasticsearch output, and this brought throughput back up to the same level as 2.2.0. We also tried higher numbers (16 and 20) with no further improvement.
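For reference, this is roughly what that change looks like on the elasticsearch output from the original config (workers is the per-output worker setting in 2.x):

  elasticsearch {
    codec => plain {
      charset => 'UTF-8'
    }
    hosts => "127.0.0.1:9200"
    index => "%{[index]}"
    manage_template => false
    document_id => "%{[message_id]}"
    # run 8 output workers so indexing keeps up with the new pipeline
    workers => 8
  }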

Thank you for the input :slight_smile:

Please confirm your solution (for future readers of this discussion): e.g. that you now have no threads settings in the redis inputs.

You are advised to read the Upgrade to 2.2 blog post, in particular its New Pipeline and Outputs section.

At the time of writing, there remains an ongoing issue with the elasticsearch output when output workers > 1, because connections are not pooled.

Yes. No threads and no batch_count set on the redis inputs, and workers => 8 in the elasticsearch output.
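For future readers, a minimal sketch of the working combination described above (one redis input shown; the other three follow the same pattern):

  input {
    redis {
      host => "127.0.0.1"
      type => "redis-input"
      data_type => "list"
      key => "logstash"
      codec => json
      # no threads and no batch_count: rely on the 2.3.x defaults
    }
  }

  output {
    elasticsearch {
      hosts => "127.0.0.1:9200"
      index => "%{[index]}"
      manage_template => false
      document_id => "%{[message_id]}"
      workers => 8
    }
  }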