Logstash -> Kafka output preferred settings

Are there any preferred, non-default settings for using the kafka output in logstash? Any specific Kafka-side settings that should be checked as well?

I'm running into the issue of logstash seemingly not being able to keep up with the load when all it's doing is listening on TCP (using json_lines codec) and outputting to Kafka (snappy compression). After a few seconds to a few minutes of running smoothly, logstash "chokes" and stops processing anything. No errors, no logs, nothing. CPU drops down to almost 0.

I'm certain it's an issue with the Kafka output because if I output to File or to Elasticsearch everything works fine.

Any help would be appreciated.

Thanks,
Jim

What version of Logstash are you running? Generally the plugin uses acks = 1,
which means that one broker acknowledges each request. If it is blocking
on send, the broker may be slow to respond. Perhaps try running a Kafka
performance test using the Kafka tools to check for bottlenecks. How many
logs per second are you talking about, and how big are they?
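
For anyone following along, here is a rough sketch of the kind of pipeline described in the question (TCP input with the json_lines codec, Kafka output with snappy compression), with the acks setting and a few producer buffering options written out explicitly. The port, broker address, topic name, and all numeric values are placeholders I made up for illustration, not details from the original post. The option names are as I recall them from the 2.x kafka output plugin, so check them against the documentation for your plugin version before relying on them.

```
input {
  tcp {
    port  => 5000                # hypothetical port
    codec => json_lines          # one JSON document per line, as in the question
  }
}

output {
  kafka {
    bootstrap_servers => "kafka1:9092"   # hypothetical broker address
    topic_id          => "logs"          # hypothetical topic name
    compression_type  => "snappy"        # as described in the question
    acks              => "1"             # "0", "1", or "all"

    # Producer buffering knobs worth checking when sends appear to block.
    # The values below are illustrative only, not recommendations.
    batch_size    => 16384       # bytes batched per partition before a send
    linger_ms     => 5           # wait briefly so batches can fill
    buffer_memory => 33554432    # total bytes the producer may buffer
  }
}
```

If the producer's buffer fills faster than the brokers acknowledge, sends can block, which would be one way for the pipeline to stall during a burst. That is only a hypothesis to test, not a diagnosis.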

We are running the latest logstash 2.2.1. I've tried it with acks = 0, 1, and "all" and we've run into the same issue. It now appears that the issue is that it can't handle the initial load caused by the backup of messages. We have 2 logstash instances sitting behind a load balancer which seems to queue up the messages that fail to be processed by logstash once it "chokes". So when we restart logstash and communication opens back up, we get a huge influx of messages that the kafka output can't handle (but the File and Elasticsearch plugins can) so it chokes eventually and we're back at square one.

If I get rid of the backup and only send the real-time messages, the kafka output seems to handle it fine. It's just the initial catch-up that it can't handle. But in production I can't guarantee that we'll never have to restart logstash or get a build-up of messages like that.

Is there an easy solution to handle this?

Can you share your config and describe your pipeline in more detail,
especially what is feeding Logstash? Is the load balancer queuing the
messages as well?

I'm surprised that the process just fails to continue to deliver messages.
When you say the file and Elasticsearch plugins can handle the restart does
that mean you are just writing the messages to a file and removed the Kafka
output?

The CPU consumption dropping to zero implies that the pipeline is stalled.

Maybe try 2.1.X and see if it still fails. 2.2 was a pretty significant
release for the internal workings of Logstash.
malonej7 http://discuss.elastic.co/users/malonej7
February 17

Joe_Lawson http://discuss.elastic.co/users/joe_lawson Logstash Plugins
Community Maintainer
February 17
