Logstash Version: 2.4
Elasticsearch Version: 2.3.4
Redis version: 2.19
I have an indexing setup where Logstash reads from Redis, parses the data, and indexes into Elasticsearch (fairly standard). I monitor the list size of Redis as a primary health check.
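The health check itself is just the Redis list length. A minimal sketch of that check, assuming the list key is `logstash` (the same key used in the lrange command further down):

```shell
# One-off check of the queue depth (LLEN returns the list length).
redis-cli llen logstash

# Or poll it continuously, e.g. every 5 seconds, with a timestamp:
while true; do
  echo "$(date '+%H:%M:%S')  $(redis-cli llen logstash)"
  sleep 5
done
```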
Starting about 2 days ago, Logstash will seemingly randomly stop pulling data from Redis and inserting it into Elasticsearch.
In the past, I had an issue where Elasticsearch did not have enough memory to handle all of the data I was feeding it, which produced similar behavior from Logstash; in that case, though, throughput decreased, whereas now it drops to zero. I did check the data/memory ratio and the JVM heap behavior for all of my ES nodes, and they are all healthy sawtooths, garbage-collecting whenever they hit approximately 70%. In addition, since I have multiple Logstash indexers (on different servers) feeding Elasticsearch, they don't all exhibit this zero-throughput behavior at the same time. So Elasticsearch doesn't seem to be the problem this time.
Checking the perf stats of Logstash doesn't seem to reveal much (to me at least).
Top will show the offending Logstash processes between 0.0% - 0.3% cpu usage.
Running the command:
redis-cli lrange logstash 0 0
shows that the head message in Redis is not being removed, whereas against a healthy Logstash the same command returns a different item each time (because Logstash is constantly consuming messages).
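To make that check mechanical, a rough stall detector can compare the head element across an interval (hypothetical script; the key name and the 30-second window are assumptions to adjust for your setup):

```shell
#!/bin/sh
# If the list is empty, everything has been consumed - not a stall.
if [ "$(redis-cli llen logstash)" -eq 0 ]; then
  echo "OK: list is empty"
  exit 0
fi

# Compare the head of the list across a 30-second window.
# If it hasn't changed, Logstash has probably stopped consuming.
head_before=$(redis-cli lrange logstash 0 0)
sleep 30
head_after=$(redis-cli lrange logstash 0 0)

if [ "$head_before" = "$head_after" ]; then
  echo "STALLED: head of 'logstash' list unchanged for 30s"
else
  echo "OK: Logstash is consuming"
fi
```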
Memory: Logstash is set to its default heap of 1G, and heap usage for the offending Logstash instances (while at 0% CPU) is not much different from the ones working normally. I usually see 600-700m in use with no real growth, so memory doesn't appear to be the constraint here.
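The heap figures come from the JVM itself; a sketch of the check, assuming a JDK is installed on the indexer host (`<logstash-pid>` is a placeholder for the actual process ID):

```shell
# Print heap/GC utilization every 5 seconds (jstat ships with the JDK).
# E/O columns = eden and old-gen occupancy %; YGC/FGC = GC counts.
jstat -gcutil <logstash-pid> 5000

# Resolve the PID first if needed:
pgrep -f logstash
```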
Logs: In default logging mode, there are no ERROR logs being generated when this happens. When I turned on Debug, I couldn't see anything other than the flood of messages that are generated when Logstash is working properly.
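Since nothing shows up in the logs, the only other evidence I can think to capture is a thread dump while the stall is happening, to see whether the pipeline threads are blocked (sketch, assuming a JDK; `<logstash-pid>` is a placeholder):

```shell
# Capture a thread dump from the stalled Logstash JVM; look for the
# Redis input thread and check whether it is RUNNABLE or blocked.
jstack <logstash-pid> > /tmp/logstash-threads.txt

# Without jstack, SIGQUIT makes the JVM print the dump to its stdout/log:
kill -QUIT <logstash-pid>
```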
Prior to this, Logstash worked nearly perfectly, and no changes have been made to the indexer in the past 2 days.
Any help would be great. I'm not sure where to look anymore. The only workaround I have is to restart Logstash when it exhibits this behavior, and about 70% of the time the restarted Logstash will begin reading messages again, only to stop sometime in the near future. (This morning we restarted a Logstash, and 2 hours later, it stopped reading.)