Logstash crashes after some time and freezes the system


(imad) #1

I am using Logstash to feed Elasticsearch indices with data from my CouchDB. I have about 20 indices. At this time I have a very simple setup: all my data lives in separate CouchDB databases, e.g., users has its own db, roles has its own db, and so on, and I have an index for each db.
For some reason, Logstash freezes my system after throwing some errors, once it has run for 10 minutes or so. The errors are related to plugin failures for a few indices. I was not able to copy the errors because of the freeze. Please advise how I can resolve the freeze; I'll post the errors as soon as I can copy them.

Many thanks.


(imad) #2

Here is the error:
A plugin had an unrecoverable error. Will restart this plugin.
Plugin: <LogStash::Inputs::CouchDBChanges db=>"audits", host=>"127.0.0.1", port=>5984, sequence_path=>"seq_files\audits_couchdb_seq", tags=>["audits"], debug=>false, codec=><LogStash::Codecs::Plain charset=>"UTF-8">, secure=>false, password=>, heartbeat=>1000, keep_revision=>false, ignore_attachments=>true, always_reconnect=>true, reconnect_delay=>10>
Error: Unable to establish loopback connection {:level=>:error}
A plugin had an unrecoverable error. Will restart this plugin.
Plugin: <LogStash::Inputs::CouchDBChanges db=>"links", host=>"127.0.0.1", port=>5984, sequence_path=>"seq_files\links_couchdb_seq", tags=>["links"], debug=>false, codec=><LogStash::Codecs::Plain charset=>"UTF-8">, secure=>false, password=>, heartbeat=>1000, keep_revision=>false, ignore_attachments=>true, always_reconnect=>true, reconnect_delay=>10>
Error: Unable to establish loopback connection {:level=>:error}
A plugin had an unrecoverable error. Will restart this plugin.
Plugin: <LogStash::Inputs::CouchDBChanges db=>"cameras", host=>"127.0.0.1", port=>5984, sequence_path=>"seq_files\cameras_couchdb_seq", tags=>["cameras"], debug=>false, codec=><LogStash::Codecs::Plain charset=>"UTF-8">, secure=>false, password=>, heartbeat=>1000, keep_revision=>false, ignore_attachments=>true, always_reconnect=>true, reconnect_delay=>10>
Error: Unable to establish loopback connection {:level=>:error}


(Jay Greenberg) #3

Thanks @imad,

Are you able to post your logstash configuration?


(imad) #4

Here is what my config file looks like (note: several output stanzas originally had a second, duplicate `host` line; those duplicates are removed below):

input {
  # 1 attributes
  couchdb_changes {
    db => "attributes"
    host => "127.0.0.1"
    port => 5984
    sequence_path => "seq_files\attributes_couchdb_seq"
    tags => ["attributes"]
  }
  # 2 audits
  couchdb_changes {
    db => "audits"
    host => "127.0.0.1"
    port => 5984
    sequence_path => "seq_files\audits_couchdb_seq"
    tags => ["audits"]
  }
  couchdb_changes {
    db => "clips"
    host => "127.0.0.1"
    port => 5984
    sequence_path => "seq_files\clips_couchdb_seq"
    tags => ["clips"]
  }
  couchdb_changes {
    db => "events"
    host => "127.0.0.1"
    port => 5984
    sequence_path => "seq_files\events_couchdb_seq"
    tags => ["events"]
  }
  couchdb_changes {
    db => "folder"
    host => "127.0.0.1"
    port => 5984
    sequence_path => "seq_files\folder_couchdb_seq"
    tags => ["folder"]
  }
  couchdb_changes {
    db => "icons"
    host => "127.0.0.1"
    port => 5984
    sequence_path => "seq_files\icons_couchdb_seq"
    tags => ["icons"]
  }
  couchdb_changes {
    db => "menus"
    host => "127.0.0.1"
    port => 5984
    sequence_path => "seq_files\menus_couchdb_seq"
    tags => ["menus"]
  }
  couchdb_changes {
    db => "users"
    host => "127.0.0.1"
    port => 5984
    sequence_path => "seq_files\users_couchdb_seq"
    tags => ["users"]
  }
  couchdb_changes {
    db => "roles"
    host => "127.0.0.1"
    port => 5984
    sequence_path => "seq_files\roles_couchdb_seq"
    tags => ["roles"]
  }
  couchdb_changes {
    db => "cameras"
    host => "127.0.0.1"
    port => 5984
    sequence_path => "seq_files\cameras_couchdb_seq"
    tags => ["cameras"]
  }
  couchdb_changes {
    db => "devices"
    host => "127.0.0.1"
    port => 5984
    sequence_path => "seq_files\devices_couchdb_seq"
    tags => ["devices"]
  }
}
output {
  if "attributes" in [tags] {
    elasticsearch {
      document_id => "%{[@metadata][_id]}"
      host => "127.0.0.1"
      index => "attributes_index"
      protocol => "http"
      port => 9200
    }
  }
  if "audits" in [tags] {
    elasticsearch {
      document_id => "%{[@metadata][_id]}"
      host => "127.0.0.1"
      index => "audits_index"
      protocol => "http"
      port => 9200
    }
  }
  if "clips" in [tags] {
    elasticsearch {
      document_id => "%{[@metadata][_id]}"
      host => "127.0.0.1"
      index => "clips_index"
      protocol => "http"
      port => 9200
    }
  }
  if "events" in [tags] {
    elasticsearch {
      document_id => "%{[@metadata][_id]}"
      host => "127.0.0.1"
      index => "events_index"
      protocol => "http"
      port => 9200
    }
  }
  if "folder" in [tags] {
    elasticsearch {
      document_id => "%{[@metadata][_id]}"
      host => "127.0.0.1"
      index => "folder_index"
      protocol => "http"
      port => 9200
    }
  }
  if "icons" in [tags] {
    elasticsearch {
      document_id => "%{[@metadata][_id]}"
      host => "127.0.0.1"
      index => "icons_index"
      protocol => "http"
      port => 9200
    }
  }
  if "menus" in [tags] {
    elasticsearch {
      document_id => "%{[@metadata][_id]}"
      host => "127.0.0.1"
      index => "menus_index"
      protocol => "http"
      port => 9200
    }
  }
  if "users" in [tags] {
    elasticsearch {
      document_id => "%{[@metadata][_id]}"
      host => "127.0.0.1"
      index => "users_index"
      protocol => "http"
      port => 9200
    }
  }
  if "roles" in [tags] {
    elasticsearch {
      document_id => "%{[@metadata][_id]}"
      host => "127.0.0.1"
      index => "roles_index"
      protocol => "http"
      port => 9200
    }
  }
  if "cameras" in [tags] {
    elasticsearch {
      document_id => "%{[@metadata][_id]}"
      host => "127.0.0.1"
      index => "cameras_index"
      protocol => "http"
      port => 9200
    }
  }
  if "devices" in [tags] {
    elasticsearch {
      document_id => "%{[@metadata][_id]}"
      host => "127.0.0.1"
      index => "devices_index"
      protocol => "http"
      port => 9200
    }
  }
}
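Incidentally, since the eleven input/output pairs differ only by database name, the whole file could be generated from a single list, which avoids copy-paste slips. This is just a sketch, assuming a POSIX shell; the `gen_*` function names are made up here, and `sequence_path` uses forward slashes while the config above uses Windows-style backslashes:

```shell
#!/bin/sh
# Emit one couchdb_changes input stanza for a given database name.
gen_input() {
  db="$1"
  printf '  couchdb_changes {\n'
  printf '    db => "%s"\n' "$db"
  printf '    host => "127.0.0.1"\n'
  printf '    port => 5984\n'
  printf '    sequence_path => "seq_files/%s_couchdb_seq"\n' "$db"
  printf '    tags => ["%s"]\n' "$db"
  printf '  }\n'
}

# Emit the matching elasticsearch output stanza, wrapped in a tag check.
gen_output() {
  db="$1"
  printf '  if "%s" in [tags] {\n' "$db"
  printf '    elasticsearch {\n'
  printf '      document_id => "%%{[@metadata][_id]}"\n'
  printf '      host => "127.0.0.1"\n'
  printf '      index => "%s_index"\n' "$db"
  printf '      protocol => "http"\n'
  printf '      port => 9200\n'
  printf '    }\n'
  printf '  }\n'
}

# Assemble the full config from the database list used in this thread.
gen_config() {
  dbs="attributes audits clips events folder icons menus users roles cameras devices"
  printf 'input {\n'
  for db in $dbs; do gen_input "$db"; done
  printf '}\noutput {\n'
  for db in $dbs; do gen_output "$db"; done
  printf '}\n'
}
```

Running `gen_config > logstash.conf` then rewrites the whole file from one list, so adding a database means adding one word rather than two hand-edited stanzas.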


(Jay Greenberg) #5

@imad,

My initial feeling is simply that Logstash cannot connect to the CouchDB server at 127.0.0.1. Is it possible that the CouchDB server stops responding or refuses new connections at some point?


(imad) #6

No, I don't think so, since we are querying couchdb all the time.


(Jay Greenberg) #7

If things crash consistently after only 10 minutes, then we should be able to track down the reason.

  1. Are you running iptables or any firewall that would prevent localhost connections in any way?

  2. Please run the following periodically (assuming you are running in a POSIX environment) from start until the crash. Let's see whether the number of connections remains constant.

    netstat -an | grep 127.0.0.1:5984 | wc -l

  3. Also, you can run logstash with --debug, in order to capture any other information that might lead to the solution.
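The periodic check in step 2 could be scripted roughly like this (a sketch assuming a POSIX shell; the function names, 30-second interval, and log destination are arbitrary choices):

```shell
#!/bin/sh
# Count lines on stdin that mention the CouchDB loopback endpoint.
# Kept as a separate function reading stdin so it can be tested alone.
count_couch_conns() {
  grep -c '127\.0\.0\.1[:.]5984'
}

# Sample the connection count every 30 seconds with a timestamp, so
# the numbers can be correlated with the Logstash log after a crash.
monitor() {
  while :; do
    printf '%s %s\n' "$(date)" "$(netstat -an | count_couch_conns)"
    sleep 30
  done
}
```

Something like `monitor >> conn_counts.log &` keeps a timestamped record to inspect after the freeze.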


(imad) #8

Phaedrus, thanks for your reply.

Actually the crash is not consistently happening after 10 minutes; it's arbitrary. I ran the netstat -an command regularly and saw the same number of connections up until just a few minutes before the crash.
I can share the log file that I captured by running Logstash with the --debug flag, but I didn't see any error in it that explains the crash. Is there a way to upload it or send it to you?


(Jay Greenberg) #9

You can upload the logs using gist, pastebin, or even dropbox, depending on the size.


(imad) #10

Here is the gist link for the log file: https://gist.github.com/imadulhaque/4bfdba5d99cdfdb11b6d. Let me know if it doesn't work.

Thanks.


(Jay Greenberg) #11

When the system freezes, can you describe what happens? Are you able to SSH into the host? Must you reboot the system manually? There must be some indication from an operational perspective as to what is causing the lockup. For example, is there memory or CPU contention at that time?


(imad) #12

I don't see any CPU contention, and enough memory appears to be available when the system freezes. Once it freezes I can't do anything until I manually press the restart button.


(imad) #13

Any help on this? Just an FYI: I am running Elasticsearch and Logstash on the same system that hosts my CouchDB.


(imad) #14

@PhaedrusTheGreek could it be that when a document is updated in CouchDB, Logstash for some reason fails to update that document in its corresponding Elasticsearch index?

I have uploaded a crash dump at: https://gist.github.com/imadulhaque/6d405fb57042182e6166.
The last message that I see is message=>"failed action with response of 400, dropping action: ["index", {:_id=>"51f4314a442fbf205072f60cdb0309d1", :_index=>"users_index", :_type=>"logs", :_routing=>nil},.............
Could this be the reason for the Logstash freeze and, eventually, the system freeze?
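In case it helps with reproducing this, the document id and target index can be pulled out of such a "dropping action" log line with a small sed sketch (the `extract_*` function names are made up; the field patterns match the log line quoted above):

```shell
#!/bin/sh
# Extract the :_id and :_index values from a Logstash "dropping action"
# log line read on stdin, so the raw CouchDB document can be fetched
# and compared with what Elasticsearch rejected.
extract_id()    { sed -n 's/.*:_id=>"\([^"]*\)".*/\1/p'; }
extract_index() { sed -n 's/.*:_index=>"\([^"]*\)".*/\1/p'; }
```

The extracted id could then be fed to something like `curl http://127.0.0.1:5984/users/<id>` to inspect the offending document.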
Any help would be greatly appreciated.

Thanks.


(Jay Greenberg) #15

@imad

Running CouchDB, Logstash, and Elasticsearch all on the same system may be related to the problem, but I think that fact confuses the issue. I would recommend splitting the three functions out across three different machines. By doing this, we may gain more insight into where the failure is occurring.

Also, I am interested to know the JVM version(s), the OS, and the Logstash and Elasticsearch versions.

Thanks


(imad) #16

In our first release we will be running everything on the same machine, with enough memory for all of them.
So do you mean that even Elasticsearch + Logstash on the same system is not recommended?

java version "1.8.0_51"
Java(TM) SE Runtime Environment (build 1.8.0_51-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.51-b03, mixed mode)

OS is windows 7 64 bit.
Logstash is 1.5.4
Elasticsearch is 1.6.2


(Jay Greenberg) #17

I see some discrepancies between your environment and the supported platforms documentation. Please see here:

https://www.elastic.co/support/matrix

It's always better to run separate components on dedicated systems, so that they don't contend for resources with each other. Further, it will help us isolate the problem.


(imad) #18

I updated my JDK and JRE to the latest version. Then I downloaded the latest Elasticsearch, unzipped it, and tried to install it as a service, but the install fails, even though Elasticsearch itself runs without any error. Any ideas what's up with the 'service install' step? This is the error that I get:

Installing service : "elasticsearch-service-x64"
Using JAVA_HOME (64-bit): "C:\Program Files\Java\jdk1.8.0_60"
Failed installing 'elasticsearch-service-x64' service


(imad) #19

Just an FYI: I tried the latest versions of the JRE and JDK with the latest Elasticsearch 1.7.2 and Logstash 1.5.4 and still got the same error with the system freeze. I'll run Elasticsearch and Logstash on a different system to see if it makes any difference.

Here are the errors; note they are for a different CouchDB database and Elasticsearch index.
{:timestamp=>"2015-09-14T13:23:40.330000-0400", :message=>"A plugin had an unrecoverable error. Will restart this plugin.
Plugin: <LogStash::Inputs::CouchDBChanges db=>"preferences", host=>"127.0.0.1", port=>5984, sequence_path=>"seq_files\\preferences_couchdb_seq", tags=>["preferences"], debug=>false, codec=><LogStash::Codecs::Plain charset=>"UTF-8">, secure=>false, password=>, heartbeat=>1000, keep_revision=>false, ignore_attachments=>true, always_reconnect=>true, reconnect_delay=>10>
Error: Unable to establish loopback connection
Exception: IOError
Stack: org/jruby/RubyIO.java:3682:in `select'
E:/Softwares/Web Related/logstash-1.5.4/logstash-1.5.4/vendor/jruby/lib/ruby/1.9/net/protocol.rb:143:in `rbuf_fill'
E:/Softwares/Web Related/logstash-1.5.4/logstash-1.5.4/vendor/jruby/lib/ruby/1.9/net/protocol.rb:141:in `rbuf_fill'
E:/Softwares/Web Related/logstash-1.5.4/logstash-1.5.4/vendor/jruby/lib/ruby/1.9/net/protocol.rb:122:in `readuntil'
E:/Softwares/Web Related/logstash-1.5.4/logstash-1.5.4/vendor/jruby/lib/ruby/1.9/net/protocol.rb:132:in `readline'
E:/Softwares/Web Related/logstash-1.5.4/logstash-1.5.4/vendor/jruby/lib/ruby/1.9/net/http.rb:2779:in `read_chunked'
E:/Softwares/Web Related/logstash-1.5.4/logstash-1.5.4/vendor/jruby/lib/ruby/1.9/net/http.rb:2759:in `read_body_0'
E:/Softwares/Web Related/logstash-1.5.4/logstash-1.5.4/vendor/jruby/lib/ruby/1.9/net/http.rb:2719:in `read_body'
E:/Softwares/Web Related/logstash-1.5.4/logstash-1.5.4/vendor/bundle/jruby/1.9/gems/logstash-input-couchdb_changes-1.0.0/lib/logstash/inputs/couchdb_changes.rb:147:in `run'
E:/Softwares/Web Related/logstash-1.5.4/logstash-1.5.4/vendor/jruby/lib/ruby/1.9/net/http.rb:1331:in `transport_request'
E:/Softwares/Web Related/logstash-1.5.4/logstash-1.5.4/vendor/jruby/lib/ruby/1.9/net/http.rb:2680:in `reading_body'
E:/Softwares/Web Related/logstash-1.5.4/logstash-1.5.4/vendor/jruby/lib/ruby/1.9/net/http.rb:1330:in `transport_request'
org/jruby/RubyKernel.java:1274:in `catch'
E:/Softwares/Web Related/logstash-1.5.4/logstash-1.5.4/vendor/jruby/lib/ruby/1.9/net/http.rb:1325:in `transport_request'
E:/Softwares/Web Related/logstash-1.5.4/logstash-1.5.4/vendor/jruby/lib/ruby/1.9/net/http.rb:1302:in `request'
E:/Softwares/Web Related/logstash-1.5.4/logstash-1.5.4/vendor/bundle/jruby/1.9/gems/logstash-input-couchdb_changes-1.0.0/lib/logstash/inputs/couchdb_changes.rb:145:in `run'
E:/Softwares/Web Related/logstash-1.5.4/logstash-1.5.4/vendor/jruby/lib/ruby/1.9/net/http.rb:746:in `start'
E:/Softwares/Web Related/logstash-1.5.4/logstash-1.5.4/vendor/jruby/lib/ruby/1.9/net/http.rb:557:in `start'
E:/Softwares/Web Related/logstash-1.5.4/logstash-1.5.4/vendor/bundle/jruby/1.9/gems/logstash-input-couchdb_changes-1.0.0/lib/logstash/inputs/couchdb_changes.rb:141:in `run'
E:/Softwares/Web Related/logstash-1.5.4/logstash-1.5.4/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.4-java/lib/logstash/pipeline.rb:177:in `inputworker'
E:/Softwares/Web Related/logstash-1.5.4/logstash-1.5.4/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.4-java/lib/logstash/pipeline.rb:171:in `start_input'", :level=>:error, :file=>"/Softwares/Web Related/logstash-1.5.4/logstash-1.5.4/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.4-java/lib/logstash/pipeline.rb", :line=>"182", :method=>"inputworker"}


(imad) #20

@PhaedrusTheGreek thanks for all your help. It looks like there was something wrong on my system that prevented Elasticsearch from running as a service. I have installed the latest JRE and JDK on one of my other systems and am using the latest Elasticsearch and Logstash on it. The whole setup seems to work fine so far, whereas earlier it would freeze in about 20 to 30 minutes. The new system also has more memory, so the freezes were most probably due to a lack of memory.