Logstash losing connection to Elasticsearch


(Kelv1n) #1

Hi Guys

I'm tried to setup a ELK cluster using the latest 2.0 releases, after a period of about 15-20 minutes it seems Logstash can no longer connect to Elasticsearch and the Logstash logs get flooded with the messages below. Initially I thought the issue was with the cluster, so I started with a fresh single server install on Ubuntu 14.04 and the problem quickly started reoccurring -

{:timestamp=>"2015-10-31T23:20:37.468000+0000", :message=>"Attempted to send a bulk request to Elasticsearch configured at '[\"http://localhost:9200/\"]', but an error occurred and it failed! Are you sure you can reach elasticsearch from this machine using the configuration provided?", :client_config=>{:hosts=>["http://localhost:9200/"], :ssl=>nil, :transport_options=>{:socket_timeout=>0, :request_timeout=>0, :proxy=>nil, :ssl=>{}}, :transport_class=>Elasticsearch::Transport::Transport::HTTP::Manticore, :logger=>nil, :tracer=>nil, :reload_connections=>false, :retry_on_failure=>false, :reload_on_failure=>false, :randomize_hosts=>false}, :error_message=>"Cannot find Serializer for class: org.jruby.RubyObject", :error_class=>"JrJackson::ParseError", :backtrace=>["com/jrjackson/JrJacksonBase.java:83:in `generate'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/jrjackson-0.3.6/lib/jrjackson/jrjackson.rb:59:in `dump'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/multi_json-1.11.2/lib/multi_json/adapters/jr_jackson.rb:20:in `dump'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/multi_json-1.11.2/lib/multi_json/adapter.rb:25:in `dump'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/multi_json-1.11.2/lib/multi_json.rb:136:in `dump'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-api-1.0.14/lib/elasticsearch/api/utils.rb:102:in `__bulkify'",  <SNIP>
{:timestamp=>"2015-10-31T23:20:37.469000+0000", :message=>"Failed to flush outgoing items", :outgoing_count=>1, :exception=>"JrJackson::ParseError", :backtrace=>["com/jrjackson/JrJacksonBase.java:83:in `generate'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/jrjackson-0.3.6/lib/jrjackson/jrjackson.rb:59:in `dump'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/multi_json-1.11.2/lib/multi_json/adapters/jr_jackson.rb:20:in `dump'", <SNIP>

There are no errors in the Elasticsearch logs, and Kibana is showing a status of Green. The only visual indicator of a problem is the logs stop appearing in Kibana.

My setup is quite basic, Logstash is receiving Syslog on ports 1514 and 1515. I've done it like this as I want to create a self-contained config file for each device (with input, filter and output). I'm only testing this with 2 config files at the moment which are (There is probably some better way of doing it, but I'm still learning)

Fortigate.conf -

    input { 
        syslog {
            type => "syslog"
            port => 1514
            add_field => ["parser", "fortigate"]
        }
    }
    filter {
    
        if [parser] == "fortigate" {
    
            grok { match => [ "message", "<(?<ruleID>.*)>(?<msg>.*)" ] }
            kv { source => "msg" }
    
            geoip {
                source => "srcip"
                target => "src_geoip"
                database => "/etc/logstash/GeoLiteCity.dat"
            }
    
            geoip {
                source => "dstip"
                target => "dst_geoip"
                database => "/etc/logstash/GeoLiteCity.dat"
            }
        }
    }
output {
    if [parser] == "fortigate" {
        elasticsearch {
            index => [ "logstash-%{+YYYY.MM.DD}" ]
        }
      }
}

Fortimail.conf

input {
    syslog {
        type => "syslog"
        port => 1515
        add_field => ["parser", "fortimail"]
    }
}

filter {

    if [parser] == "fortimail" {

        grok { match => [ "message", "<(?<ruleID>.*)>(?<msg>.*)" ] }
        kv { source => "msg" }

        grok { match => [ "msg", "client_name=\".*\[(?<srcip>.*?)\].*?\"" ] }

        mutate { rename => { "dst_ip" => "dstip" } }

        geoip {
            source => "srcip"
            target => "src_geoip"
            database => "/etc/logstash/GeoLiteCity.dat"
        }

        geoip {
            source => "dstip"
            target => "dst_geoip"
            database => "/etc/logstash/GeoLiteCity.dat"
        }
    }
}

output {
    if [parser] == "fortimail" {
        elasticsearch {
            index => [ "logstash-%{+YYYY.MM.DD}" ]
       }

      }
}

Could anybody please help shed some light on this? I've spent the best part of 3 days on this and its driving crazy :frowning:


(Mark Walkom) #2

What's in your ES logs?


(Kelv1n) #3

Hi Mark

Nothing that indicates an error, just what I expect to see on startup, the whole log file is -

[2015-10-31 23:01:33,066][INFO ][node                     ] [Optoman] version[2.0.0], pid[1223], build[de54438/2015-10-22T08:09:48Z]
[2015-10-31 23:01:33,066][INFO ][node                     ] [Optoman] initializing ...
[2015-10-31 23:01:33,134][INFO ][plugins                  ] [Optoman] loaded [], sites []
[2015-10-31 23:01:33,509][INFO ][env                      ] [Optoman] using [1] data paths, mounts [[/var (/dev/sdb1)]], net usable_space [186gb], net total_space [196.7gb], spins? [possibly], types [ext4]
[2015-10-31 23:01:36,147][INFO ][node                     ] [Optoman] initialized
[2015-10-31 23:01:36,147][INFO ][node                     ] [Optoman] starting ...
[2015-10-31 23:01:36,464][INFO ][transport                ] [Optoman] publish_address {127.0.0.1:9300}, bound_addresses {127.0.0.1:9300}, {[::1]:9300}
[2015-10-31 23:01:36,474][INFO ][discovery                ] [Optoman] elasticsearch/ZXw-YR1rR62k5UuWAFezRw
[2015-10-31 23:01:39,515][INFO ][cluster.service          ] [Optoman] new_master {Optoman}{ZXw-YR1rR62k5UuWAFezRw}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2015-10-31 23:01:39,613][INFO ][http                     ] [Optoman] publish_address {127.0.0.1:9200}, bound_addresses {127.0.0.1:9200}, {[::1]:9200}
[2015-10-31 23:01:39,614][INFO ][node                     ] [Optoman] started
[2015-10-31 23:01:39,674][INFO ][gateway                  ] [Optoman] recovered [2] indices into cluster_state
[2015-10-31 23:03:45,707][INFO ][cluster.metadata         ] [Optoman] [logstash-2015.10.304] update_mapping [utm]
[2015-10-31 23:03:59,944][INFO ][cluster.metadata         ] [Optoman] [logstash-2015.10.304] update_mapping [utm]
[2015-10-31 23:05:35,078][INFO ][cluster.metadata         ] [Optoman] [logstash-2015.10.304] update_mapping [event]

Whilst the Logstash log is now a couple of GB's, with just those errors.


(Mark Walkom) #4

It's saying there is a parse error somewhere, so I'd expect to see something in the ES logs to match it. So that's odd.

Can you try adding output to a file and then match an event with an error time to see if there is anything?


(Kelv1n) #5

Yep, I'll give that a try.


(Kelv1n) #6

Thanks again - I found a matching entry. By the looks of it the problem the problem was caused by the

geoip {
    source => "dstip"
    target => "dst_geoip"
    database => "/etc/logstash/GeoLiteCity.dat"
}

If the Fortimail.conf. Can only guess this is periodically getting a value (or non-value) that it can't parse.


(Reuven G) #7

I seem to be having an identical problem. Did you happen to resolve your issues?


(Mark Walkom) #8

Please start your own thread :slight_smile:


(system) #9