Split brain case?

Hi there! I had a 2-node cluster working well... But when I added a third node to prevent split brain, Logstash stopped working...

I have 3 nodes:

Belerian (8GB), Eglarest (4GB) and Gondolin (4GB). Each one is an individual VM on the same network.

Gondolin:

cluster.name: elasticsearch
node.name: "gondolin"
node.master: true
node.data: true
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["belerian.cabib.local"]
discovery.zen.minimum_master_nodes: 2

Eglarest:

cluster.name: elasticsearch
node.name: "eglarest"
node.master: true
node.data: false
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["belerian.cabib.local", "gondolin.cabib.local"]
discovery.zen.minimum_master_nodes: 2

Belerian:

cluster.name: elasticsearch
node.name: "belerian"
node.master: true
node.data: true
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["gondolin.cabib.local"]
discovery.zen.minimum_master_nodes: 2
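
For reference, the value 2 used for discovery.zen.minimum_master_nodes in the three configs above matches the usual quorum rule for zen discovery: (number of master-eligible nodes / 2) + 1, using integer division. A minimal Python sketch of that arithmetic (the function name is made up for illustration, it is not part of Elasticsearch):

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Quorum of master-eligible nodes: floor(n / 2) + 1.

    Hypothetical helper for illustration only; Elasticsearch does not
    compute this for you, you set the result in elasticsearch.yml.
    """
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


# All three nodes in this thread have node.master: true.
print(minimum_master_nodes(3))
```

Note that the rule also gives 2 for a two-node cluster, so a two-node setup cannot lose either node and still elect a master. That is the usual reason for adding a third master-eligible node.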

After some changes, Logstash stopped working, with no errors...

Is this a case of split brain? Is there a solution, or must I remove all the indices and start again?

All indices are in green state and Kibana works well...
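
One way to double-check the "green" observation is to ask each node for its own view of the cluster through the `_cluster/health` endpoint and compare the answers. If the nodes disagree about how many nodes are in the cluster, that would point toward a split. The sketch below is only an illustration: the function names and the default URL are assumptions, and it assumes the HTTP API is reachable on the default port 9200.

```python
import json
from urllib.request import urlopen


def fetch_cluster_health(base_url: str = "http://localhost:9200",
                         timeout: float = 5.0) -> dict:
    """Fetch /_cluster/health from one node (base_url is an assumption)."""
    with urlopen(f"{base_url}/_cluster/health", timeout=timeout) as resp:
        return json.load(resp)


def looks_healthy(health: dict, expected_nodes: int) -> bool:
    """True if status is green and at least expected_nodes have joined.

    The count may be higher than expected when client nodes (for example
    a Logstash instance using the node protocol) have joined the cluster.
    """
    return (health.get("status") == "green"
            and health.get("number_of_nodes", 0) >= expected_nodes)


# Example usage against each host from the thread (not run here):
# for host in ("belerian.cabib.local", "eglarest.cabib.local", "gondolin.cabib.local"):
#     print(host, looks_healthy(fetch_cluster_health(f"http://{host}:9200"), 3))
```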

Thanks so much!

/var/log/logstash/logstash.err

Oct 28, 2015 4:56:54 AM org.elasticsearch.node.internal.InternalNode <init>
INFO: [logstash-belerian-14539-11646] version[1.7.0], pid[14539], build[929b973/2015-07-16T14:31:07Z]
Oct 28, 2015 4:56:54 AM org.elasticsearch.node.internal.InternalNode <init>
INFO: [logstash-belerian-14539-11646] initializing ...
Oct 28, 2015 4:56:55 AM org.elasticsearch.plugins.PluginsService <init>
INFO: [logstash-belerian-14539-11646] loaded [], sites []
Oct 28, 2015 4:56:58 AM org.elasticsearch.bootstrap.Natives <clinit>
WARNING: JNA not found. native methods will be disabled.
Oct 28, 2015 4:56:58 AM org.elasticsearch.node.internal.InternalNode <init>
INFO: [logstash-belerian-14539-11646] initialized
Oct 28, 2015 4:56:58 AM org.elasticsearch.node.internal.InternalNode start
INFO: [logstash-belerian-14539-11646] starting ...
Oct 28, 2015 4:56:58 AM org.elasticsearch.transport.TransportService doStart
INFO: [logstash-belerian-14539-11646] bound_address {inet[/0:0:0:0:0:0:0:0:9301]}, publish_address {inet[/10.73.150.31:9301]}
Oct 28, 2015 4:56:58 AM org.elasticsearch.discovery.DiscoveryService doStart
INFO: [logstash-belerian-14539-11646] elasticsearch/CsRiVGT2TEKh3WkxH3jrWw
Oct 28, 2015 4:57:28 AM org.elasticsearch.discovery.DiscoveryService waitForInitialState
WARNING: [logstash-belerian-14539-11646] waited for 30s and no initial state was set by the discovery
Oct 28, 2015 4:57:28 AM org.elasticsearch.node.internal.InternalNode start
INFO: [logstash-belerian-14539-11646] started

/var/log/logstash.log

{:timestamp=>"2015-10-28T04:56:52.662000-0500", :message=>"You are using a deprecated config setting \"singles\" set in grok. Deprecated settings will continue to work, but are scheduled for removal from logstash in the future. This behavior is the default now, you don't need to set it. If you have any questions about this, please visit the #logstash channel on freenode irc.", :name=>"singles", :plugin=><LogStash::Filters::Grok match=>["message", "%{SYSLOGBASE} %{GREEDYDATA:_syslog_payload}"], singles=>"true", add_tag=>["postfix", "email", "mail01", "smtp"]>, :level=>:warn}
{:timestamp=>"2015-10-28T04:56:52.671000-0500", :message=>"You are using a deprecated config setting \"singles\" set in grok. Deprecated settings will continue to work, but are scheduled for removal from logstash in the future. This behavior is the default now, you don't need to set it. If you have any questions about this, please visit the #logstash channel on freenode irc.", :name=>"singles", :plugin=><LogStash::Filters::Grok singles=>"true", match=>["_syslog_payload", "%{BASE16NUM:queue_id}: %{GREEDYDATA:details}"]>, :level=>:warn}
{:timestamp=>"2015-10-28T04:57:00.730000-0500", :message=>"CircuitBreaker::rescuing exceptions", :name=>"Lumberjack input", :exception=>LogStash::SizedQueueTimeout::TimeoutError, :level=>:warn}
{:timestamp=>"2015-10-28T04:57:00.733000-0500", :message=>"Lumberjack input: The circuit breaker has detected a slowdown or stall in the pipeline, the input is closing the current connection and rejecting new connection until the pipeline recover.", :exception=>LogStash::CircuitBreaker::HalfOpenBreaker, :level=>:warn}
{:timestamp=>"2015-10-28T04:57:05.731000-0500", :message=>"CircuitBreaker::rescuing exceptions", :name=>"Lumberjack input", :exception=>LogStash::SizedQueueTimeout::TimeoutError, :level=>:warn}
{:timestamp=>"2015-10-28T04:57:05.733000-0500", :message=>"Lumberjack input: The circuit breaker has detected a slowdown or stall in the pipeline, the input is closing the current connection and rejecting new connection until the pipeline recover.", :exception=>LogStash::CircuitBreaker::HalfOpenBreaker, :level=>:warn}
{:timestamp=>"2015-10-28T04:57:10.733000-0500", :message=>"CircuitBreaker::rescuing exceptions", :name=>"Lumberjack input", :exception=>LogStash::SizedQueueTimeout::TimeoutError, :level=>:warn}
{:timestamp=>"2015-10-28T04:57:10.735000-0500", :message=>"Lumberjack input: The circuit breaker has detected a slowdown or stall in the pipeline, the input is closing the current connection and rejecting new connection until the pipeline recover.", :exception=>LogStash::CircuitBreaker::HalfOpenBreaker, :level=>:warn}
{:timestamp=>"2015-10-28T04:57:12.384000-0500", :message=>"CircuitBreaker::rescuing exceptions", :name=>"Lumberjack input", :exception=>LogStash::SizedQueueTimeout::TimeoutError, :level=>:warn}
{:timestamp=>"2015-10-28T04:57:12.387000-0500", :message=>"Lumberjack input: The circuit breaker has detected a slowdown or stall in the pipeline, the input is closing the current connection and rejecting new connection until the pipeline recover.", :exception=>LogStash::CircuitBreaker::HalfOpenBreaker, :level=>:warn}
{:timestamp=>"2015-10-28T04:57:15.821000-0500", :message=>"CircuitBreaker::rescuing exceptions", :name=>"Lumberjack input", :exception=>LogStash::SizedQueueTimeout::TimeoutError, :level=>:warn}
{:timestamp=>"2015-10-28T04:57:15.823000-0500", :message=>"Lumberjack input: The circuit breaker has detected a slowdown or stall in the pipeline, the input is closing the current connection and rejecting new connection until the pipeline recover.", :exception=>LogStash::CircuitBreaker::HalfOpenBreaker, :level=>:warn}
{:timestamp=>"2015-10-28T04:57:15.860000-0500", :message=>"Lumberjack input: the pipeline is blocked, temporary refusing new connection.", :level=>:warn}
{:timestamp=>"2015-10-28T04:57:15.911000-0500", :message=>"CircuitBreaker::Open", :name=>"Lumberjack input", :level=>:warn}
{:timestamp=>"2015-10-28T04:57:15.912000-0500", :message=>"Lumberjack input: The circuit breaker has detected a slowdown or stall in the pipeline, the input is closing the current connection and rejecting new connection until the pipeline recover.", :exception=>LogStash::CircuitBreaker::OpenBreaker, :level=>:warn}
{:timestamp=>"2015-10-28T04:57:16.361000-0500", :message=>"Lumberjack input: the pipeline is blocked, temporary refusing new connection.", :level=>:warn}
{:timestamp=>"2015-10-28T04:57:16.863000-0500", :message=>"Lumberjack input: the pipeline is blocked, temporary refusing new connection.", :level=>:warn}
{:timestamp=>"2015-10-28T04:57:17.364000-0500", :message=>"Lumberjack input: the pipeline is blocked, temporary refusing new connection.", :level=>:warn}

What does your elasticsearch output configuration look like? Since you've disabled multicast discovery on the ES side I hope you're pointing to one or more ES servers in the LS configuration.

Hi Magnus, this is my output configuration:

output {
  elasticsearch {
        host => localhost
        cluster => elasticsearch
  }
  stdout { codec => rubydebug }
}

I changed the config to:

output {
  elasticsearch {
        host => ["localhost", "gondolin.cabib.local"]
        cluster => elasticsearch
  }
  stdout { codec => rubydebug }
}

But I get the same behavior... The stdout log starts showing information and then freezes...

WARNING: [logstash-belerian-15326-11646] waited for 30s and no initial state was set by the discovery

I don't know why or how, but while reading this (https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html) it occurred to me that changing the protocol might help... So I set the protocol to http and now it works like a charm...

Here is my final config:

output {
  elasticsearch { 
        host => ["localhost", "gondolin.cabib.local"]
        cluster => elasticsearch
        protocol => http
  }
  stdout { codec => rubydebug }
}

Yes, switching to HTTP tends to solve most connectivity problems. That's why it's the default in Logstash 2.0.
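
When comparing the two protocols, it can help to confirm which ports are actually reachable from the Logstash host. The sketch below is a rough illustration, not a diagnostic tool: `can_connect` is a made-up helper, and the ports assume the Elasticsearch defaults (9200 for HTTP, 9300 for the transport protocol used by node clients).

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


# Example usage with the hosts from this thread (not run here):
# for host in ("localhost", "gondolin.cabib.local"):
#     print(host, "http:", can_connect(host, 9200),
#           "transport:", can_connect(host, 9300))
```

A successful TCP connection only shows that the port is open. It does not prove that discovery will succeed for a node client, which still needs to find and join the cluster.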


I'm using version 1.7.3. Do you recommend upgrading to 2.0, or isn't it stable yet?

1.7.3 is the ES version. Logstash 2.0 and Elasticsearch 2.0 were released today.