Hi Magnus,
I have an ELK stack deployed in production. It's just a single node.
Log files from various servers (web servers and application servers) are pumped to Elasticsearch through Logstash. Traffic to the website has suddenly doubled, so the log volume has doubled as well, and Logstash is struggling to keep up.
We are only seeing the webserver log being shipped to ES through Logstash. Many other log files are not getting shipped. Logstash is logging the following messages:
message=>"Lumberjack input: the pipeline is blocked, temporary refusing new connection."
CircuitBreaker::Open", :name=>"Lumberjack input", :level=>:warn}
{:timestamp=>"2016-02-15T16:21:06.342000+0000", :message=>"Exception in lumberjack input thread", :exception=>#<LogStash::CircuitBreaker::OpenBreaker: for Lumberjack input>, :level=>:error}
{:timestamp=>"2016-02-15T16:21:06.321000+0000", :message=>"CircuitBreaker::rescuing exceptions", :name=>"Lumberjack input", :exception=>LogStash::SizedQueueTimeout::TimeoutError, :level=>:warn}
I am using Elasticsearch 1.5.1
Logstash 1.5.2
Could you please advise what I should do to avoid this issue so that log data is shipped without interruption? Do I need another instance of Logstash? Do I need another instance of Elasticsearch, clustered with the first one?
Any guidance please.
Are there any log messages related to Elasticsearch in the Logstash log? Anything in the Elasticsearch logs? The problem could very well be that you've reached the capacity of your current one-node cluster and that you have to expand vertically (faster hardware) or horizontally (additional nodes).
So you mean I have to have another instance of Logstash and another instance of Elasticsearch to scale horizontally?
Or a two-node Elasticsearch cluster and a single Logstash?
Which one is the better option?
Following is my sizing.
The daily shipped log volume is around 12 GB.
I have an ELK instance with a 2-core CPU and 4 GB of memory for Elasticsearch.
Please advise on this.
There are no extra messages in logstash.log regarding ES. There are no exceptions in the ES log either.
But during this period the following messages keep repeating in logstash-forwarder.log:
Registrar: processing 1024 events
2016/02/15 20:52:59.247301 Read error looking for ack: read tcp 10.68.6.238:9300: i/o timeout
2016/02/15 20:52:59.247408 Setting trusted CA from file: /opt/JBoss/1dc.pki/logstash-forwarder.crt
2016/02/15 20:52:59.249424 Connecting to [10.68.6.238]:9300 (r4pvap1030.1dc.com)
2016/02/15 20:52:59.327281 Connected to 10.68.6.238
Registrar: processing 1024 events
Registrar: processing 1024 events
2016/02/15 20:52:59.247301 Read error looking for ack: read tcp 10.68.6.238:9300: i/o timeout
2016/02/15 20:52:59.247408 Setting trusted CA from file: /opt/JBoss/1dc.pki/logstash-forwarder.crt
2016/02/15 20:52:59.249424 Connecting to [10.68.6.238]:9300 (r4pvap1030.1dc.com)
2016/02/15 20:52:59.327281 Connected to 10.68.6.238
Please suggest.
So you mean I have to have another instance of Logstash and another instance of Elasticsearch to scale horizontally?
Or a two-node Elasticsearch cluster and a single Logstash?
Which one is the better option?
That depends on where the bottleneck is.
I have an ELK instance with a 2-core CPU and 4 GB of memory for Elasticsearch.
This is a pretty small machine for a 12 GB/day load. You should use e.g. Marvel to inspect how ES is doing. For example, I'd expect the heap pressure to be pretty high, which will lead to slowdowns caused by frequent and/or long garbage collections.
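If Marvel isn't available, a rough view of heap pressure can also be derived from the node stats API (GET /_nodes/stats/jvm). The following is only a sketch: it assumes the JSON response has already been fetched and parsed into a dict, the sample values are made up for illustration, and the 75% threshold is an arbitrary assumption rather than an official limit.

```python
def heap_pressure(node_stats, threshold_pct=75.0):
    """Return (node name, heap used %, over threshold?) for each node.

    node_stats is the parsed JSON body of GET /_nodes/stats/jvm.
    threshold_pct is an assumed warning level, not an official limit.
    """
    report = []
    for node in node_stats["nodes"].values():
        mem = node["jvm"]["mem"]
        used_pct = 100.0 * mem["heap_used_in_bytes"] / mem["heap_max_in_bytes"]
        report.append((node["name"], round(used_pct, 1), used_pct >= threshold_pct))
    return report


# Made-up sample shaped like a node stats response (values are illustrative).
sample = {
    "nodes": {
        "abc123": {
            "name": "node-1",
            "jvm": {
                "mem": {
                    "heap_used_in_bytes": 3400000000,
                    "heap_max_in_bytes": 4000000000,
                }
            },
        }
    }
}

for name, pct, high in heap_pressure(sample):
    print(name, pct, "HIGH" if high else "ok")
```

A heap that stays near the top between samples, rather than dropping back after each collection, would be consistent with the garbage collection slowdowns described above.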
Hi Magnus,
I need your input regarding cluster configuration.
I have an ES and Logstash instance running in prod.
The server configuration is single core with 16 GB RAM.
The log volume has increased, and hence I have added another server with a similar configuration.
I need your advice on the cluster configuration.
My view: have another instance of Logstash and Elasticsearch on the added server.
The new instance of Elasticsearch will be a data node only. There is no master node in the cluster.
Part of the log will be shipped to the new instance of Logstash and will be put in the new node.
Kibana will stay as it is, with no changes.
Firewall port 9200 needs to be opened between these servers for inter-node communication for ES.
Need your advice.
Please suggest.
There is no master node in the cluster.
All clusters have exactly one master node at any given time. Clusters with more than one node can have more than one master-eligible node, i.e. nodes that can become masters.
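Once a cluster has more than one master-eligible node, the usual guidance for ES 1.x is to set discovery.zen.minimum_master_nodes to a majority of the master-eligible nodes, so that two nodes that lose contact with each other cannot both elect themselves master. A small sketch of that arithmetic (the function name is only for illustration):

```python
def minimum_master_nodes(master_eligible_nodes):
    """Majority (quorum) of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible_nodes < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


for n in (1, 2, 3):
    print(n, "master-eligible node(s) -> quorum of", minimum_master_nodes(n))
```

Note the trade-off with two master-eligible nodes: the quorum is 2, so no master can be elected while either node is down. Making only one of the two nodes master-eligible, or adding a third master-eligible node, are the usual ways around that.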
Part of the log will be shipped to the new instance of Logstash and will be put in the new node.
Just because server A receives the indexing request doesn't mean that the data will be stored there. If the shard of the index where the document is to be stored resides on server B then that's where the document will go.
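To illustrate the idea, here is a simplified sketch of how a document is assigned to a shard. It is not Elasticsearch's actual implementation: CRC32 stands in for the hash function ES really uses, and the shard layout across server A and server B is hypothetical. The point is only that the target shard is derived from the document's routing value (by default its ID) and the number of primary shards, and the node that received the request plays no part.

```python
import zlib

# Hypothetical layout: which server holds each primary shard of the index.
SHARD_LOCATION = {
    0: "server A",
    1: "server B",
    2: "server A",
    3: "server B",
    4: "server A",
}


def pick_shard(doc_id, number_of_primary_shards):
    """Map a document ID to a shard number.

    CRC32 is a stand-in here; Elasticsearch uses its own hash function.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_primary_shards


def stored_on(doc_id, receiving_node):
    """Return the server that ends up storing the document.

    receiving_node is deliberately unused: it does not influence placement.
    """
    shard = pick_shard(doc_id, len(SHARD_LOCATION))
    return SHARD_LOCATION[shard]


# The same document lands on the same server whichever node receives it.
print(stored_on("log-event-42", "server A") == stored_on("log-event-42", "server B"))
```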
Firewall port 9200 needs to be opened between these servers for inter-node communication for ES.
ES normally uses port 9300 for inter-node communication. Port 9200 is for HTTP.
My input config is
input {
  lumberjack {
    port => 9300
    ssl_certificate => "/opt/JBoss/1dc.pki/logstash-forwarder.crt"
    ssl_key => "/opt/JBoss/1dc.pki/logstash-forwarder.key"
  }
}
and output is
elasticsearch {
  host => "localhost"
  port => "9200"
  action => "index"
  index => "fomonitor"
}
It works fine.
Do you think my output port should have been different?
Using HTTP for Logstash to ES communication is fine and is the default starting with Logstash 2.0.
Using port 9300 for Lumberjack isn't a very good choice, since it would collide with ES if you ran both on the same network interface.
So basically, in a cluster (scaled horizontally, with separate ES instances on separate machines), 9300 should not be used as the Lumberjack port because it might collide with inter-node ES communication.
Is my understanding correct?
Yes, that's right. Use different port numbers for different services.
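As a quick sanity check, the planned listeners can be written down and checked for clashes before touching the firewall. This is only a sketch: the service names, the 0.0.0.0 bind address, and the replacement port 5043 are assumptions for illustration, and the check compares exact (interface, port) pairs only, so it does not detect a wildcard address overlapping a specific one.

```python
from collections import defaultdict


def find_port_collisions(listeners):
    """Return {(interface, port): [services]} for endpoints claimed more than once.

    listeners is an iterable of (service, interface, port) tuples.
    """
    by_endpoint = defaultdict(list)
    for service, interface, port in listeners:
        by_endpoint[(interface, port)].append(service)
    return {ep: svcs for ep, svcs in by_endpoint.items() if len(svcs) > 1}


# Ports as described in this thread; the bind address is an assumption.
current = [
    ("Elasticsearch HTTP", "0.0.0.0", 9200),
    ("Elasticsearch transport", "0.0.0.0", 9300),
    ("Logstash lumberjack input", "0.0.0.0", 9300),
]

# Same setup with the lumberjack input moved to a hypothetical free port.
fixed = [
    ("Elasticsearch HTTP", "0.0.0.0", 9200),
    ("Elasticsearch transport", "0.0.0.0", 9300),
    ("Logstash lumberjack input", "0.0.0.0", 5043),
]

print(find_port_collisions(current))
print(find_port_collisions(fixed))
```

If the Lumberjack port is changed, the logstash-forwarder configuration on every shipping server has to be updated to match.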