Hi, I am using Logstash to parse my logs. This is my architecture: Filebeat ships logs from the server to Logstash, Logstash parses the logs and sends them to Elasticsearch, and the data in Elasticsearch can be viewed using Kibana. Now one of my clients said he can have about 500 MB of logs generated in a day. So in order to test whether my ELK stack can handle this amount of traffic I wrote a bash script, shown below:
while true; do
  dt=$(date '+%d/%m/%Y %H:%M:%S')
  echo "$dt local.ERROR: $dt" >> testing.log
done
The contents of this testing.log file are getting shipped to Logstash. But even after I stopped the script (it had been running for 12 hours or so), logs kept getting printed on the console. What does this mean? Furthermore, the timestamps were old, which suggests Logstash is queuing the logs somewhere. Does Logstash queue logs? If yes, what can I do to remove these logs from the queue?
Filebeat (and/or Logstash) probably can't keep up with your loop that does nothing except append data to a file. That's why Logstash keeps processing the messages even after you've shut down the shell script. The file itself effectively becomes Logstash's buffer/queue.
500 MB/day is about 6 kB/s if we naïvely assume a uniform distribution over the day. That's nothing and way less than what you produced with your shell script.
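As a quick sanity check of that figure (integer arithmetic, so the result is approximate):

```shell
# 500 MB/day expressed as bytes per second: 500 * 10^6 bytes / 86400 seconds
echo $((500 * 1000 * 1000 / 86400))  # prints 5787, i.e. roughly 6 kB/s
```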
@magnusbaeck So how do I stop Logstash from parsing the logs from that file?
Besides shutting down Logstash/Filebeat? Well, deleting the file or removing it from the configuration should do it.
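For example, if Filebeat is what's tailing the file, you could drop it from the prospector configuration. A sketch, assuming a Filebeat 1.x-style filebeat.yml (the paths below are hypothetical):

```yaml
filebeat:
  prospectors:
    -
      paths:
        # - /var/log/testing.log   # removed so Filebeat stops shipping it
        - /var/log/app/*.log       # keep shipping the real application logs
```

Note that Filebeat remembers its position in files it has already seen (in its registry file), so removing a path stops new reads but doesn't un-send what was already shipped.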
@magnusbaeck Shutting down Logstash and restarting it does not do the desired thing, but I believe deleting the file will surely do it.
@magnusbaeck One more thing: while monitoring the node with ELK installed on it and this bash script running, I saw about 100% CPU usage, and the RAM usage was increasing continuously. Up to what value will it go? Earlier I was pushing my logs from Logstash to Elasticsearch as well as printing them on the console, due to which RAM usage was increasing continuously; then I disabled printing the logs to the console, and now it is no longer continuously increasing. For the stats: earlier it went up to 2600/3764 and now it is 1993/3764. What is the worst case it can reach?
Shutting down Logstash and restarting it does not do the desired thing, but I believe deleting the file will surely do it.
Restarting won't help since they're designed to pick up from where they stopped.
What is the worst case it can reach?
Which process, Logstash or Elasticsearch?
@magnusbaeck The worst case for RAM under such a heavy load. Can it happen that the ELK stack goes down after too much RAM is used? Or that it becomes extremely slow?
Elasticsearch could definitely run out of heap space if you push too much data into it. There's no formula to decide how much heap space you need so it's something you have to figure out yourself and monitor.
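If you do need to adjust the heap, note that its size isn't set in elasticsearch.yml; with the 2.x-era packages it's typically the ES_HEAP_SIZE environment variable. A sketch with purely illustrative values:

```shell
# /etc/default/elasticsearch (Debian/Ubuntu package) — illustrative value only.
# Common guidance: at most half of physical RAM, and well under 32 GB.
ES_HEAP_SIZE=2g
```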
@magnusbaeck Yes we are thinking of monitoring Elasticsearch using Datadog.
1. Will this be a good idea? Can you suggest an alternative if it's not?
2. Further, is it possible that because of that bash script I might not get data from other servers? I am asking this because I could not see logs after 10:03:51 in Elasticsearch even though the data was produced on the server.
3. So in order to prevent this situation, should I add a node? Right now I have Logstash, Elasticsearch, and Kibana installed on the same machine. Will adding a node guarantee that I will not lose data from the other server(s)?
@magnusbaeck So by point number two you mean that I won't lose data, it will just get delayed, right? I will get the data after some time?
Local files act as a buffer for a limited time, yes.
@magnusbaeck So let's suppose I add a node (on a different machine) to my existing setup (Elasticsearch, Logstash, and Kibana on the same machine).
- Do I need to create the indices again using curl on the newly added node, or will Elasticsearch do it itself? From what I've studied, it will automatically do the right thing on its own.
- Further, let's say I need to add an index in the future. Do I need to add it on the same old machine on which I had created indices so far (using curl -XPUT 'http://localhost:9200/index_name'), or do I need to add the index on the newly added node (curl -XPUT 'http://newly_added_node_ip:9200/index_name')?
- Will Kibana automatically become aware of the newly added node and show me the results for indices present on the newly added node(s)?
- Yes, it'll take care of that for you.
- You can create indexes on any node.
Note that one should be careful when running two nodes as a split brain situation can occur. You should either have three nodes (and set discovery.zen.minimum_master_nodes to two) or just make one of the nodes master-eligible. There's a lot about running a cluster in the documentation; I recommend you read it.
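Concretely, for the three-node variant that setting goes in each node's elasticsearch.yml (the node count of three is this example's assumption):

```yaml
# elasticsearch.yml — with three master-eligible nodes, require a quorum of two
# so a partitioned minority can never elect its own master (split brain).
discovery.zen.minimum_master_nodes: 2
```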
And obviously, if Logstash is the bottleneck in your system it won't help to add an extra ES node.
@magnusbaeck Now I am getting the logs after a long delay, probably because of this bash script.
1. Do you also think it is because of the bash script? I ask because I haven't yet got the logs of 03/02/2016; Logstash is still parsing the logs of 02/02/2016.
2. So how can I scale Logstash now?
3. One option is using more than one Logstash instance. Is that a good idea? I don't think so, because the delay will still be there.
4. Will increasing the number of processors scale Logstash?
@magnusbaeck What can be the reason for the increasing memory consumption? Can it be because of:
- some kind of memory leak?
- a Logstash configuration issue?
- something else?
How can I find out?
@magnusbaeck I strongly suspect Logstash is the problem, because on the console I can see Logstash still parsing the logs of 02/02/2016, and today's date is 04/02/2016. So the problem is with Logstash for sure, don't you think? Also, can there be a bottleneck on the ES side? I mean, ES can show logs only when it gets them; if it is not getting the logs, how can it show them? So how can ES be the bottleneck?
I strongly suspect Logstash is the problem, because on the console I can see Logstash still parsing the logs of 02/02/2016, and today's date is 04/02/2016. So the problem is with Logstash for sure, don't you think?
I do think Logstash is the problem, but you're not interpreting the evidence correctly. Your pipeline looks like this:
file on disk => Logstash => Elasticsearch
Regardless of whether the bottleneck is Logstash or ES, the effect early in the pipeline will be the same. If ES is too slow it'll push back against Logstash, and since Logstash doesn't have a buffering mechanism to speak of, it'll scale back the log reading.
So can there be a bottleneck on the ES side? I mean, ES can show logs only when it gets them; if it is not getting the logs, how can it show them? So how can ES be the bottleneck?
Yes, of course ES can be a bottleneck.