I've been running an ELK stack on a single node for the past year to ingest our application server logs. The load on this bare-metal box is consistently high. I was recently given a smaller bare-metal server so I could run an additional Logstash instance and take some of the load off the single node.
I set up Logstash with the exact same config on the new server and pointed Filebeat on the application servers at the new box (which runs only Logstash). After restarting Filebeat to use this instance, I noticed that Elasticsearch was ingesting only about 10% of the messages it had before, and I have no idea where the other 90% went. Thinking the new box might be too slow and queuing the messages, I then configured Filebeat on all the application servers to load balance across both Logstash instances, but got the same result.
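For reference, the load-balanced output section in filebeat.yml looks roughly like this (a sketch; the hostnames are placeholders for my two Logstash boxes):

```
output.logstash:
  hosts: ["logstash-big:5044", "logstash-small:5044"]
  # Without loadbalance: true, Filebeat picks one host and sticks with it;
  # with it, batches are distributed across all listed hosts.
  loadbalance: true
```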
Can someone point me in the right direction to accurately diagnose and fix this?
The single node it's running on right now is a 14-core system (1.7 GHz, 128 GB of memory), and the load at peak usually hovers around 40 or so when that system is running the only Logstash instance. However, when I enable Logstash on the 'slower' secondary server, the load on the 14-core system drops to 8 or 9. I haven't used the Logstash API yet (still on 5.6.5), but when I track system network performance, I see a steady stream of logs coming into the 14-core system while it is the only Logstash instance running. With the secondary enabled, I instead see traffic spikes: it looks like a burst of data arrives every ten minutes and then backs off.
Is there a way to monitor/graph the Logstash API without X-Pack?
Good idea. I set up a poller for http://localhost:9600/_node/stats/pipeline. However, when I look at it in Kibana, the metrics I'm interested in (inputs, filters, outputs) are not created as fields because they're nested. I tried using a mutate/rename filter to flatten those fields, but no luck. I just want to graph the values of my Logstash filters over the long term in Kibana, so is there an easier way to do this without X-Pack while using the polling method above?
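The polling pipeline itself is roughly this (a sketch: the index name is just an example, and older http_poller versions use `interval` instead of `schedule`):

```
input {
  http_poller {
    # Poll Logstash's own monitoring API; on 5.6 the endpoint is
    # _node/stats/pipeline (singular).
    urls => {
      pipeline_stats => "http://localhost:9600/_node/stats/pipeline"
    }
    schedule => { "every" => "30s" }
    codec => "json"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash-stats-%{+YYYY.MM.dd}"  # example index name
  }
}
```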
The problem is that everything is an array. Kibana will let you visualize something like Average pipelines.<name>.plugins.inputs.events.out, but you really want to be able to look at individual filters.
But then I end up with nearly 4000 fields in the index, and handling fields like Average filter-0088b8fd258ba718ebf4af6005e661f8883f37cdc133f003965ed07c0f7dc8f2.events.duration_in_millis is a bit awkward.
The approach might work if you only needed to do it for a small number of filters, especially if you set all the filter ids in your config so that you get names like Average filter-ihs-geoip.events.duration_in_millis.
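For example (a sketch with a made-up geoip filter; the id is what shows up in the stats):

```
filter {
  geoip {
    # With an explicit id, the stats report "ihs-geoip"
    # instead of an auto-generated hash.
    id => "ihs-geoip"
    source => "clientip"   # field name is an example
  }
}
```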
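A less awkward route around the array problem might be a split filter in the polling pipeline, so each filter's stats become their own event rather than thousands of flattened fields (a sketch; the field paths assume the 5.6-style _node/stats/pipeline response, and the renamed field names are my own):

```
filter {
  # One event per filter plugin instead of a single event holding the array.
  split {
    field => "[pipeline][plugins][filters]"
  }
  mutate {
    # Hoist the values to graph into stable top-level fields.
    rename => {
      "[pipeline][plugins][filters][id]" => "filter_id"
      "[pipeline][plugins][filters][events][duration_in_millis]" => "filter_duration_ms"
      "[pipeline][plugins][filters][events][in]" => "filter_events_in"
      "[pipeline][plugins][filters][events][out]" => "filter_events_out"
    }
  }
}
```

A Kibana visualization of Average filter_duration_ms split by a Terms aggregation on filter_id then gives one series per filter without the field explosion.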
What version of Logstash are you using? I'm on 5.6.5. I get an invalid URL with the one you specified, but when I change it to pipeline (singular), I get "Ruby exception occurred: undefined method `each' for nil:NilClass", so I suspect you're on a different version and they changed the API ever so slightly.