I am getting the error below, and I understand from another thread that I am overloading Elasticsearch. I am looking for what to dial down to get rid of this problem while still getting the most performance.
I have 9 LS nodes.
My logstash info is:
Version: 5.0.2 (so is elasticsearch)
logstash.yml:
node.name: logstash2
path.data: /var/lib/logstash
pipeline.workers: 16
pipeline.output.workers: 16
pipeline.batch.size: 400
pipeline.batch.delay: 5
path.config: /etc/logstash/conf.d
http.host: "0.0.0.0"
http.port: 9600-9700
log.level: info
path.logs: /opt/logstash/logs
9 Logstash nodes? How many events per second for each? In aggregate? How big is your Elasticsearch cluster?
A 429 is not the end of the world, since Logstash will keep retrying data that got rejected. But a steady stream of them tends to mean that Elasticsearch can't keep up with the amount of data you're sending. Dialing down in such a case means dropping some data (just not sending it, filtering it out, etc), which may not be what you intend or want.
You may actually want to keep all of that data. If so, solutions include expanding your Elasticsearch cluster.
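If you did decide to dial down by filtering, the usual approach is a drop filter inside a conditional. This is only a sketch; the "noisy" tag is an assumption for illustration, not something from your config:

filter {
  if "noisy" in [tags] {
    drop { }   # silently discard events you have decided you can live without
  }
}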
Those 9 LS nodes are legacy, carried over from a Graylog setup that needed 9. In the Elasticsearch cluster I have 3 master nodes, 6 data-only nodes (32 GB memory, 16 GB for ES, and 8 CPUs each) and 3 coordinator nodes for heavy searches. I can expand the ES data nodes with 3 more nodes, which I can take from Logstash.
So the logstash config I currently have is not so weird then?
Not at all, if the need is there. I still don't know what kind of documents you're indexing, at what rate, how many different mappings (or how high a field count) you have, or any other indicators that would help me make a ballpark guess at how busy your cluster might be. If your Logstash nodes aren't overburdened, then more ES data nodes can be a good thing for your ingest performance.
Also, I would look at setting
hosts => [ "host1", "host2", "host3" ]
in your Elasticsearch output, where hosts 1, 2, and 3 are the coordinator nodes, so that the indexing requests are load balanced (round-robin).
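As a sketch of what that output block could look like (the hostnames and index pattern here are placeholders, not taken from your config):

output {
  elasticsearch {
    hosts => [ "coord1:9200", "coord2:9200", "coord3:9200" ]   # your three coordinator nodes
    index => "logstash-%{+YYYY.MM.dd}"                          # keep whatever index naming you use today
  }
}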
The math is not precise when doing this, but it appears that you are doing around 50,000 events per second. That's actually a pretty decent number. Are your ES nodes on spinning disks?
At any rate, spreading that load across a few more data nodes will help prevent 429s. It could also be that you get burst activity, which would also increase the likelihood of 429s. But if you take 3 nodes from Logstash away, will the others still be able to keep up? That becomes an issue. I don't know more of your filter pipeline, but we may be able to help you optimize it.
The complete cluster is on VMware with NetApp SAN storage under it (I don't know the details). If push comes to shove I can spin up more nodes without any issue... My complete filter is in the gist link I posted.
Talks about licenses and support are underway and will happen soon; until then I am relying on IRC, Discuss, and Google.
Those are interesting Redis configuration blocks. Have you seen https://www.elastic.co/blog/just_enough_redis_for_logstash ? The huge batch count and thread count may be limiting performance. The aforementioned blog post may reveal some tweaks you can make.
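As a rough illustration only (the host, key, and numbers below are assumptions for the sketch, not recommendations from that post), a more modest redis input might look like:

input {
  redis {
    host => "redis-host"     # placeholder
    data_type => "list"
    key => "logstash"        # placeholder
    batch_count => 125       # illustrative; far smaller than a "huge" batch count
    threads => 4             # illustrative; tune against your actual throughput
  }
}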
Awesome use of the new dissect filter, by the way. If I did have to point to something suspicious, it would be any use of GREEDYDATA in grok. That's a very expensive regex to use, and you have it in there a few times. You might want to see if there's another way to extract the data there.
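To give a sense of the trade-off (the field layout here is invented for illustration, not taken from your gist, and this only works where the delimiters are truly fixed):

filter {
  # Instead of an expensive grok with a trailing GREEDYDATA, e.g.:
  #   grok { match => { "message" => "%{WORD:level} %{GREEDYDATA:msg}" } }
  # a dissect mapping splits on literal delimiters and is much cheaper:
  dissect {
    mapping => { "message" => "%{level} %{msg}" }
  }
}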
You already get the [geoip][location] field by default, which is an array of [longitude, latitude]. If you need to have it be named coordinates you can always use a mutate filter to rename [geoip][location] to [geoip][coordinates]. That saves a few steps for you. You won't even have to convert it to a float.
...are effectively creating a redundant coordinates field. Chances are good that you have Kibana dashboards which are using coordinates, so you should work to preserve that standard, or slowly switch to using location. To convert/rename the existing location field to a coordinates field, you would use:
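A minimal sketch of that rename (the field names are the geoip defaults described above; your existing filter isn't reproduced here):

filter {
  mutate {
    rename => { "[geoip][location]" => "[geoip][coordinates]" }
  }
}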