2nd stage filtering

mortenb123 · November 29, 2018, 7:55am

Hi

I load logs into Elasticsearch through the _bulk endpoint and manage around 10K lines/sec. My regexp filtering is very modest: it extracts only @timestamp, operationid and log. If I add more elaborate filtering, the insert rate goes down (the server can deliver 80K lines/sec).
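For illustration, the light first-stage parse described above could look roughly like the sketch below. The line layout (timestamp, operation id, then free text), the index name `logs`, and the helper name `to_bulk_body` are assumptions made for the example, not details from the post. It only builds the NDJSON body for _bulk and does not send it.

```python
import json
import re

# Assumed (hypothetical) line layout: "<timestamp> <operation id> <free-text message>"
LINE_RE = re.compile(r"^(?P<ts>\S+)\s+(?P<opid>\S+)\s+(?P<log>.*)$")

def to_bulk_body(lines, index="logs", start_id=0):
    """Build an NDJSON _bulk body, using a running counter as the document _id."""
    out = []
    for n, line in enumerate(lines, start=start_id):
        m = LINE_RE.match(line.rstrip("\n"))
        if m is None:
            continue  # skip lines that do not fit the assumed layout
        # Action line, followed by the document source line.
        out.append(json.dumps({"index": {"_index": index, "_id": str(n)}}))
        out.append(json.dumps({
            "@timestamp": m.group("ts"),
            "operationid": m.group("opid"),
            "log": m.group("log"),
        }))
    return "\n".join(out) + "\n"  # _bulk expects a trailing newline
```

Keeping the raw text in `log` is what makes a later, heavier parse possible without re-reading the original files.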

Is there a way to refilter this index later on? I only have a running count as the key; creating UUIDs or a unique hash of log takes too long.

A perfect one-stage parse leaves me with 400 lines/sec.
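One way to picture the second stage being asked about is a later pass that reads the stored `log` field back, runs the heavier regexp, and sends partial updates through _bulk using the existing count as the document ID. The sketch below is only an assumption of how that might be arranged: the `status=<digits>` pattern, the `status` field, the index name and the `to_update_body` helper are all hypothetical, and fetching the documents (for example with a scroll search) is left out.

```python
import json
import re

# Hypothetical second-stage pattern: pull a "status=<digits>" token out of the raw log text.
STATUS_RE = re.compile(r"\bstatus=(?P<status>\d+)\b")

def to_update_body(hits, index="logs"):
    """Turn (doc_id, source) pairs into _bulk partial-update actions."""
    out = []
    for doc_id, source in hits:
        m = STATUS_RE.search(source.get("log", ""))
        if m is None:
            continue  # nothing new to add for this document
        # "update" action line, followed by the partial document to merge in.
        out.append(json.dumps({"update": {"_index": index, "_id": doc_id}}))
        out.append(json.dumps({"doc": {"status": int(m.group("status"))}}))
    return "\n".join(out) + "\n"  # _bulk expects a trailing newline
```

Because the updates are keyed on the ID already in the index, this pass can run at its own pace without slowing the initial insert.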

Christian_Dahlqvist · November 29, 2018, 7:57am

Are you using ingest node to parse the data? What is the size and specification of your cluster?

mortenb123 · November 30, 2018, 3:53pm

Hi

No ingest node, just prefiltering in Python. It is a single node in Docker with 64GB RAM and 1TB on 4 striped SSDs. I will try to split it up and do it in parallel, since insertion across the network easily handles >10K lines/sec, but I'm CPU and memory bound on the server. We tried rsyslog, doing the processing directly on the ES server, but rsyslog only manages around 3000 lines/sec. _bulk is much faster.

Some ideas of best practices would be nice.
Thanks

system · December 28, 2018, 4:06pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
