Logstash's Elasticsearch filter plugin bottleneck

I have a Logstash pipeline set up with multiple Elasticsearch filter plugins that look up field values on events ingested via an input plugin. The pipeline normally processes and writes around 500,000 events per hour into Elasticsearch. I recently added a few more Elasticsearch filter plugins and, to be fair, their queries are a little more complex than the previous ones, so I was expecting a slight hit in performance. However, the number of events written to Elasticsearch is currently less than 100,000 per hour.

I checked the CPU usage of the Elasticsearch nodes and they do not exceed 30%. I suspect that these new Elasticsearch filters are bottlenecking the flow. How can I check further to verify this? Perhaps by getting some stats that show the bottleneck?

To add on, what are some configurations I can look into to potentially increase the search performance? On Logstash's side I have configured persistent queueing, but due to the bottleneck I'm assuming events are getting dropped.

TIA

I suggest using the pipeline stats API to look at the time spent in each plugin (input, filter, and output). If most of the cost of your pipeline is in the elasticsearch filter and you add a second one then that could double the cost of the pipeline and halve the throughput.

It will be easier to interpret the output if you set the id option on each filter plugin. Otherwise the ids will be randomly generated.
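For example (a sketch, assuming the Logstash monitoring API is on its default of localhost:9600; the id and hosts below are placeholders):

# Per-plugin timings for every pipeline
curl -s 'http://localhost:9600/_node/stats/pipelines?pretty'

And setting an explicit id on a filter looks like:

filter {
  elasticsearch {
    id    => "es_port_lookup"             # shows up as-is in the stats output
    hosts => ["https://my-es-host:9200"]  # placeholder
    # ... query/query_template and other options as before
  }
}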


In the past I had an issue with Logstash being limited on writes to Elasticsearch; it turned out to be an open file limit and a couple of kernel parameters on my Linux host. I now write around 420,000 events per hour without issue.

One of them was in the startup service file; increase that:

LimitNOFILE=

Another kernel parameter is fs.file-max.

Look for the shared memory segment settings in sysctl.conf and increase them if you can.

Also check the /etc/security/limits.conf file and increase the open file limit for Logstash.
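To illustrate, the places mentioned above look roughly like this (paths are the usual defaults; the numbers are examples, not recommendations):

# systemd unit override for Logstash (e.g. via `systemctl edit logstash`)
[Service]
LimitNOFILE=65536

# /etc/sysctl.conf (apply with `sysctl -p`)
fs.file-max = 200000

# /etc/security/limits.conf -- raise the open file limit for the logstash user
logstash  soft  nofile  65536
logstash  hard  nofile  65536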

Thanks for the suggestion. I was able to get some stats for the filter plugin.

Below are the stats for an existing filter that did not encounter any slowdowns:

{
  "id": "es_filter_fp_port",
  "name": "elasticsearch",
  "events": {
    "out": 66702,
    "duration_in_millis": 97493,
    "in": 66702
  },
  "flow": {
    "worker_millis_per_event": {
      "current": 1.386,
      "last_1_minute": 1.432,
      "last_5_minutes": 1.392,
      "last_15_minutes": 1.4,
      "lifetime": 1.462
    },
    "worker_utilization": {
      "current": 0.07578,
      "last_1_minute": 0.1105,
      "last_5_minutes": 0.1369,
      "last_15_minutes": 0.1384,
      "lifetime": 0.1983
    }
  }
}

Below are the stats for the new filter:

{
  "id": "es_filter_flow_dest_geo",
  "name": "elasticsearch",
  "events": {
    "out": 173832,
    "duration_in_millis": 29832992,
    "in": 176230
  },
  "flow": {
    "worker_millis_per_event": {
      "current": 176.8,
      "last_1_minute": 181,
      "last_5_minutes": 172,
      "last_15_minutes": 171.5,
      "lifetime": 169.3
    },
    "worker_utilization": {
      "current": 71.63,
      "last_1_minute": 69.51,
      "last_5_minutes": 73.35,
      "last_15_minutes": 72.74,
      "lifetime": 60.69
    }
  }
}

As seen above, the millis-per-event values are much higher for the new filter, as is the worker utilization. Are there optimizations I can do on Logstash's side, perhaps increasing the number of workers? Or is this purely down to Elasticsearch's search performance?
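For reference, the worker-related settings live in pipelines.yml; a sketch with example values only (the pipeline id and numbers are placeholders):

# pipelines.yml -- example values, not recommendations
- pipeline.id: main            # placeholder pipeline id
  pipeline.workers: 8          # defaults to the number of CPU cores
  pipeline.batch.size: 250     # default is 125 events per worker batch
  queue.type: persisted        # persistent queue, as already configured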

Thanks for the suggestion. I will try this out, but it most likely won't fix the current issue, as I found the main culprit to be Elasticsearch's search performance.

Please do not post pictures of text; they are not searchable and are not readable for some people. Just post the text.

Er, you may also want to revisit that claim/assumption and double-check the filters. Back of the envelope from the stats you posted, the new filter spends over 100x more milliseconds per event (~172 ms vs ~1.4 ms).


Can you share your old and new queries?


Got it. I replaced the images with text



Old query:

{
  "size": 1,
  "sort": [ {"insert_time" : "desc"} ],
  "query": {
    "term": {
      "port.keyword": "%{[destination_port_raw]}"
    } 
  }
}

New query:

{
  "size": 1,
  "sort": [ {"geo_lower_decimal": "desc"} ],
  "query": {
    "range": {
      "geo_lower_decimal": {
        "lte": "%{[destinationAddress_dec]}"
      }
    }
  }
}

Those queries are significantly different.
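If you want to see where Elasticsearch spends the time on the new query, one option is to run it by hand with the search profile API (a sketch; the host, index name, and substituted decimal value are placeholders):

curl -s 'http://localhost:9200/my-geo-lookup-index/_search?pretty' \
  -H 'Content-Type: application/json' -d '
{
  "profile": true,
  "size": 1,
  "sort": [ { "geo_lower_decimal": "desc" } ],
  "query": {
    "range": {
      "geo_lower_decimal": { "lte": 167772160 }
    }
  }
}'

The profile section of the response breaks down how long each query component took on each shard.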