I have the numbers, but I'm not sure about the "@timestamp" field: it represents the "ingestion time", but ingestion into what? Logstash or Elasticsearch?
Indexing in Elasticsearch
Once Logstash is done, the data is shipped to Elasticsearch, but how do I know how much time that takes?
The easiest way to get those numbers is to use Stack Monitoring, as it can show you how much time each part of your Logstash pipeline is taking.
The value of the @timestamp field depends entirely on your pipeline. This field normally holds the event time and may be unrelated to processing time. For example, if you apply a date filter to a field in your document and do not specify a target for that filter, Logstash will store the parsed value in the @timestamp field.
If you are not using the date filter to change the value of the @timestamp field, then its value will be the time when the event exited the input stage of your pipeline.
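To illustrate the difference, here is a minimal sketch of the two date filter configurations (the source field name `log_time` and the ISO8601 pattern are assumptions for illustration):

```
filter {
  date {
    match => ["log_time", "ISO8601"]
    # No "target" set, so the parsed value overwrites @timestamp.
  }
}

# versus:

filter {
  date {
    match => ["log_time", "ISO8601"]
    target => "event_time"   # parsed value goes here; @timestamp keeps its pipeline-entry value
  }
}
```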
I'm using both: the ingest @timestamp and the one in the logs.
Stack Monitoring has enough details, you are right. There is also information directly on the index (the details tab); I did not see it at the beginning (everything was at 0), maybe because I restarted the pipeline, or those were old data...
Now I have another question, not exactly linked to the topic:
I noticed (thanks to Stack Monitoring) that the JVM size is around 16 GB (that's normal, 50% of my RAM), but when the pipeline is processing new data (from log files), the JVM heap never uses more than 9 GB, with an average around 6 GB.
Is it normal/advisable to keep it so low? Or should I configure it somewhere to allow it to use more power?
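For reference, a sketch of where the heap ceiling is set, in case you do want to change it (the 16g values mirror the numbers above; whether the relevant file is Elasticsearch's or Logstash's depends on which JVM Stack Monitoring was showing). Note that actual usage staying well below the configured maximum is normal JVM behavior, since the heap only fills up between garbage collections:

```
# jvm.options (Elasticsearch: config/jvm.options.d/*.options, Logstash: config/jvm.options)
# Min and max heap are conventionally set to the same value to avoid resize pauses.
-Xms16g
-Xmx16g
```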