Ingest Lag or Pipeline not working?

Hi guys,
I have a question about my pipelines.

I have 2 pipelines so that I can see (in Kibana) the most recent requests within a platform. Each request is a JSON document with a few fields, including:

  • req-id
  • req-timestamp

My pipelines:

  1. The first pipeline (pipeline_temp) populates the "temp" index, which receives all documents in real time;

  2. The second pipeline (pipeline_main) populates the "main" index. It is scheduled to run every 10 minutes and takes the "temp" index as input. For each document, it checks that there is no document with the same req-id and a greater req-timestamp, and then it empties the "temp" index.
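
The check in step 2 can be sketched in plain Python. This is only an illustration of the intended "keep the newest req-timestamp per req-id" rule, not the actual pipeline configuration: the function name `merge_latest`, the in-memory dict standing in for the "main" index, and the ISO 8601 timestamp strings (which sort correctly as text) are all assumptions.

```python
from typing import Dict, Iterable

Doc = Dict[str, str]


def merge_latest(main: Dict[str, Doc], temp_docs: Iterable[Doc]) -> Dict[str, Doc]:
    """Keep, per req-id, only the document with the greatest req-timestamp."""
    for doc in temp_docs:
        current = main.get(doc["req-id"])
        # Write the document only if nothing newer is already stored.
        if current is None or doc["req-timestamp"] > current["req-timestamp"]:
            main[doc["req-id"]] = doc
    return main


# Hypothetical data: the dict key plays the role of the document id.
main_index = {
    "12345": {"req-id": "12345", "req-timestamp": "2022-09-10T08:00:00Z"},
}
temp_batch = [
    {"req-id": "12345", "req-timestamp": "2022-09-11T09:30:00Z"},  # newer, replaces
    {"req-id": "67890", "req-timestamp": "2022-09-11T09:31:00Z"},  # new req-id
    {"req-id": "12345", "req-timestamp": "2022-09-09T07:00:00Z"},  # older, ignored
]
merge_latest(main_index, temp_batch)
```

Because the dict is keyed by req-id, a given req-id can never be stored twice in this sketch, whatever order the documents arrive in.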

In my "main" index I currently have about 12 million documents, and I see that it also contains req-ids with several different req-timestamps (not just the most recent one).

The loading of these documents seems to be random: the pipeline seems to work correctly about 80% of the time but fails about 20% of the time.

Could it be a data ingestion delay problem? Maybe the main pipeline checks whether there is a document with the same req-id and a more recent req-timestamp, but the check fails if that document has not been ingested yet.
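
To make this hypothesis concrete, here is a toy Python model of the suspected race. It does not use the Elasticsearch client; `ToyIndex` and `write_if_latest` are hypothetical names, and the only Elasticsearch behaviour it imitates is that newly indexed documents become visible to search only after a refresh. Whether this is what actually happens in my pipeline is still an open question.

```python
class ToyIndex:
    """Toy model of near-real-time search: writes are searchable only after refresh()."""

    def __init__(self):
        self._searchable = []  # documents visible to search
        self._pending = []     # documents written but not yet refreshed

    def index(self, doc):
        self._pending.append(doc)

    def refresh(self):
        self._searchable.extend(self._pending)
        self._pending.clear()

    def search_newer(self, req_id, ts):
        """Return searchable docs with the same req-id and a greater req-timestamp."""
        return [d for d in self._searchable
                if d["req-id"] == req_id and d["req-timestamp"] > ts]


def write_if_latest(index, doc):
    """Check-then-write: index doc only if search finds no newer doc for its req-id."""
    if index.search_newer(doc["req-id"], doc["req-timestamp"]):
        return False
    index.index(doc)
    return True


main = ToyIndex()
newer = {"req-id": "12345", "req-timestamp": "2022-09-11T09:30:00Z"}
older = {"req-id": "12345", "req-timestamp": "2022-09-10T08:00:00Z"}

first = write_if_latest(main, newer)   # written, but not yet searchable
second = write_if_latest(main, older)  # the check cannot see the newer doc
main.refresh()
stored = main.search_newer("12345", "")  # every stored doc for this req-id
```

In this model both writes succeed, so after the refresh the index holds two documents for the same req-id, which matches the symptom I see in the "main" index.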

Thanks in advance
Ely

Why does it fail? A timeout?
Is req-id a unique value?
Are there 12 million records in total in the main index?
How many docs does the temp index usually have?
Are you using ILM for temp?

Hi Rios,
Thanks a lot for your answer.

I don't know why it fails... that is the purpose of my topic :)
Yes, req-id is a unique value. The main index grows every 10 minutes, when pipeline_main runs, and it now has 12 million docs.

And no, I'm not using ILM (Elasticsearch uses the default settings).

Thanks!!

Is there any error in /var/log/logstash/logstash-plain.log?
With the temp index, are you trying to avoid duplicate records based on unique req-ids?
If the main index has req-id=12345 with req-timestamp='10092022', and the temp index gets req-id=12345 with a newer req-timestamp='11092022', will the full record for req-id=12345 be updated in the main index, or just its req-timestamp?
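
The two outcomes in the question above can be illustrated with plain Python dicts. This is only a sketch of the difference between replacing the whole record and updating one field; the `status` field is hypothetical and is there only so that the two results differ visibly.

```python
# Record already in the main index, and the newer one arriving from temp.
stored = {"req-id": "12345", "req-timestamp": "10092022", "status": "pending"}
incoming = {"req-id": "12345", "req-timestamp": "11092022", "status": "done"}

# Option 1: the full record is replaced by the incoming document.
full_update = dict(incoming)

# Option 2: only req-timestamp is updated; every other field keeps its old value.
timestamp_only = {**stored, "req-timestamp": incoming["req-timestamp"]}
```

With option 1 the stored `status` becomes "done"; with option 2 it stays "pending" and only the timestamp moves forward.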

Hi Rios,

I just asked to have access to that lo ... let's see as soon as I obtain it. What could it contain?

In the temp index I load (and gradually empty) everything that arrives. The cleanup by timestamp only happens in the main index.

Search for error or timeout.
