Truncate/delete the entire index once new data arrives from Logstash

saif3r · May 15, 2019, 11:21am

Hello Guys,

We are preparing a pipeline, responsible for getting the 'current state' of data in our source table.

For example, say the source table contains 10 records. We would like to reflect those 10 records in Elasticsearch, using Logstash to pull them every 10 minutes. During the day the number of records might change, for example when someone is deleted. So when Logstash next runs and pulls the remaining 9 records, we would like the Elasticsearch index to contain exactly 9 documents. We don't need the old documents, as we only want to see the 'current state'. We've been thinking about a mechanism that truncates/deletes the index before new data is pushed, but I'm not sure how we could achieve that using only Logstash and Elasticsearch while making sure that data is always present in the index.
Is that achievable in an automated manner?

saif3r · May 16, 2019, 9:31pm

Hi Guys,

Does anything come to mind? I know that we can use upsert to update existing records, but what about deleting documents that no longer exist in the source table? Is it possible to do that using Logstash?

dadoonet · May 16, 2019, 9:49pm

Read this and specifically the "Also be patient" part.

It's fine to answer on your own thread after 2 or 3 days (not including weekends) if you don't have an answer.

We are not all guys, fortunately. I think that "Hi!" is perfectly enough.

Not really. You can consider different options:

Use a technical temporary table of deleted items. Read that table and delete every document which is referenced in it.
Use a trigger
Modify the application layer (the service layer) and do that in real time. That's my preferred way.

I shared most of my thoughts there: https://david.pilato.fr/blog/2015-05-09-advanced-search-for-your-legacy-application/
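
To illustrate the first option above, here is a minimal sketch in Python, not something taken from the linked post. It assumes you can obtain the list of IDs currently in the index and the list of IDs currently in the source table (or read the deleted IDs directly from a table of deleted items), and it builds a request body for the Elasticsearch `_bulk` API with one `delete` action per stale document. The index name `names` and both helper functions are hypothetical, and on older Elasticsearch versions the action metadata may also need a `_type` field.

```python
import json


def stale_ids(indexed_ids, source_ids):
    """Return the IDs that are in the index but no longer in the source table."""
    indexed = {str(i) for i in indexed_ids}
    source = {str(i) for i in source_ids}
    return sorted(indexed - source)


def bulk_delete_body(index, ids):
    """Build an NDJSON body for the _bulk API: one delete action per ID.

    The _bulk API expects one JSON object per line and a trailing newline.
    """
    lines = [
        json.dumps({"delete": {"_index": index, "_id": str(doc_id)}})
        for doc_id in ids
    ]
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    # Hypothetical data: document "2" was removed from the source table.
    to_delete = stale_ids(["1", "2", "3"], ["1", "3"])
    print(bulk_delete_body("names", to_delete), end="")
```

The resulting body would be sent with `POST /_bulk` and the `Content-Type: application/x-ndjson` header. Because only the stale documents are removed, the index is never empty between runs, unlike with a truncate-and-reload approach.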

saif3r · May 27, 2019, 8:28am

Hello @dadoonet

Noted, and I appreciate it. The rush comes from the fact that we are currently doing a PoC with ELK as a possible data platform, so the more information we get, the faster we can reach the evaluation step. Apologies for that.

As for the proposal, unfortunately the source database cannot be changed in any way. We need to rely on it in its current form.
I'm not yet familiar with triggers, so I'll sink my teeth into the documentation.

system · June 24, 2019, 8:39am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
