Best Practice for building new version of an index and replacing existing using LogStash

Airn5475 · December 22, 2016, 4:04pm

I am pulling entities out of a database and serializing them to .json using a C# console app.
The .json files are then read by Filebeat, which sends the data to Logstash, which pushes it into Elasticsearch.

Every 5 minutes I will query the database for entities that have changed.
Once a day (or week) I would like to completely rebuild the entire index, just to make sure we haven't missed anything.

I understand the concept of aliases and intend to use one for this.

How should I configure Logstash to push simple changes to the existing index, but push the complete rebuild to the next version?
I thought about putting a day (or week, depending on the rebuild threshold) timestamp in the index name in the Logstash Filebeat-to-Elasticsearch config, but that would simply create a new version of the index; I wouldn't know when to swap out the old one for the new one.
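To make the naming idea concrete, here is a minimal Python sketch of deriving a per-day or per-week index name. The base name "entities", the function name, and the exact name format are all assumptions for illustration, not something Logstash or Elasticsearch requires.

```python
from datetime import date


def rebuild_index_name(base, when, granularity="day"):
    """Return a versioned index name for a rebuild started on `when`.

    `base` and the name format are hypothetical; any scheme works as long as
    each rebuild lands in an index that is not yet behind the alias.
    """
    if granularity == "day":
        return f"{base}-{when:%Y.%m.%d}"
    if granularity == "week":
        # ISO year and week number, so weeks spanning New Year stay consistent.
        iso_year, iso_week, _ = when.isocalendar()
        return f"{base}-{iso_year}.w{iso_week:02d}"
    raise ValueError(f"unsupported granularity: {granularity!r}")
```

A name like this only says where the rebuild goes; it does not say when the rebuild is finished, which is the open question here.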

I could have my console app handle this, but I'm not sure how it would know that Logstash was done pushing files to the new index...

Also, it would be nice, and maybe even imperative, to be able to queue up an entire rebuild on demand. I am new to Elasticsearch and I fear I may need to react quickly in the event I do something wrong.

Looking for some seasoned input. Maybe I'm thinking about this all wrong and there is a better route to go.

warkolm · December 27, 2016, 9:47am

You need to be careful of sql_last_value. Have a separate config file somewhere that LS doesn't auto-read, then run that via cron or something.

What does filebeat have to do with this though?

Airn5475 · December 27, 2016, 1:44pm

I am using a custom app to pull data from the database, not JDBC. Nonetheless, thank you for the warning about sql_last_value.

My apologies, when I said "filebeat config" I meant my Logstash config file that is pulling from Filebeat and pushing to Elasticsearch. (The file name is "filebeat.config")

I thought about having a separate Logstash config file for the "rebuild" job and running it separately, but I still wasn't sure how to coordinate the swap from the old index to the new one after the "rebuild" is complete.

warkolm · December 27, 2016, 7:27pm

Try https://www.elastic.co/blog/changing-mapping-with-zero-downtime

Airn5475 · December 27, 2016, 8:04pm

Yeah I read that article before and it provides some great thoughts. I already intend to use aliases, but those don't provide the solution I'm looking for.

When I create a new version of the index and Logstash pushes all of the documents into it, I need to know when it is complete in order to flip the alias to point at the new index, at which point I can delete the old one.
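For reference, the flip itself can be done as one atomic call to the Elasticsearch `_aliases` endpoint. The following is a minimal Python sketch using only the standard library; the alias and index names and the cluster URL are assumptions, and it leaves out authentication and error handling.

```python
import json
import urllib.request


def alias_swap_body(alias, old_index, new_index):
    """Build the request body that moves `alias` from old_index to new_index."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


def swap_alias(es_url, alias, old_index, new_index):
    """POST both actions to /_aliases so they are applied in a single request."""
    body = json.dumps(alias_swap_body(alias, old_index, new_index)).encode("utf-8")
    req = urllib.request.Request(
        es_url.rstrip("/") + "/_aliases",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With hypothetical names, `swap_alias("http://localhost:9200", "entities", "entities_v1", "entities_v2")` would repoint the alias, after which the old index can be removed with a `DELETE` request on it.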

I could set a second job to run X minutes afterwards, but if the new index isn't ready to go...

I suppose I could get a count of the documents that I put in the Filebeat input .json file and query the new version of the index in Elasticsearch for its document count. Once the count hits the expected number, I could flip the alias.
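A rough Python sketch of that count check, assuming the expected number is known from the export and that every exported document has a unique ID (otherwise the index count would stay below the file count). The function names, URL, and timeout values are hypothetical.

```python
import json
import time
import urllib.request


def fetch_count(es_url, index):
    """Return the document count reported by GET /<index>/_count."""
    url = es_url.rstrip("/") + "/" + index + "/_count"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["count"]


def wait_for_count(get_count, expected, timeout_s=600.0, interval_s=5.0,
                   sleep=time.sleep):
    """Poll get_count() until it reaches `expected`.

    Returns True once the count is reached, False if the timeout expires first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if get_count() >= expected:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

Usage might look like `wait_for_count(lambda: fetch_count("http://localhost:9200", "entities_v2"), expected=12345)`, flipping the alias only when it returns True. Note that `_count` only sees documents after an index refresh, so the number can lag slightly behind what Logstash has already sent.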

system · January 24, 2017, 8:05pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
