I want to create an Elasticsearch database from a few hundred thousand rows of SQL data. What are the arguments against recreating the database every night, as opposed to updating it every night? I'm guessing a full rebuild won't take longer than 20 minutes, but I haven't looked at costs yet. I was thinking of archiving the previous night's copy.
I would probably index into a new index each night, named something like data-YYYY-MM-DD, and use aliases.
When the new indexing run is finished, switch the alias to the new index and no one will notice.
Then, if you don't really need the old index to be searchable, but only want it as a backup in case something goes wrong with the new one, I'd probably close it so that it only uses disk resources.
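For what it's worth, here is a rough sketch of the alias switch described above. It assumes an alias called `data` and nightly indices named `data-YYYY-MM-DD`; the helper names and the example date are made up for illustration. It only builds the JSON body for `POST /_aliases` and leaves sending it to whatever client you use.

```python
import json
from datetime import date, timedelta


def dated_index(prefix, day):
    """Build a nightly index name such as data-2024-01-15 (naming is an assumption)."""
    return f"{prefix}-{day.isoformat()}"


def alias_swap_actions(alias, old_index, new_index):
    """Body for POST /_aliases: move the alias from the old index to the new one.

    Both actions go in a single request, so searches against the alias
    never see a moment where it points at nothing.
    """
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Example date only; a real job would use date.today().
today = date(2024, 1, 15)
new_index = dated_index("data", today)
old_index = dated_index("data", today - timedelta(days=1))

body = alias_swap_actions("data", old_index, new_index)
print(json.dumps(body, indent=2))
```

Once the alias has moved, the previous night's index can be closed with `POST /<old index>/_close` and reopened later with `_open` if you ever need to fall back to it.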
My 2 cents
I think you're suggesting that it's possible to recreate the database each night rather than updating a few rows?
Would it cost more to recreate than to update?
Yes, it's possible.
It basically depends on the number of updates you are expecting per day. If it's a modest number, then updating the existing docs is fine. Updating 90% of the docs in place does not make sense, though; at that point you may as well rebuild.
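If you go the incremental route, a minimal sketch of building a `_bulk` request body for only the changed rows might look like this. It assumes each SQL row arrives as a dict with an `id` column, and that you have some way of selecting the rows that changed since the last run (for example a hypothetical `updated_at` column); the function name and sample rows are illustrative only.

```python
import json


def bulk_index_body(index, rows, id_field="id"):
    """Build an NDJSON body for POST /_bulk from a list of row dicts.

    Each row becomes an "index" action, which creates the document or
    replaces the existing one that has the same _id.
    """
    lines = []
    for row in rows:
        action = {"index": {"_index": index, "_id": str(row[id_field])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(row))
    # The bulk API expects every line, including the last, to end with a newline.
    return "".join(line + "\n" for line in lines)


# Pretend these are the rows that changed since last night (made-up data).
changed_rows = [
    {"id": 1, "name": "first"},
    {"id": 7, "name": "seventh"},
]

payload = bulk_index_body("data", changed_rows)
print(payload, end="")
```

The payload is sent with the `Content-Type: application/x-ndjson` header. Rows deleted in SQL would need a separate `delete` action, which is one reason a full nightly rebuild can end up simpler.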
Yes, it probably will. You are reindexing the entire dataset every time.
Databases do not exist in Elasticsearch. Do you mean an index, perhaps?