I have several huge files (hundreds of GBs) that I want to "double parse" with logstash.
I have one file which maps user IDs to user emails, and then another huge file which maps user IDs to their actions.
So in my first pass (which is a straightforward parse), I'll index the user IDs and actions. In the second pass, I'll use Logstash's elasticsearch output with action => "update" to update those documents with the email address.
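For concreteness, here's a minimal sketch of the two pipelines I have in mind. The file paths, the "user-actions" index name, and the CSV layout are placeholders for my real setup:

```
# Pass 1 (sketch): index one document per user ID with their action.
input {
  file {
    path => "/data/user_actions.csv"   # placeholder path
    start_position => "beginning"
  }
}
filter {
  csv {
    separator => ","
    columns => ["user_id", "action"]
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "user-actions"            # placeholder index name
    document_id => "%{user_id}"        # deterministic ID so pass 2 can target the same document
  }
}
```

```
# Pass 2 (sketch): update the pass-1 documents with the email address.
input {
  file {
    path => "/data/user_emails.csv"    # placeholder path
    start_position => "beginning"
  }
}
filter {
  csv {
    separator => ","
    columns => ["user_id", "email"]
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "user-actions"
    action => "update"
    document_id => "%{user_id}"        # same ID scheme as pass 1
  }
}
```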
However, I'm not very clear on how Elasticsearch performs this update under the hood. Does it mark the old document as deleted (hide it) and write a new copy? If so, does the old data ever actually get removed from disk? Am I doing something completely wrong here?
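As far as I understand, each such update boils down to a request like the following (hypothetical document ID and field value), and my question is about what Elasticsearch does internally when it receives it:

```
POST /user-actions/_update/12345
{
  "doc": { "email": "user12345@example.com" }
}
```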
Thanks!