I'm importing two csv files into two different indexes in elastic.
One of them holds business data and the other holds config parameters.
When importing the business data csv into elastic, I need a new field to be added to that index dynamically: a number calculated from some of its own data together with values from the other index (the config params one).
In addition to that, the config params index can change its values from time to time (it gets overwritten), and I need the business data index to be updated retroactively whenever the parameters index is updated.
So I thought I could "re-import" the data from elastic to elastic so that it gets re-processed in logstash frequently, or maybe use scripted fields in kibana for that,
but I'm still too new to this and I wonder what the best way to approach it is, or whether it's possible at all.
I'd also like to know your opinion on whether it makes sense to do this in logstash/kibana or whether it should be done somehow before the data gets inserted.
Can the Elasticsearch input plugin help in this situation? For example, by importing through a query that pulls data from different indexes and using the result in the output?
I'd define the document _id to be a concatenation/hash of a few unique but stable values, and then when you get updated values you can just recreate that hash and it'll update the original document with the changes.
The _id for a document in Elasticsearch is the unique identifier. You can let Elasticsearch define that automatically or you can create your own.
What I am suggesting is that you take one, two or three unique but static parts of each business data point and join/hash them to form the _id. Then, when you need to update that data because one of the other values changes, you can simply use the same _id and it will update the existing document instead of creating a new one.
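To make that concrete, here is a minimal Python sketch of the idea, not a definitive implementation. The field names (`customer_id`, `order_date`, `amount`, `computed`), the index name `business-data` and the helper functions are all made up for illustration; only the principle (same stable values in, same _id out) comes from the suggestion above. The resulting dict uses the `_index` / `_id` / `_source` shape that bulk-indexing clients typically accept, but nothing here talks to Elasticsearch.

```python
import hashlib

def make_doc_id(record, key_fields):
    """Build a deterministic _id from the stable key fields of a record."""
    # Join the stable values with a separator, then hash the result so the
    # _id has a fixed length regardless of the input values.
    raw = "|".join(str(record[field]) for field in key_fields)
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()

def to_index_action(index, record, key_fields):
    """Wrap a record as an index action carrying the deterministic _id."""
    return {
        "_index": index,
        "_id": make_doc_id(record, key_fields),
        "_source": record,
    }

# Hypothetical stable fields that identify one business record.
KEY_FIELDS = ["customer_id", "order_date"]

# First import: the computed field uses the current config parameters.
first = {"customer_id": "C42", "order_date": "2020-01-15",
         "amount": 100, "computed": 110.0}

# Later re-import: the config parameters changed, so the computed value differs,
# but the key fields (and therefore the _id) are the same.
updated = {**first, "computed": 125.0}

action_first = to_index_action("business-data", first, KEY_FIELDS)
action_updated = to_index_action("business-data", updated, KEY_FIELDS)

# Same _id, so indexing the second action replaces the first document.
print(action_first["_id"] == action_updated["_id"])
```

If you stay inside logstash, I believe the same thing can be done with the fingerprint filter and the `document_id` option of the elasticsearch output, but check the plugin docs for the exact settings.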
I think I got you: you mean using ids in such a way that I can later reference a "related" document and modify it by "knowing" it through its id, or at least a part of it, right?