Adding a new field and updating an index when another index changes


(Alejandro Kapit) #1

Hi,

I'm importing two CSV files into two different indexes in Elasticsearch.

One holds business data and the other holds config parameters.

When importing the business data CSV, I need that index to get a new field added dynamically: a number calculated from some of its own data and values stored in the other index (the config params one).

In addition, the config params index can change from time to time (it gets overwritten), and I need the business data index to be updated retroactively whenever the parameters index is updated.

So I thought I could re-import the data from Elasticsearch to Elasticsearch so that Logstash re-processes it frequently, or maybe use scripted fields in Kibana for that…

But I'm still too new, and I wonder what the best way to approach this is, or whether it's possible at all.

I'd also like to know your opinion on whether it makes sense to do this in Logstash/Kibana or whether it should be done somehow before the data gets inserted.

Can the Elasticsearch input plugin help in this situation? For example, by importing through a query that pulls data from different indexes and using the result in the output?

I'd appreciate any help.

Thanks a lot!


(Mark Walkom) #2

What sort of data is it?


(Alejandro Kapit) #3

It's a CSV import: plain text fields and some numeric fields.


(Mark Walkom) #4

I'd define the document _id to be a concatenation/hash of a few unique but stable values, and then when you get updated values you can just recreate that hash and it'll update the original document with the changes.


(Alejandro Kapit) #5

I didn't actually understand that. Could you elaborate a bit more?
Or is there anything I can read on that topic?

Thanks


(Mark Walkom) #6

No worries!

The _id for a document in Elasticsearch is the unique identifier. You can let Elasticsearch define that automatically or you can create your own.

What I am suggesting is that you take one, two, or three unique but static parts of each business data record and then join/hash them to form the _id. Then, when you need to update that data because one of the other values changes, you can simply use the same _id and it will update the existing document instead of creating a new one.
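
To make that concrete, here is a minimal Python sketch of the idea. It does not talk to Elasticsearch at all: a plain dict stands in for the index, and the field names (`customer_id`, `order_no`, `amount`), the `rate` parameter, and the `computed` field are all made up for illustration, since I don't know what your CSV columns or your calculation look like.

```python
import hashlib

def make_doc_id(record, key_fields):
    """Build a deterministic _id from the stable fields of a record."""
    # Join with a separator unlikely to appear in the values, so that
    # ("ab", "c") and ("a", "bc") do not produce the same id.
    joined = "\x1f".join(str(record[f]) for f in key_fields)
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()

def index_doc(index, record, key_fields, rate):
    """Store the record under its derived _id, adding a computed field."""
    doc = dict(record)
    # Hypothetical calculation: business value times a config parameter.
    doc["computed"] = record["amount"] * rate
    index[make_doc_id(record, key_fields)] = doc

# A dict keyed by _id stands in for the business data index.
index = {}
keys = ["customer_id", "order_no"]  # assumed stable, unique fields
record = {"customer_id": "C042", "order_no": 1001, "amount": 100}

index_doc(index, record, keys, rate=2)  # first import
index_doc(index, record, keys, rate=3)  # config parameter changed later

doc_id = make_doc_id(record, keys)
```

Because the second call produces the same _id as the first, it overwrites the earlier entry instead of adding a second one, and you can look the document up again later just by recomputing the id from the same fields.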


(Alejandro Kapit) #7

I think I got you. You mean using IDs in such a way that I can later reference a "related" document and modify it, because I "know" it through its ID, or at least part of it, right?


(Mark Walkom) #8

Yep!


(Alejandro Kapit) #9

Sounds cool!
The question now is how you "search" for that very document. Let's say I have one of the fields I used to hash its ID.

Btw, I thought there was an automated way of doing that, but here the approach is manual, am I right?


(Mark Walkom) #10

You can define an _id in the Elasticsearch output with the document_id option - https://www.elastic.co/guide/en/logstash/6.3/plugins-outputs-elasticsearch.html#plugins-outputs-elasticsearch-document_id - and in it you can use field references, which means you can build your _id from the values of fields.
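
If it helps to see what a field reference does, the following Python sketch imitates the substitution for simple top-level fields. It is only an illustration of the idea, not Logstash's actual implementation: the function name `resolve_document_id`, the template, and the event fields are made up, and leaving an unresolved reference as literal text is my understanding of how Logstash behaves, so check that against your own setup.

```python
import re

# Matches references of the form %{field_name}
FIELD_REF = re.compile(r"%\{([^}]+)\}")

def resolve_document_id(template, event):
    """Replace each %{field} in the template with that field's value."""
    def substitute(match):
        name = match.group(1)
        # Leave the reference untouched when the event lacks the field.
        return str(event[name]) if name in event else match.group(0)
    return FIELD_REF.sub(substitute, template)

# Hypothetical event as it might look after CSV parsing.
event = {"customer_id": "C042", "order_no": 1001, "amount": 250}

doc_id = resolve_document_id("%{customer_id}-%{order_no}", event)
missing = resolve_document_id("%{customer_id}-%{region}", event)
```

Here `doc_id` comes out as `C042-1001`, so every re-import of the same customer and order number targets the same document.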

