Adding a new field and updating an index when another index changes

akapit · June 26, 2018, 6:16pm

Hi,

I'm importing two csv files into two different indexes in elastic.

One of them is business data and the other holds config parameters.

When importing the business data csv into elastic, I need a new field added to that index dynamically: a number calculated from some of its own data combined with values from the other index (the config params one).

In addition to that, the config params index can change its values from time to time (it gets overwritten), and I need the business data index to be updated (retroactively) whenever the parameters index is updated.

So I thought I could "re-import" the data from elastic to elastic so it gets re-processed in logstash frequently, or maybe use scripted fields in kibana for that…

but I'm still too new, and I wonder what's the best way to approach this, or whether it's possible at all.

I'd also like to know your opinion on whether it makes sense to do this in logstash/kibana or whether it should be done somehow before the data gets inserted.

Can the Elasticsearch input plugin help in this situation? For example, by importing through a query that pulls data from different indexes and using the result in the output?

I'd appreciate any help

Thanks a lot!

warkolm · June 26, 2018, 11:23pm

What sort of data is it?

akapit · June 27, 2018, 7:03am

It's a csv import, plain text fields and some numeric fields.

warkolm · June 27, 2018, 10:41pm

I'd define the document _id to be a concatenation/hash of a few unique but stable values, and then when you get updated values you can just recreate that hash and it'll update the original document with the changes.

akapit · June 28, 2018, 8:11am

I didn't actually understand that, could you elaborate a bit more?
Or is there anything I can read on that topic?

Thanks

warkolm · June 28, 2018, 9:59am

No worries!

The _id for a document in Elasticsearch is the unique identifier. You can let Elasticsearch define that automatically or you can create your own.

What I am suggesting is that you take one, two or three unique, but static, parts of each business data point and then join/hash them to form the _id. Then, when you need to update that data because one of the other values changes, you can simply use the same _id and it will update the existing document instead of creating a new one.
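
To make the idea concrete, here is a minimal Python sketch, not Logstash or Elasticsearch code. The field names (`customer`, `order_no`, `score`) are made up for illustration, and a plain dict keyed by _id stands in for the index, so it only mimics the "same _id replaces the document" behaviour.

```python
import hashlib

def make_doc_id(row, key_fields):
    """Build a deterministic _id from the stable fields of a row."""
    # Join with a separator so ("ab", "c") and ("a", "bc") do not collide.
    joined = "|".join(str(row[name]) for name in key_fields)
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()

# Stand-in for an index: a dict keyed by _id, so writing the same _id overwrites.
index = {}

def index_row(row, key_fields):
    """Store the row under its derived _id and return that _id."""
    doc_id = make_doc_id(row, key_fields)
    index[doc_id] = row
    return doc_id

# Hypothetical stable fields that identify one business record.
KEY_FIELDS = ["customer", "order_no"]

# First import, with the calculated field based on the old config values.
first_id = index_row({"customer": "ACME", "order_no": "1001", "score": 12.5}, KEY_FIELDS)
# Re-import after the config changed: same stable fields, new calculated value.
second_id = index_row({"customer": "ACME", "order_no": "1001", "score": 17.0}, KEY_FIELDS)
```

Because the calculated field is not part of the key fields, both imports produce the same _id and only one document remains, carrying the newer value.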

akapit · July 8, 2018, 9:33am

I think I got you. You mean using ids in such a way that I can later reference a "related" document and modify it by "knowing" it through its id, or at least a part of it, right?

warkolm · July 8, 2018, 9:34am

Yep!

akapit · July 8, 2018, 9:34am

Sounds cool!
The question now is how do you "search" for that very document? Let's say I have one of the fields I used to hash its id.

Btw I thought there was an automated way of doing that, but here the approach is manual, am I right?

warkolm · July 8, 2018, 9:55am

You can define an _id in the Elasticsearch output - https://www.elastic.co/guide/en/logstash/6.3/plugins-outputs-elasticsearch.html#plugins-outputs-elasticsearch-document_id - and in that you can then use field references, which means you can build your _id using the values of fields.
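
As a rough illustration of building the _id from field values, the Python sketch below assembles a request body in the Elasticsearch bulk API's newline-delimited JSON format. It is a sketch under assumptions, not the Logstash output itself: the index name `business` and the fields `customer` and `order_no` are hypothetical, and older Elasticsearch versions may also expect a `_type` in the action line. In Logstash, the comparable setting would be a `document_id` that uses field references such as `%{customer}-%{order_no}`.

```python
import json

def bulk_lines(rows, index_name, key_fields):
    """Return a bulk API body where each _id is built from the row's own fields."""
    lines = []
    for row in rows:
        # Same idea as a field-reference based document_id: join the stable fields.
        doc_id = "-".join(str(row[name]) for name in key_fields)
        # Action line: "index" creates the document or replaces one with the same _id.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # Source line: the document itself.
        lines.append(json.dumps(row))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"

# Hypothetical rows; in practice these would come from the parsed csv.
rows = [
    {"customer": "ACME", "order_no": "1001", "score": 17.0},
    {"customer": "ACME", "order_no": "1002", "score": 3.2},
]
body = bulk_lines(rows, "business", ["customer", "order_no"])
```

Sending the same rows again with a recalculated `score` would target the same _id values, so the existing documents get replaced rather than duplicated.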

