Adding Field from one index to another index / Merging two Index

Hello, I have 2 tables coming in as CouchDb changes in logstash, which get converted into two separate indexes.

Problem statement: I want to either merge the two indexes or add the fields in one of the indexes, I was thinking about using for loop but not able to make it work, below you will find the problem statement as well the requested output.

Thanks is Advance :slight_smile:

The JSON for the Parent Table (Index 1) looks something like below

// My Code Here
    "doc": {
      "RequestID": 1, //Primary Key
      "Name": "John",
      "Gender": "Male"
      }

      "doc": {
      "RequestID": 2,
      "Name": "Anna"
      "Gender": "Female"
      }

The JSON for Child Table (Index 2) currently looks something like below -


// My Code here

    "doc": {
      "RequestID": 1,
      "ID": 1 //Primary Key
      }

    "doc": {
      "RequestID": 1,
      "ID": 2
      }

      "doc": {
      "RequestID": 2,
      "ID": 3
      }

    "doc": {
      "RequestID": 2,
      "ID": 4
      }

Output Required for Index 2


    "doc": {
      "RequestID": 1,
      "ID": 1 //Primary Key
      "Name": "John",
      "Gender": "Male"
      }

    "doc": {
      "RequestID": 1,
      "ID": 2,
      "Name": "John",
      "Gender": "Male"
      }

      "doc": {
      "RequestID": 2,
      "ID": 3,
      "Name": "Anna"
      "Gender": "Female"
      }

    "doc": {
      "RequestID": 2,
      "ID": 4,
      "Name": "Anna"
      "Gender": "Female"
      }

I would the index 1 table into something like a CSV that a translate filter can load (you could do that using logstash, obviously, if you cannot write a query that pulls the data in the right format). You can only have two columns in a CSV for a translate so you would need a data format like

1,Name=John Gender=Male

or have JSON of the entire row as the second column

2, { "RequestID": 2, "Name": "Anna", "Gender": "Female" }

Then process index 2, and enrich the events by looking up the [RequestId] using the translate filter. Then use a kv or json filter to parse the translated field as appropriate.

The above is only a sample data plus translate to csv is not feasible because we are dealing with couchDB changes which fetches the real time data.

I was wondering if there are any filters or a conditional query inside that i could use within filters

But i will give this a try "Then process index 2, and enrich the events by looking up the [RequestId] using the translate filter. Then use a kv or json filter to parse the translated field as appropriate."

I am practically lost :grimacing:...
The data comes in from couchDB database (change) and it then has to be processed as explained.

if you cannot write a query that pulls the data in the right format - I am trying to write it ain't working, I would actually prefer this.

if those tables have been indexed in different indices in ES :

  1. use elasticsearch input to query index 2 (RequestID, ID). this will create event in logstash (current event)
  2. use elasticsearch filter to query index 1 (RequestID, name, gender), then compare the RequestID value with the value of the same field from input above
  3. when match, add name and gender from (2) to the (current event).
  4. set the output to elasticsearch with action update. this will update the documents received during input phase instead of creating new documents

if both tables coming in real time, there is a chance that event with matching RequestID are not yet indexed in ES. so you might want to schedule it (elasticsearch input support scheduling). it won’t be real time but that’s due to your data are coming in parallel, so it’s not yet stored anywhere to be queried.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.