Enrich fields in new index from other index present in kibana using logstash

Divyank_Mahalle · September 16, 2021, 12:29pm

Hi,

I am using Logstash to create a new index named "therapy" from a CSV file.

Another index, "participation", already exists in Elasticsearch with other fields.

Both the participation and therapy indices have the "id" field in common.

I was able to build a conf file that ingests the therapy data.
I would like some ideas or hints on how to add fields from the participation index whenever the "id" field matches in both indices, so that the therapy docs are enriched with participation fields before being stored in the therapy index.

Thank you in advance.

Current conf file:

input {
  file {
    path => "/home/data.csv"
    start_position => "beginning"
  }
}
filter {
  csv {
    separator => ","
    skip_header => "true"
    columns => ["user_id","DeviceSerialNumber"]
  }
  elasticsearch {
    hosts => "<>"
    index => "participation"
    # [id] must exist on the event; the csv filter above only defines
    # user_id and DeviceSerialNumber.
    query => "id:\"%{[id]}\""
    # Fields to add from the participation index to therapy
    # (hash entries are separated by whitespace, not commas)
    fields => {
      "Month" => "Month"
      "cost" => "cost"
    }
  }
}
output {
  stdout {}
  # Uncomment to write the enriched events to the therapy index:
  # elasticsearch {
  #   hosts => "<>"
  #   index => "therapy"
  # }
}

Divyank_Mahalle · September 20, 2021, 11:10am

Hi,

I tried the enrich processor to combine fields from one index, "participation", with another index, "therapy", and stored the consolidated fields in a third index, "data-participation-therapy1".

"user_id" is the common field between the two indices.

Create an enrich policy:

PUT /_enrich/policy/therapy-policy
{
  "match": {
    "indices": "therapy",
    "match_field": "user_id",
    "enrich_fields": [<.....>]
  }
}

Execute the policy to create the enrich index:

POST /_enrich/policy/therapy-policy/_execute

{
  "status" : {
    "phase" : "COMPLETE"
  }
}

PUT /_ingest/pipeline/therapy_lookup
{
  "description" : "Enriching data-participation with therapy data",
  "processors" : [
    {
      "enrich" : {
        "policy_name": "therapy-policy",
        "field" : "user_id",
        "target_field": "tmp",
        "max_matches": "1"
      }
    },
    {
      "script": {
        "if": "ctx.tmp != null",
        "source": "ctx.putAll(ctx.tmp); ctx.remove('tmp');"
      }
    }
  ]
}
*************
{
  "acknowledged" : true
}
***************

I get an error when reindexing the consolidated data into the "data-participation-therapy1" index:

POST _reindex
{
  "source": {
    "index": "-participation"
  },
  "dest": {
    "index": "data-participation-therapy1",
    "pipeline": "therapy_lookup"
  }
}

Output:
{"ok":false,"message":"backend closed connection"}

Reference: https://stackoverflow.com/questions/64008094/combine-two-index-into-third-index-in-elastic-search-using-logstash

@val

val · September 22, 2021, 9:41am

backend closed connection simply means that the client (i.e. Kibana Dev Tools in your browser) timed out. But the reindex process is still ongoing in the background.

If your source index is big, odds are high that it will take longer than the timeout for the operation to terminate. So you should start your reindex to run asynchronously using

POST _reindex?wait_for_completion=false

The call will return immediately and give you a task ID which you can use to check the task status as it progresses using

GET _tasks/<task_id>
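
Put together, the asynchronous variant of the reindex request from this thread might look like the sketch below. It assumes the source index is named participation, as described in this thread, and <task_id> stays a placeholder for whatever ID the first call returns.

```console
# Start the reindex in the background; the response contains only a task ID,
# in the form { "task": "<node_id>:<task_number>" }
POST _reindex?wait_for_completion=false
{
  "source": {
    "index": "participation"
  },
  "dest": {
    "index": "data-participation-therapy1",
    "pipeline": "therapy_lookup"
  }
}

# Poll the task; "completed": true in the response means the reindex has finished
GET _tasks/<task_id>
```

Once the task reports completion, its response should also carry the reindex summary (documents created, updated, and any failures), which is worth checking before querying the destination index.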

Divyank_Mahalle · September 23, 2021, 10:47am

Thanks for the reply, Val.

The participation index consists of 80k docs and the therapy index contains 9k docs, with the "user_id" field in common.

I want to know whether the enrich processor compares docs one to one by position in both indices, or whether it scans all docs in the participation index for each particular user_id in the therapy index and stores the enriched docs in the new index.

If docs are compared one to one, that is, the user_id of the 1st doc in therapy is only checked against the user_id of the 1st doc in participation, then hardly any docs will get enriched.

The enrichment process is described here: Enrichment processor working fine on demo index

val · September 23, 2021, 11:04am

The reindex API will iterate over all docs of the source index (i.e. participation) and for each will retrieve the doc with the corresponding user_id from the enrichment index (i.e. therapy) and will add what it finds to the source document and store the result in the destination index (i.e. data-participation-therapy1).
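
A quick way to check this lookup for a single document, without running a full reindex, is the ingest simulate API. The sketch below assumes the therapy-policy has already been executed, and the user_id value is a made-up placeholder that should be replaced with one that actually exists in the therapy index.

```console
POST _ingest/pipeline/therapy_lookup/_simulate
{
  "docs": [
    {
      "_source": {
        "user_id": "<some_user_id>"
      }
    }
  ]
}
```

If a therapy document with that user_id exists, the simulated result should contain the therapy fields merged into the document; if there is no match, the document should come back unchanged. The match is by field value, not by document position.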

Divyank_Mahalle · September 23, 2021, 2:05pm

Thanks Val, it's a success.

The index pattern shows 49 fields, which is the combination of the participation and therapy fields.

Two questions:

1. In Discover I see only 21 fields, all from the participation index. The therapy index fields do not appear for the data-participation-therapy1 index, although I can query all fields with a DSL query in Dev Tools and the mapping also shows all 49 fields.

(Image: 21 fields in the Discover tab of the data-participation-therapy1 index)

(Image: index pattern showing 49 fields)

2. The field "DayDateStorage" is a date field in the "therapy" index, but in "data-participation-therapy1" it shows as text. It has to be changed to date.

Update:

Both questions are solved. I put the mapping for the data-participation-therapy1 index in place before reindexing into it. Discover now shows all fields, and "DayDateStorage" is a date type because of the explicit mapping.
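
The mapping step could look roughly like the sketch below. Only the DayDateStorage field is taken from this thread; the remaining fields are omitted, and the sketch assumes the stored values match Elasticsearch's default date formats (otherwise a "format" would need to be added to the field). The destination index must not exist yet when the request is sent.

```console
# Create the destination index with an explicit date mapping before reindexing
PUT data-participation-therapy1
{
  "mappings": {
    "properties": {
      "DayDateStorage": {
        "type": "date"
      }
    }
  }
}
```

Fields that are not listed in the mapping are still mapped dynamically during the reindex.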

Thanks.

Divyank_Mahalle · September 27, 2021, 1:16pm

One question:

The participation and therapy indices are updated once a day (live pipelines).
How can I enrich the new docs into the enriched index "data-participation-therapy1"?

Reference: What is the best way to enrich near real-time data in ElasticSearch with batch data that may come in later? - Stack Overflow
I found this link, but it is still not clear to me.

system · October 25, 2021, 1:17pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
