How to merge parent and child documents in Elasticsearch into a single enriched record?

Hi everyone,

I’m working with an Elasticsearch index that stores HTTP request logs. Each request is represented by two documents:

  • A parent document with a unique ID field.
  • A child document that contains a parentID field referencing the parent’s ID.

Both documents share the same structure (i.e., same fields), but with different values. I’d like to merge each parent-child pair into a single document, either in the same index or a new one. Ideally, the merged document would:

  • Retain all fields from the parent as-is.
  • Include all fields from the child, but with a child_ prefix (e.g., child_status, child_timestamp, etc.).
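For illustration, a merged document might look something like this (field names and values here are hypothetical):

{
  "ID": "req-123",
  "status": 200,
  "timestamp": "2025-01-01T12:00:00Z",
  "child_status": 304,
  "child_timestamp": "2025-01-01T12:00:01Z"
}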

What’s the best way to achieve this in Elasticsearch/Kibana?

  • Should I use a transform job, ingest pipeline, or script with _update_by_query?
  • Is there a way to do this efficiently for large datasets?

Thanks in advance for your help!


Hello @verza

Welcome to the community.

Looking at your use case, as per my understanding, for large datasets you should go for a transform job grouping on the common id field, and you can include whichever fields you need in the pivot.
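For example, a minimal transform sketch could look like the below (index names, field names, and the timestamp field are hypothetical, and both documents are assumed to carry the same id field; this only shows the pivot mechanism, and producing child_-prefixed fields would need additional aggregations such as scripted_metric):

PUT _transform/merge-http-requests
{
  "source": { "index": "http-logs" },
  "dest": { "index": "http-logs-merged" },
  "pivot": {
    "group_by": {
      "id": { "terms": { "field": "id" } }
    },
    "aggregations": {
      "status": {
        "top_metrics": {
          "metrics": [ { "field": "status" } ],
          "sort": { "@timestamp": "asc" }
        }
      }
    }
  }
}

POST _transform/merge-http-requests/_start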

Thanks!!

Can I still do it if the common id fields are named differently? In the parent the field is named 'ID' and in the child it's named 'ParentID'.

Hello @verza

In this scenario we might have to use an enrich policy, by which data from one index is appended to documents in another index based on the ID field.

Thanks!!

Thanks for your help @tortoise. I had a look at enrich policies, and I understand that an enrich policy needs one common attribute (the match field). In my case I again have two differently named attributes, ID and ParentID, to match on, and I don't see how I can specify that.

Hello @verza

I tried the enrich policy below:

PUT index-a
{
  "mappings": {
    "properties": {
      "parent_id": { "type": "keyword" },
      "name":      { "type": "text" },
      "age":       { "type": "integer" }
    }
  }
}

PUT index-b
{
  "mappings": {
    "properties": {
      "secondary_id": { "type": "keyword" },
      "location":     { "type": "text" },
      "technology":   { "type": "text" }
    }
  }
}
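For completeness: before executing the policy, index-b needs a matching document. Presumably something like the following was indexed (values inferred from the search output further below):

POST /index-b/_doc
{
  "secondary_id": "id003",
  "location": "New York",
  "technology": "Logstash"
}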

I created the enrich policy with index-b as the source (either index could be the source; in this case I am selecting the secondary index):

PUT /_enrich/policy/enrich-by-secondary-id
{
  "match": {
    "indices": "index-b",
    "match_field": "secondary_id",
    "enrich_fields": ["location", "technology"]
  }
}

POST /_enrich/policy/enrich-by-secondary-id/_execute

PUT /_ingest/pipeline/enrich-pipeline
{
  "processors": [
    {
      "enrich": {
        "policy_name": "enrich-by-secondary-id",
        "field": "parent_id",
        "target_field": "enriched_data",
        "max_matches": 1,
        "override": true
      }
    }
  ]
}

Now, while indexing data into index-a, the pipeline will call the enrich policy and add the matching data from index-b:

POST /index-a/_doc?pipeline=enrich-pipeline
{
  "parent_id": "id003",
  "name": "Zara",
  "age": 29
}

GET index-a/_search
{
  "query": {
    "match": {
      "parent_id": "id003"
    }
  }
}

Output (matching hit only):

{
  "_index": "index-a",
  "_id": "t11fpZcBjyPthJy0yx5H",
  "_score": 1.0296195,
  "_source": {
    "parent_id": "id003",
    "enriched_data": {
      "location": "New York",
      "technology": "Logstash",
      "secondary_id": "id003"
    },
    "name": "Zara",
    "age": 29
  }
}

Could you share the issue you are facing?

Thanks!!

Hi @tortoise, thanks for your help!

My problem is that if I manually add the documents into the index and trigger the policy, then all is good and it works as expected; but the data is coming in via the Elastic integration with Cloudflare (Cloudflare Integration | Elastic integrations), and there I have no way to trigger the policy. That's where I'm stuck.
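One possible avenue, assuming Fleet's custom pipeline hook applies here: integrations run an ingest pipeline named after the data stream with an @custom suffix if one exists, so the enrich processor could be attached there. A sketch reusing the policy and field names from the example above (the data stream name below is an assumption and would need to match the actual Cloudflare data stream, the real Cloudflare field name would replace parent_id, and the enrich index still needs to be re-executed when the source data changes):

PUT _ingest/pipeline/logs-cloudflare_logpush.http_request@custom
{
  "processors": [
    {
      "enrich": {
        "policy_name": "enrich-by-secondary-id",
        "field": "parent_id",
        "target_field": "enriched_data",
        "max_matches": 1,
        "override": true
      }
    }
  ]
}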