Enrich processor in an ingest pipeline overwrites existing data instead of enriching it

Hi,
I want to enrich documents in one index (add each author's nationality to the books index) using data from another index (authors). I use an ingest pipeline with an enrich processor backed by an enrich policy.

Here is a simple example:

Indices

PUT books
{
  "mappings": {
    "properties": {
      "title": {
        "type": "keyword"
      },
      "authors": {
        "type": "nested",
        "properties": {
          "code": {
            "type": "keyword"
          },
          "name": {
            "type": "keyword"
          },
          "page_written": {
            "type": "integer"
          },
          "nationality": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

PUT authors
{
  "mappings": {
    "properties": {
      "code": {
        "type": "keyword"
      },
      "name": {
        "type": "keyword"
      },
      "nationality": {
        "type": "keyword"
      }
    }
  }
}

Some authors:

PUT authors/_doc/a1
{
  "code": "a1",
  "name": "John",
  "nationality": "french"
}

PUT authors/_doc/a2
{
  "code": "a2",
  "name": "Jack",
  "nationality": "english"
}

Policy and pipeline:

PUT /_enrich/policy/enrich_author_policy
{
  "match": {
    "indices": "authors",
    "match_field": "code",
    "enrich_fields": [
      "nationality"
    ]
  }
}

POST /_enrich/policy/enrich_author_policy/_execute

PUT /_ingest/pipeline/enrich_author_pipeline
{
  "processors": [
    {
      "foreach": {
        "field": "authors",
        "processor": {
          "enrich": {
            "policy_name": "enrich_author_policy",
            "field": "_ingest._value.code",
            "target_field": "_ingest._value"
          }
        }
      }
    }
  ]
}

Then I index a book using my pipeline:

PUT books/_doc/b1?pipeline=enrich_author_pipeline
{
  "title": "Programming 101",
  "authors": [
    {
      "code": "a1",
      "name": "John",
      "page_written": 120
    },
    {
      "code": "a2",
      "name": "Jack",
      "page_written": 113
    }
  ]
}

The nationality of each author is retrieved, but the existing author data in the books document (name, page_written) is overwritten instead of enriched.

Result:

{
  "title": "Programming 101",
  "authors": [
    {
      "nationality": "french",
      "code": "a1"
    },
    {
      "nationality": "english",
      "code": "a2"
    }
  ]
}

Expected result:

{
  "title": "Programming 101",
  "authors": [
    {
      "code": "a1",
      "name": "John",
      "page_written": 120,
      "nationality": "french"
    },
    {
      "code": "a2",
      "name": "Jack",
      "page_written": 113,
      "nationality": "english"
    }
  ]
}

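Is there a way to avoid overwriting the existing data?
I know I can write the enrich result to a separate field (here "infos"), as in the example in the documentation, but then I get an awkward result: the nationality ends up in a sub-object and the code is duplicated: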

{
  "title": "Programming 101",
  "authors": [
    {
      "code": "a1",
      "name": "John",
      "page_written": 120,
      "infos": {
        "code": "a1",
        "nationality": "french"
      }
    },
    {
      "code": "a2",
      "name": "Jack",
      "page_written": 113,
      "infos": {
        "code": "a2",
        "nationality": "english"
      }
    }
  ]
}
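
For context, the enrich processor writes an object containing the match field and the enrich fields to target_field, which is why pointing it at "_ingest._value" replaces the whole author object. One possible workaround, given only as an untested sketch: enrich into a temporary sub-field, copy the nationality up with a set processor, then drop the temporary field with a remove processor. The sub-field name "infos" is just the name used above, and the ignore_empty_value / ignore_missing options are added on the assumption that some authors may have no match in the enrich index.

```
PUT /_ingest/pipeline/enrich_author_pipeline
{
  "processors": [
    {
      "foreach": {
        "field": "authors",
        "processor": {
          "enrich": {
            "policy_name": "enrich_author_policy",
            "field": "_ingest._value.code",
            "target_field": "_ingest._value.infos"
          }
        }
      }
    },
    {
      "foreach": {
        "field": "authors",
        "processor": {
          "set": {
            "field": "_ingest._value.nationality",
            "value": "{{_ingest._value.infos.nationality}}",
            "ignore_empty_value": true
          }
        }
      }
    },
    {
      "foreach": {
        "field": "authors",
        "processor": {
          "remove": {
            "field": "_ingest._value.infos",
            "ignore_missing": true
          }
        }
      }
    }
  ]
}
```

If this behaves as intended, indexing b1 through the pipeline should keep code, name and page_written for each author and add nationality next to them. The same copy-and-remove step could probably also be done in a single script processor instead of two foreach processors.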

Thanks.
