Hi,
I want to enrich the documents in one index (books) with data from another index (authors): specifically, I want to add each author's nationality to the books index. I am using an ingest pipeline with an enrich policy.
Here is a simple example:
Indices:
PUT books
{
  "mappings": {
    "properties": {
      "title": {
        "type": "keyword"
      },
      "authors": {
        "type": "nested",
        "properties": {
          "code": {
            "type": "keyword"
          },
          "name": {
            "type": "keyword"
          },
          "page_written": {
            "type": "integer"
          },
          "nationality": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

PUT authors
{
  "mappings": {
    "properties": {
      "code": {
        "type": "keyword"
      },
      "name": {
        "type": "keyword"
      },
      "nationality": {
        "type": "keyword"
      }
    }
  }
}
Some authors:
PUT authors/_doc/a1
{
  "code": "a1",
  "name": "John",
  "nationality": "french"
}

PUT authors/_doc/a2
{
  "code": "a2",
  "name": "Jack",
  "nationality": "english"
}
Policy and pipeline:
PUT /_enrich/policy/enrich_author_policy
{
  "match": {
    "indices": "authors",
    "match_field": "code",
    "enrich_fields": [
      "nationality"
    ]
  }
}

POST /_enrich/policy/enrich_author_policy/_execute

PUT /_ingest/pipeline/enrich_author_pipeline
{
  "processors": [
    {
      "foreach": {
        "field": "authors",
        "processor": {
          "enrich": {
            "policy_name": "enrich_author_policy",
            "field": "_ingest._value.code",
            "target_field": "_ingest._value"
          }
        }
      }
    }
  ]
}
Then I add a book using my pipeline:
PUT books/_doc/b1?pipeline=enrich_author_pipeline
{
  "title": "Programming 101",
  "authors": [
    {
      "code": "a1",
      "name": "John",
      "page_written": 120
    },
    {
      "code": "a2",
      "name": "Jack",
      "page_written": 113
    }
  ]
}
The nationality of each author is retrieved, but the existing author data in the books document is overwritten instead of enriched.
Result:
{
  "title" : "Programming 101",
  "authors" : [
    {
      "nationality" : "french",
      "code" : "a1"
    },
    {
      "nationality" : "english",
      "code" : "a2"
    }
  ]
}
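The overwrite can be reproduced without indexing anything by sending the same document through the simulate pipeline API. This is only a minimal sketch, assuming the enrich policy has already been executed and the pipeline above exists. The request would be POST /_ingest/pipeline/enrich_author_pipeline/_simulate with this body:

```json
{
  "docs": [
    {
      "_source": {
        "title": "Programming 101",
        "authors": [
          { "code": "a1", "name": "John", "page_written": 120 },
          { "code": "a2", "name": "Jack", "page_written": 113 }
        ]
      }
    }
  ]
}
```

The response contains the transformed _source for each entry in docs, which makes it easier to compare different target_field settings side by side.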
Expected result:
{
  "title" : "Programming 101",
  "authors" : [
    {
      "code": "a1",
      "name": "John",
      "page_written": 120,
      "nationality" : "french"
    },
    {
      "code": "a2",
      "name": "Jack",
      "page_written": 113,
      "nationality" : "english"
    }
  ]
}
Is there a way to avoid overwriting the existing data?
I know I can write to a separate target field, as in the example in the documentation, but then I get an odd result with duplicated information:
{
  "title" : "Programming 101",
  "authors" : [
    {
      "code": "a1",
      "name": "John",
      "page_written": 120,
      "infos": {
        "code": "a1",
        "nationality" : "french"
      }
    },
    {
      "code": "a2",
      "name": "Jack",
      "page_written": 113,
      "infos": {
        "code": "a2",
        "nationality" : "english"
      }
    }
  ]
}
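One possible direction from this variant, as an untested sketch rather than a confirmed solution: keep the enrich step writing to a temporary sub-field, then copy nationality up one level and remove the temporary object. The pipeline name enrich_author_pipeline_v2 is hypothetical, the sub-field name infos is taken from the result above, and I am assuming the template {{_ingest._value.infos.nationality}} resolves inside foreach like other _ingest._value paths. The body for PUT /_ingest/pipeline/enrich_author_pipeline_v2 might look like this:

```json
{
  "description": "Untested sketch: enrich into a temporary 'infos' sub-field, copy nationality, then drop 'infos'",
  "processors": [
    {
      "foreach": {
        "field": "authors",
        "processor": {
          "enrich": {
            "policy_name": "enrich_author_policy",
            "field": "_ingest._value.code",
            "target_field": "_ingest._value.infos"
          }
        }
      }
    },
    {
      "foreach": {
        "field": "authors",
        "processor": {
          "set": {
            "field": "_ingest._value.nationality",
            "value": "{{_ingest._value.infos.nationality}}"
          }
        }
      }
    },
    {
      "foreach": {
        "field": "authors",
        "processor": {
          "remove": {
            "field": "_ingest._value.infos"
          }
        }
      }
    }
  ]
}
```

Note that in this sketch an author whose code has no match in the authors index would have no infos object, so the set and remove steps would probably need extra handling for that case.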
Thanks.