I'm using the pipeline below to ingest an array of documents:
PUT _ingest/pipeline/attachment
{
  "description": "Extract attachment information",
  "processors": [
    {
      "foreach": {
        "field": "attachments",
        "processor": {
          "attachment": {
            "field": "_ingest._value.data",
            "target_field": "_ingest._value.attachment",
            "properties": [ "content" ]
          }
        }
      }
    }
  ]
}
As a sample request:
PUT jh_index/my_type/my_id?pipeline=attachment
{
  "attachments": [
    {
      "data": "_encoded document - large amount of base64 guff_"
    },
    {
      "data": "_another encoded document - even more base64 guff_"
    }
  ]
}
This works like a champ; however, what I end up with is:
{
  "_index": "jh_index",
  "_type": "my_type",
  "_id": "my_id",
  "_version": 15,
  "found": true,
  "_source": {
    "attachments": [
      {
        "data": "**_Large amount of base 64 guff I don't want_**",
        "attachment": {
          "content": "NEW HEADING1\nLorem ipsum dolor ......."
        }
      },
      {
        "data": "**_Another large amount of base 64 guff I don't want_**",
        "attachment": {
          "content": "HEADING1\nClick Insert and then choose the ........."
        }
      }
    ]
  }
}
I've tried using "processors" rather than "processor" inside the foreach so that I can add a second processor:
{
  "remove": { "field": "_ingest._value.data" }
}
but that option seems to have been removed on purpose (see "Modify foreach processor to accept a single processor instead of collection", #19345).
I also don't seem to be able to have two "foreach" processors one after the other in a pipeline.
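To make that concrete, the two-step variant I have in mind is roughly the sketch below, sent as the body of the same PUT _ingest/pipeline/attachment request. The second "foreach" wrapping a "remove" processor is my assumption about how this would be written; I have not confirmed that it is accepted.

```json
{
  "description": "Extract attachment information, then drop the raw base64",
  "processors": [
    {
      "foreach": {
        "field": "attachments",
        "processor": {
          "attachment": {
            "field": "_ingest._value.data",
            "target_field": "_ingest._value.attachment",
            "properties": [ "content" ]
          }
        }
      }
    },
    {
      "foreach": {
        "field": "attachments",
        "processor": {
          "remove": {
            "field": "_ingest._value.data"
          }
        }
      }
    }
  ]
}
```

The idea is that the first pass extracts the content from each element and the second pass deletes the "data" field from each element, so only "attachment.content" would be left in _source.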
How do I remove the attachments.data field?
Thx
J/.