@BigFunger, @Bargs, and @talevy had a discussion on Zoom earlier today, and this is one of the issues that we discussed.
### Summary
I would like to see the foreach processor reworked so that it only accepts a single processor instead of an array of processors as it does now.
### Background
I have been working on the UI for ingest pipelines, and specifically I have been trying to implement the foreach processor. The UI uses the verbose setting on the simulate API to report back to the user.
This way, when the user adds or edits a processor, I can use the output from the parent processor as the input of the next and provide them with the data necessary to build out their processors. This is a problem in the context of the foreach processor because I can't provide the user with the input and output of each of the processors defined within the foreach processor.
Per our discussion, it also makes sense to structure the foreach processor in this way because it follows the patterns that have been established in the other processors. For example, if you want to apply an `uppercase` processor to more than one field in the document, you need to create one `uppercase` processor for each field you want to act on.
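To make the proposal concrete, here is a rough sketch of what a single-processor foreach might look like. The singular `processor` key is only an assumption for illustration (the name has not been decided), and the tags are hypothetical. Applying two operations to each element would mean chaining two foreach processors, the same way you chain one `uppercase` processor per field today:

``` json
{
  "processors": [
    {
      "foreach": {
        "tag": "processor_2",
        "field": "message",
        "processor": {
          "uppercase": {
            "tag": "processor_3",
            "field": "_value"
          }
        }
      }
    },
    {
      "foreach": {
        "tag": "processor_4",
        "field": "message",
        "processor": {
          "lowercase": {
            "tag": "processor_5",
            "field": "_value"
          }
        }
      }
    }
  ]
}
```

With this shape, each foreach wraps exactly one operation, so the verbose simulate output for the foreach itself would already be the input and output of that one inner processor.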
### Example
With the following pipeline definition:
``` json
{
"pipeline": {
"description": "",
"processors": [
{
"split": {
"tag": "processor_1",
"field": "message",
"separator": " "
}
},
{
"foreach": {
"tag": "processor_2",
"field": "message",
"processors": [
{
"uppercase": {
"tag": "processor_3",
"field": "_value"
}
},
{
"lowercase": {
"tag": "processor_4",
"field": "_value"
}
},
{
"uppercase": {
"tag": "processor_5",
"field": "_value"
}
}
]
}
}
]
},
"docs": [
{
"_source": {
"message": "these are the words of a sentence"
}
}
]
}
```
I would expect the following output:
``` json
{
"docs": [
{
"processor_results": [
{
"tag": "processor_1",
"doc": {
"_type": "_type",
"_id": "_id",
"_index": "_index",
"_source": {
"message": [
"these",
"are",
"the",
"words",
"of",
"a",
"sentence"
]
},
"_ingest": {
"timestamp": "2016-07-06T21:27:14.585+0000"
}
}
},
{
"tag": "processor_3",
"doc": {
"_type": "_type",
"_id": "_id",
"_index": "_index",
"_source": {
"message": [
"THESE",
"ARE",
"THE",
"WORDS",
"OF",
"A",
"SENTENCE"
]
},
"_ingest": {
"timestamp": "2016-07-06T21:27:14.585+0000"
}
}
},
{
"tag": "processor_4",
"doc": {
"_type": "_type",
"_id": "_id",
"_index": "_index",
"_source": {
"message": [
"these",
"are",
"the",
"words",
"of",
"a",
"sentence"
]
},
"_ingest": {
"timestamp": "2016-07-06T21:27:14.585+0000"
}
}
},
{
"tag": "processor_5",
"doc": {
"_type": "_type",
"_id": "_id",
"_index": "_index",
"_source": {
"message": [
"THESE",
"ARE",
"THE",
"WORDS",
"OF",
"A",
"SENTENCE"
]
},
"_ingest": {
"timestamp": "2016-07-06T21:27:14.585+0000"
}
}
},
{
"tag": "processor_2",
"doc": {
"_type": "_type",
"_id": "_id",
"_index": "_index",
"_source": {
"message": [
"THESE",
"ARE",
"THE",
"WORDS",
"OF",
"A",
"SENTENCE"
]
},
"_ingest": {
"timestamp": "2016-07-06T21:27:14.585+0000"
}
}
}
]
}
]
}
```
Instead, I get this back:
``` json
{
"docs": [
{
"processor_results": [
{
"tag": "processor_1",
"doc": {
"_id": "_id",
"_type": "_type",
"_index": "_index",
"_source": {
"message": [
"these",
"are",
"the",
"words",
"of",
"a",
"sentence"
]
},
"_ingest": {
"timestamp": "2016-07-08T21:36:14.400+0000"
}
}
},
{
"tag": "processor_2",
"doc": {
"_id": "_id",
"_type": "_type",
"_index": "_index",
"_source": {
"message": [
"these",
"are",
"the",
"words",
"of",
"a",
"sentence"
]
},
"_ingest": {
"timestamp": "2016-07-08T21:36:14.400+0000"
}
}
}
]
}
]
}
```