Can I apply split processor on more than one field?

I have a number of fields that I want to perform the same operation on in a reindex task:

  • split the values in a string into an array field,
  • do the same thing on several source fields.

Here is my reindex command:

POST _reindex
{
  "source": {
    "index": "410"
  },
  "dest": {
    "index": "411",
    "pipeline": "split1"
  }
}

Here is the current pipeline, which splits the values of the q5 field from index 410 during the reindex. That works fine.

PUT _ingest/pipeline/split1
{
  "processors": [
    {
      "split": {
        "field": "q5",
        "separator": "[a-z][.] "
      }
    }
  ]
}

Now I would like to do the same for other fields, e.g. q6 and q7, but the split processor doesn't allow me to specify a list such as "field": ["q5","q6","q7"].

How would I realize this another way?

What about this?

PUT _ingest/pipeline/split1
{
  "processors": [
    {
      "split": {
        "field": "q5",
        "separator": "[a-z][.] "
      }
    }, {
      "split": {
        "field": "q6",
        "separator": "[a-z][.] "
      }
    }
  ]
}

Great, that works fine, as I realize now (I had actually tested that already).
I didn't fully grasp the error messages I received, which were caused by the q6 field being missing in some documents; only the document containing both q5 and q6 got reindexed.

There is an "ignore_missing": true option that I just found, which solved the problem.
This is what it looks like now:

PUT _ingest/pipeline/split1
{
  "processors": [
    {
      "split": {
        "field": "q5",
        "separator": "[a-z][.] ",
        "ignore_missing": true
      }
    }, {
      "split": {
        "field": "q6",
        "separator": "[a-z][.] ",
        "ignore_missing": true
      }
    }
  ]
}
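
In case it helps anyone else, the pipeline can be checked before running the reindex by sending a couple of sample documents through the simulate API. This is only a sketch: the document values below are made up for illustration (they are not taken from index 410), and the second document deliberately has no q6 field so that the effect of "ignore_missing": true is visible.

```console
POST _ingest/pipeline/split1/_simulate
{
  "docs": [
    {
      "_source": {
        "q5": "a. first b. second",
        "q6": "a. third b. fourth"
      }
    },
    {
      "_source": {
        "q5": "a. only q5 is present here"
      }
    }
  ]
}
```

With "ignore_missing": true on both processors, the second document should come back processed rather than with an error for the missing q6 field. Removing the option from the q6 processor and simulating again should reproduce the failure I ran into during the reindex.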

Thanks for your fast reply; I can stick with my plan :wink:
