Is this the expected behaviour of the split processor in an ingest pipeline?

I am on ES 7.2.0

I create a simple pipeline:

PUT _ingest/pipeline/test_pipeline
{
  "description": "test",
  "processors": [
    {
      "split": {
        "field": "message",
        "target_field": "splitdata",
        "separator": ","
      }
    }
  ]
}

Then I test it.

GET _ingest/pipeline/test_pipeline/_simulate
{
  "docs": [
    {
      "_source" :{
        "message" : "A,,B,,"
      }
    }
  ]
}

Results come back like this:

{
  "docs" : [
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_doc",
        "_id" : "_id",
        "_source" : {
          "message" : "A,,B,,",
          "splitdata" : [
            "A",
            "",
            "B"
          ]
        },
        "_ingest" : {
          "timestamp" : "2019-10-23T04:25:26.277Z"
        }
      }
    }
  ]
}

I was expecting two more empty strings after "B" in splitdata, but the trailing empty fields are dropped.

With an input like this:

GET _ingest/pipeline/test_pipeline/_simulate
{
  "docs": [
    {
      "_source" :{
        "message" : "A,,B,,C"
      }
    }
  ]
}

I get the expected fields, including the empty string between "B" and "C":

{
  "docs" : [
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_doc",
        "_id" : "_id",
        "_source" : {
          "message" : "A,,B,,C",
          "splitdata" : [
            "A",
            "",
            "B",
            "",
            "C"
          ]
        },
        "_ingest" : {
          "timestamp" : "2019-10-23T04:27:38.400Z"
        }
      }
    }
  ]
}
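
For comparison, the same pattern shows up with plain Java's String.split: with the default limit, trailing empty strings are removed, while a negative limit keeps them. I am assuming (not confirmed from the Elasticsearch source) that the split processor calls String.split with the default limit, which would explain both results above. A minimal sketch, where the class and method names are made up for illustration:

```java
import java.util.Arrays;

// Sketch only: assumes the split processor behaves like Java's String.split.
// SplitDemo, splitDefault and splitKeepTrailing are hypothetical names.
class SplitDemo {
    // Default split (limit 0): trailing empty strings are removed.
    static String[] splitDefault(String value, String separator) {
        return value.split(separator);
    }

    // Negative limit: trailing empty strings are kept.
    static String[] splitKeepTrailing(String value, String separator) {
        return value.split(separator, -1);
    }

    public static void main(String[] args) {
        // Same inputs as the two _simulate requests above.
        System.out.println(Arrays.toString(splitDefault("A,,B,,", ",")));      // [A, , B]
        System.out.println(Arrays.toString(splitKeepTrailing("A,,B,,", ","))); // [A, , B, , ]
        System.out.println(Arrays.toString(splitDefault("A,,B,,C", ",")));     // [A, , B, , C]
    }
}
```

The first and third lines match the splitdata arrays I got from the pipeline. The second line is what I was expecting for "A,,B,,".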

Edit: Not very hopeful now.
