Question on ingest pipeline split operation

For a given string {a}{b}{c}, I want {a}, {b} and {c} as separate outputs.
But what I get is {a, b and c}.

The pipeline

PUT _ingest/pipeline/my_pipeline
{
  "processors": [
    {
      "split": {
        "field": "raw_string",
        "separator": "\\}\\{"
      }
    }
  ]
}

The simulate request:

POST _ingest/pipeline/my_pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "raw_string": "{s}{a}{d}"
      }
    }
  ]
}

The sad output I get:

{
  "docs" : [
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_type",
        "_id" : "_id",
        "_source" : {
          "raw_string" : [
            "{s",
            "a",
            "d}"
          ]
        },
        "_ingest" : {
          "timestamp" : "2019-03-04T06:00:28.739Z"
        }
      }
    }
  ]
}

Any ideas?

Hi,
if you don't want to create a custom script for this, here is a very simple idea: strip the opening braces with gsub, then split on the closing brace.

PUT _ingest/pipeline/my_pipeline
{
  "processors": [
    {
      "gsub": {
        "field": "raw_string",
        "pattern": "\\{",
        "replacement": ""
      }
    },
    {
      "split": {
        "field": "raw_string",
        "separator": "}"
      }
    }
  ]
}
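
To see what these two processors do to the sample string without a running node, they can be approximated in Python. This is only a sketch: the helper names gsub and split are hypothetical stand-ins, not Elasticsearch APIs, and it assumes the split processor behaves like Java's String.split, which discards trailing empty strings (Python's re.split keeps them, so the helper trims them).

```python
import re

def gsub(value, pattern, replacement):
    # Hypothetical stand-in for the gsub processor: replace every regex match.
    return re.sub(pattern, replacement, value)

def split(value, separator):
    # Hypothetical stand-in for the split processor.
    # Assumption: trailing empty strings are dropped, as Java's String.split does.
    parts = re.split(separator, value)
    while parts and parts[-1] == "":
        parts.pop()
    return parts

raw_string = "{s}{a}{d}"
stripped = gsub(raw_string, r"\{", "")  # opening braces removed
tokens = split(stripped, r"\}")         # split on each closing brace
print(stripped)
print(tokens)
```

Because every token now ends with a closing brace, the only empty piece is the trailing one, which is dropped under the assumption above.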

Maybe this?

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "test",
    "processors": [
      {
        "split": {
          "field": "raw_string",
          "separator": "\\}\\{|\\{|\\}"
        }
      }
    ]
  },
  "docs": [
    {
      "_index": "index",
      "_type": "_doc",
      "_id": "id",
      "_source": {
        "raw_string": "{s}{a}{d}"
      }
    }
  ]
}

It gives:

{
  "docs" : [
    {
      "doc" : {
        "_index" : "index",
        "_type" : "_doc",
        "_id" : "id",
        "_source" : {
          "raw_string" : [
            "",
            "s",
            "a",
            "d"
          ]
        },
        "_ingest" : {
          "timestamp" : "2019-03-04T18:07:40.378708Z"
        }
      }
    }
  ]
}
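
The leading empty string comes from the separator matching at the very start of the value. A rough Python sketch of the two separators used in this thread shows the difference. The helper split_like_processor is hypothetical, and it assumes the processor drops trailing empty strings the way Java's String.split does; the final filtering step is an illustration of cleaning up the result, not something the split processor does itself.

```python
import re

def split_like_processor(value, separator):
    # Hypothetical approximation of the split processor.
    # Assumption: trailing empty strings are dropped, leading ones are kept.
    parts = re.split(separator, value)
    while parts and parts[-1] == "":
        parts.pop()
    return parts

raw_string = "{s}{a}{d}"

# Separator from the question: only matches between tokens,
# so the outer braces stay attached to the first and last token.
original = split_like_processor(raw_string, r"\}\{")

# Separator from this reply: also matches the outer braces,
# which leaves an empty string before the first token.
alternation = split_like_processor(raw_string, r"\}\{|\{|\}")

# Illustrative cleanup: drop empty strings from the result.
cleaned = [part for part in alternation if part]

print(original)
print(alternation)
print(cleaned)
```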

Thanks for the reply. It got me started.

Thanks for the reply. It is working now.
