How to select or return a specific part from a split ingest pipeline

I have created a split ingest pipeline and it is working fine, but I want to return only the first part of the text.

GET _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "split": {
          "field": "message",
          "separator": ";"
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "SAS Grant Suspension Failure;6825;serial_no.=EB184812696;supplAlarmInfo:SUSPENDED_GRANT;additionalFaultId=682"
      }
    }
  ]
}

This is the output:

{
  "docs" : [
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_doc",
        "_id" : "_id",
        "_source" : {
          "message" : [
            "SAS Grant Suspension Failure",
            "6825",
            "serial_no.=EB184812696",
            "supplAlarmInfo:SUSPENDED_GRANT",
            "additionalFaultId=682"
          ]
        },
        "_ingest" : {
          "timestamp" : "2021-08-01T11:01:55.937955952Z"
        }
      }
    }
  ]
}

I want to return only "SAS Grant Suspension Failure" and store it in a new field.
Please help me with this. Thank you.

You can probably use a dissect or grok processor instead.
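
If you would rather keep the split processor, a rough sketch (assuming the first part is always element 0 of the split result) is to write the array to a separate field and copy its first element with a script processor. The field names parts and new_field are only placeholders for illustration:

GET _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "split": {
          "field": "message",
          "separator": ";",
          "target_field": "parts"
        }
      },
      {
        "script": {
          "source": "ctx.new_field = ctx.parts[0]"
        }
      },
      {
        "remove": {
          "field": "parts"
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "SAS Grant Suspension Failure;6825;serial_no.=EB184812696;supplAlarmInfo:SUSPENDED_GRANT;additionalFaultId=682"
      }
    }
  ]
}

With target_field set, the original message string stays untouched and only the temporary parts array is removed at the end.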

What would be the pattern to extract only "SAS Grant Suspension Failure"?
I am struggling with this.

I figured out the pattern:

{
  "patterns": ["%{INTERESTED_PART:new_field}"],
  "pattern_definitions": {
    "INTERESTED_PART": "SAS Grant Suspension Failure"
  }
}
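
Note that this definition matches only the literal text "SAS Grant Suspension Failure", so a message with a different first part would not match. As a sketch, assuming the part you want is always everything before the first semicolon, a custom pattern like the following should capture it for any message (the name FIRST_PART is made up here, and "field": "message" is assumed from the earlier examples):

{
  "grok": {
    "field": "message",
    "patterns": ["^%{FIRST_PART:new_field}"],
    "pattern_definitions": {
      "FIRST_PART": "[^;]+"
    }
  }
}

The ^ anchors the match at the start of the message, and [^;]+ takes one or more characters up to, but not including, the first semicolon.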

I tested the ingest pipeline and it is working fine. But I want to know how it will automatically process incoming documents and create the new field in the index.

What about:

GET _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "dissect": {
          "field": "message",
          "pattern": "%{label};%{?text}"
        }
      },
      {
        "remove": {
          "field": "message"
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "SAS Grant Suspension Failure;6825;serial_no.=EB184812696;supplAlarmInfo:SUSPENDED_GRANT;additionalFaultId=682"
      }
    }
  ]
}

This gives:

{
  "docs" : [
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_doc",
        "_id" : "_id",
        "_source" : {
          "label" : "SAS Grant Suspension Failure"
        },
        "_ingest" : {
          "timestamp" : "2021-08-02T08:09:03.046277011Z"
        }
      }
    }
  ]
}

Thanks, this also works. The only thing I am struggling with is how to attach this ingest pipeline to the index. Do I need to mention the pipeline somewhere in the configuration file? What's the process?

Once you have created the pipeline, you can use it from the index API:

POST index/_doc?pipeline=your-pipeline-name
{
  "message": "SAS Grant Suspension Failure;6825;serial_no.=EB184812696;supplAlarmInfo:SUSPENDED_GRANT;additionalFaultId=682"
}

Or you can define a default pipeline. See index.default_pipeline on the "Index modules" page of the Elasticsearch Guide [7.13].
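
As a minimal sketch of the default-pipeline route, reusing the placeholder names your-pipeline-name and index from the request above and the dissect processors from earlier in this thread (the description text is only illustrative), first store the pipeline:

PUT _ingest/pipeline/your-pipeline-name
{
  "description": "Keep only the first ;-separated part of message as label",
  "processors": [
    {
      "dissect": {
        "field": "message",
        "pattern": "%{label};%{?text}"
      }
    },
    {
      "remove": {
        "field": "message"
      }
    }
  ]
}

Then set it as the default pipeline of the index:

PUT index/_settings
{
  "index.default_pipeline": "your-pipeline-name"
}

After that, documents indexed into index without a pipeline parameter should go through the pipeline automatically:

POST index/_doc
{
  "message": "SAS Grant Suspension Failure;6825;serial_no.=EB184812696;supplAlarmInfo:SUSPENDED_GRANT;additionalFaultId=682"
}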

Thank you @dadoonet
