Ingest pipeline should work based on conditions

Hi, I have created an ingest pipeline using the grok processor. My pipeline definition is below.

Ingest pipeline with grok

    PUT _ingest/pipeline/dissectpipeline
    {
      "description": "split message content",
      "processors": [
        {
          "grok": {
            "field": "message",
            "patterns": ["agentId:%{NOTSPACE:agentId}", "statuscode:%{NOTSPACE:statuscode}"]
          }
        }
      ]
    }

Pipeline simulation example

    POST /_ingest/pipeline/dissectpipeline/_simulate
    {
      "docs": [
        {
          "_index": "index",
          "_id": "id",
          "_source": {
            "message": "agentId:F00356"
          }
        },
        {
          "_index": "index",
          "_id": "id",
          "_source": {
            "message": "statuscode:400"
          }
        },
        {
          "_index": "index",
          "_id": "id",
          "_source": {
            "message": "log event SQL Success:200"
          }
        }
      ]
    }

For the first and second documents it works as expected, but for the third document it throws an error saying the grok pattern did not match. The same thing happens when indexing my real-time documents.

My requirement is that grok should run only if the message field of a document contains specific text; otherwise the document should skip grok. When I call this pipeline from Filebeat, all documents should still be indexed in Elasticsearch, but only the documents whose message field contains specific text such as agentId, statuscode, or serviceclient should be grokked, with the respective fields created within the same document. Could someone please help me achieve this? Thanks in advance.

Note: Not all of my incoming documents contain the grok pattern text in the message field; the message content varies a lot.

If any clarification is needed, let me know.
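
One possible way to meet the requirement above, as a minimal sketch rather than a definitive answer: set `ignore_failure` on the grok processor so that a document whose message matches none of the patterns is indexed unchanged instead of being rejected. `ignore_missing` is added on the assumption that some documents may have no message field at all. Note that `ignore_failure` suppresses every failure of that processor, not only the "no match" case.

    # Sketch only: the same pipeline as above with two extra options
    PUT _ingest/pipeline/dissectpipeline
    {
      "description": "split message content",
      "processors": [
        {
          "grok": {
            "field": "message",
            "patterns": [
              "agentId:%{NOTSPACE:agentId}",
              "statuscode:%{NOTSPACE:statuscode}"
            ],
            "ignore_missing": true,
            "ignore_failure": true
          }
        }
      ]
    }

Re-running the simulation request above against this version is a quick way to check how the third document is handled.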

Hi guys, I have done something like this and it seems to be working fine.

**Pipeline**

    PUT _ingest/pipeline/dissectpipeline
    {
      "description": "split message content",
      "processors": [
        {
          "grok": {
            "field": "message",
            "patterns": [
              "CSIC_agentId:%{NOTSPACE:apm_agentId.agentId}",
              "CSIC_statuscode:%{NOTSPACE:apm_statuscode.statuscode}",
              "CSIC_servicename:%{NOTSPACE:apm_servicename.servicename}"
            ]
          }
        }
      ]
    }

**Pipeline with conditions**

    PUT _ingest/pipeline/logs_pipeline
    {
      "description": "A pipeline of pipelines for log files",
      "version": 1,
      "processors": [
        {
          "pipeline": {
            "if": "ctx.message.toLowerCase().contains('csic_agentid')||ctx.message.toLowerCase().contains('csic_statuscode')||ctx.message.toLowerCase().contains('csic_servicename')",
            "name": "dissectpipeline"
          }
        }
      ]
    }

Could someone let me know whether this is the right approach?
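
For comparison, here is a hedged sketch of a variant that puts the `if` condition directly on the grok processor instead of going through a second pipeline. It uses the simulate API with an inline pipeline definition, so nothing is stored. Two things differ from the pipelines above and are assumptions on my part: the condition checks `ctx.message != null` first, so documents without a message field do not fail inside the script, and it uses a case-sensitive `contains`, because the grok patterns themselves are case-sensitive (a lowercased check would let `csic_agentid:...` through to a grok that cannot match it). The sample documents are made up for illustration.

    # Sketch only: inline pipeline, nothing is saved to the cluster
    POST _ingest/pipeline/_simulate
    {
      "pipeline": {
        "description": "grok only messages that carry a CSIC_ key",
        "processors": [
          {
            "grok": {
              "if": "ctx.message != null && (ctx.message.contains('CSIC_agentId:') || ctx.message.contains('CSIC_statuscode:') || ctx.message.contains('CSIC_servicename:'))",
              "field": "message",
              "patterns": [
                "CSIC_agentId:%{NOTSPACE:apm_agentId.agentId}",
                "CSIC_statuscode:%{NOTSPACE:apm_statuscode.statuscode}",
                "CSIC_servicename:%{NOTSPACE:apm_servicename.servicename}"
              ]
            }
          }
        ]
      },
      "docs": [
        {
          "_index": "index",
          "_id": "1",
          "_source": {
            "message": "CSIC_agentId:F00356"
          }
        },
        {
          "_index": "index",
          "_id": "2",
          "_source": {
            "message": "log event SQL Success:200"
          }
        }
      ]
    }

The expectation is that the first document gains the `apm_agentId.agentId` field while the second passes through untouched. Keep in mind that grok stops at the first pattern in the list that matches, so a message containing several of these keys would only have one of them extracted.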
