Elastic Ingest with multiple grok processors

JamesFx · January 3, 2017, 3:27pm

Happy New Year everyone!

I am trying to configure an ingest pipeline and I want to apply grok to two distinct fields. I couldn't find a way to do this. Does anyone have a suggestion on how to accomplish this?

Thank you!

dadoonet · January 3, 2017, 3:40pm

You can add 2 grok processors (one for each field) in the pipeline.

JamesFx · January 3, 2017, 4:06pm

That was my first try.

I am using version 5.1 and it only takes into account the last grok.

"processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{STATUS:status};\\s+%{WORD:service};\\s+%{MSG:message}\\|\\s+%{WORD:key01}=%{NUMBER:value01}.*",
          "%{STATUS:status};\\s+%{WORD:service};\\s+%{GREEDYDATA:message}"
        ],
        "pattern_definitions": {
          "STATUS": "\\d+",
          "MSG": ".+?"
        }
      },
      "grok": {
        "field": "source",
        "patterns": [".+?%{TIMESTAMP:timestamp}.+"],
        "pattern_definitions": {
          "TIMESTAMP": "[0-9]+"
        }
      },
      "date" : {
        "field" : "timestamp",
        "formats" : ["UNIX_MS"]
      },
      "remove": {
        "field": "timestamp"
      }
    }
  ]

dadoonet · January 3, 2017, 4:24pm

Can you check with the verbose parameter whether only one grok is actually applied?

See https://www.elastic.co/guide/en/elasticsearch/reference/current/simulate-pipeline-api.html#ingest-verbose-param
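
For reference, a simulate call with verbose enabled might be assembled as in the sketch below. The helper name `build_simulate_request` and the `http://localhost:9200` host are assumptions made for illustration, and nothing is actually sent over the network. The `_ingest/pipeline/_simulate?verbose` endpoint is the one described at the link above.

```python
import json

def build_simulate_request(processors, docs, host="http://localhost:9200"):
    """Return (url, body) for a verbose simulate call.

    Hypothetical helper; the default host is an assumption.
    """
    url = host + "/_ingest/pipeline/_simulate?verbose"
    body = {"pipeline": {"processors": processors}, "docs": docs}
    return url, json.dumps(body, indent=2)

# One processor per object in the list, so each one is reported separately.
processors = [
    {"grok": {"field": "source",
              "patterns": [".+?%{TIMESTAMP:timestamp}.+"],
              "pattern_definitions": {"TIMESTAMP": "[0-9]+"}}},
    {"date": {"field": "timestamp", "formats": ["UNIX_MS"]}},
]
docs = [{"_source": {"source": "/opt/shared-nagios/nagios_file_1483453815014.txt"}}]

url, body = build_simulate_request(processors, docs)
print("POST", url)
print(body)
```

With verbose set, the response should contain one `processor_results` entry per processor that ran, which makes it easy to see whether a processor is missing.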

JamesFx · January 3, 2017, 6:10pm

My processors:

"processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{STATUS:status};\\s+%{WORD:service};\\s+%{MSG:message}\\|\\s+%{WORD:key01}=%{NUMBER:value01}.*",
          "%{STATUS:status};\\s+%{WORD:service};\\s+%{GREEDYDATA:message}"
        ],
        "pattern_definitions": {
          "STATUS": "\\d+",
          "MSG": ".+?"
        }
      },
      "grok": {
        "field": "source",
        "patterns": [
          ".+?%{TIMESTAMP:timestamp}.+"
        ],
        "pattern_definitions": {
          "TIMESTAMP": "[0-9]+"
        }
      },
      "date": {
        "field": "timestamp",
        "formats": [
          "UNIX_MS"
        ]
      }
    }
  ]

My Doc:

  "docs": [
    {
      "_index": "filebeat-test-2017.01.03",
      "_type": "log",
      "_id": "AVlkvVA78rJBRllM9MQ-",
      "_source": {
        "source": "/opt/shared-nagios/nagios_file_1483453815014.txt",
        "message": "0; SERVICE_A; online | time=1s",
        "@timestamp": "2017-01-03T14:30:15.311Z"
      },
      "fields": {
        "@timestamp": [
          1483453815311
        ]
      },
      "sort": [
        1483453815311
      ]
    }
  ]

My result:

{
  "docs": [
    {
      "processor_results": [
        {
          "doc": {
            "_id": "AVlkvVA78rJBRllM9MQ-",
            "_index": "filebeat-test-2017.01.03",
            "_type": "log",
            "_source": {
              "@timestamp": "2017-01-03T14:30:15.311Z",
              "source": "/opt/shared-nagios/nagios_file_1483453815014.txt",
              "message": "0; SERVICE_A; online | time=1s",
              "timestamp": "1483453815014"
            },
            "_ingest": {
              "timestamp": "2017-01-03T18:10:23.455+0000"
            }
          }
        },
        {
          "doc": {
            "_id": "AVlkvVA78rJBRllM9MQ-",
            "_index": "filebeat-test-2017.01.03",
            "_type": "log",
            "_source": {
              "@timestamp": "2017-01-03T14:30:15.014Z",
              "source": "/opt/shared-nagios/nagios_file_1483453815014.txt",
              "message": "0; SERVICE_A; online | time=1s",
              "timestamp": "1483453815014"
            },
            "_ingest": {
              "timestamp": "2017-01-03T18:10:23.455+0000"
            }
          }
        }
      ]
    }
  ]
}
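
One way to read this output is to compare `_source` from one step to the next. The sketch below does that for the values shown above; the helper name `changed_fields` is an assumption made for illustration, and the dictionaries are copied by hand from the result rather than fetched from a cluster.

```python
def changed_fields(before, after):
    """Return the keys of `after` that are new or differ from `before`."""
    return sorted(k for k, v in after.items() if before.get(k) != v)

original = {
    "source": "/opt/shared-nagios/nagios_file_1483453815014.txt",
    "message": "0; SERVICE_A; online | time=1s",
    "@timestamp": "2017-01-03T14:30:15.311Z",
}

# _source after each entry in processor_results, copied from the result above.
step_sources = [
    {**original, "timestamp": "1483453815014"},
    {**original, "timestamp": "1483453815014",
     "@timestamp": "2017-01-03T14:30:15.014Z"},
]

previous = original
for step, current in enumerate(step_sources, start=1):
    print(step, changed_fields(previous, current))
    previous = current
# prints:
# 1 ['timestamp']
# 2 ['@timestamp']
```

Only two steps are reported for three declared processors: the first adds `timestamp` (the grok on `source`) and the second rewrites `@timestamp` (the date processor). Neither adds `status` or `service`, so the grok on `message` never ran.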

dadoonet · January 3, 2017, 6:34pm

I think each processor needs to be its own object in the processors array:

{
   "processors":[
      {
         "grok":{
            "field":"message",
            "patterns":[
               "%{STATUS:status};\\s+%{WORD:service};\\s+%{MSG:message}\\|\\s+%{WORD:key01}=%{NUMBER:value01}.*",
               "%{STATUS:status};\\s+%{WORD:service};\\s+%{GREEDYDATA:message}"
            ],
            "pattern_definitions":{
               "STATUS":"\\d+",
               "MSG":".+?"
            }
         }
      },
      {
         "grok":{
            "field":"source",
            "patterns":[
               ".+?%{TIMESTAMP:timestamp}.+"
            ],
            "pattern_definitions":{
               "TIMESTAMP":"[0-9]+"
            }
         }
      },
      {
         "date":{
            "field":"timestamp",
            "formats":[
               "UNIX_MS"
            ]
         }
      }
   ]
}
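
The earlier version put two "grok" keys inside a single object, and a JSON object cannot usefully hold the same key twice. The sketch below shows the effect with Python's standard `json` module. This is an illustration only, not Elasticsearch's own parser, although keeping the last value matches what was observed in this thread. The hook name `reject_duplicate_keys` is an assumption.

```python
import json

def reject_duplicate_keys(pairs):
    """object_pairs_hook that raises if a JSON object repeats a key."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError("duplicate key: " + key)
        seen[key] = value
    return seen

# Same shape as the broken pipeline: one object holding two "grok" keys.
broken = ('{"grok": {"field": "message"}, '
          '"grok": {"field": "source"}, '
          '"date": {"field": "timestamp"}}')

# Python's default parser silently keeps the last value for a repeated key.
print(json.loads(broken))
# prints: {'grok': {'field': 'source'}, 'date': {'field': 'timestamp'}}

# With the hook, the duplicate is reported instead of being dropped.
try:
    json.loads(broken, object_pairs_hook=reject_duplicate_keys)
except ValueError as err:
    print(err)
# prints: duplicate key: grok
```

A check like this can be run on a pipeline definition before uploading it, so that a repeated processor name is caught early.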

JamesFx · January 3, 2017, 6:54pm

OMG... I am such a noob....

Sorry for that!

Thank you!

