Enrich filebeat content using ingest pipeline


#1

I want to use filebeat to send Oracle alert logs to Elasticsearch. So far I've managed to get the data in (using some multiline magic). Ideally I want to use a pipeline on the ES side to extract the date, errors (ORA-), keywords etc. from the message and have them as separate fields in the indexes. I vaguely remember seeing something similar demonstrated at Elastic{ON} London, wish I'd been paying more attention!

If anyone knows if/how this can be done, I'd really appreciate some guidance.

Many thanks,
Dave

EDIT: OK, so I found some documentation: https://www.elastic.co/guide/en/elasticsearch/reference/master/grok-processor.html

So now I have this:

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "grok test",
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": ["%{DATESTAMP_OTHER:ora-date}"]
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "Wed Aug 26 16:29:35 2015\nStarting background process CJQ0"
      }
    }
  ]
}

The issue I have is that the Oracle date doesn't appear to be in a standard format. Can I extract it using Grok?

The above gives me the following (presumably because the date is in the wrong format):

java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Provided Grok expressions do not match field value: [Wed Aug 26 16:29:35 2015\nStarting background process CJQ0


#2

Just in case anyone stumbles upon this, I've managed to extract the date successfully. I'm no regexpert (haha!), and I'm sure there's a more graceful way, but it is working:

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "grok test",
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": ["%{ORADATE:ora-date}"],
          "pattern_definitions": {
            "ORADATE": "^[A-Z]{1}[a-z]{2} [A-Z]{1}[a-z]{2} [0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2} [0-9]{4}"
          }
        }
      },
      {
        "date": {
          "field": "ora-date",
          "formats": ["EEE MMM dd HH:mm:ss yyyy"]
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "Wed Aug 26 16:29:35 2015\nStarting background process CJQ0"
      }
    }
  ]
}
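
A similar approach might work for the ORA- error codes. This is only a sketch: the ORAERR pattern name, the ora_error field, and the sample message are made up for illustration, and it assumes the codes always look like ORA- followed by five digits. ignore_failure is set because most alert log entries will not contain an error, and without it the grok processor would reject those documents.

```
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "grok ORA- error sketch",
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": ["%{ORAERR:ora_error}"],
          "pattern_definitions": {
            "ORAERR": "ORA-[0-9]{5}"
          },
          "ignore_failure": true
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "Wed Aug 26 16:29:35 2015\nORA-00600: internal error code"
      }
    }
  ]
}
```

Grok keeps only the first match here, so a message containing several ORA- codes would need more work.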

(ruflin) #3

@dwjvaughan Glad you found a solution and thanks for sharing it with the community.


(Steffen Siering) #4

Great you got this working.

The ingest node grok processor ships with some default grok patterns (see GitHub). I'm not sure how you could reuse these patterns or extend them by installing your own.
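
For what it's worth, entries in pattern_definitions can reference the built-in patterns by name, so the date pattern from post #2 could probably be composed from them instead of hand-written character classes. The following is a sketch, assuming the stock DAY, MONTH, MONTHDAY, TIME and YEAR patterns are available in your version; ORADATE remains a custom name. As far as I can tell, the built-in DATESTAMP_OTHER expects a time zone between the time and the year, which would explain why it did not match the Oracle format.

```
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "grok test reusing built-in patterns",
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": ["^%{ORADATE:ora-date}"],
          "pattern_definitions": {
            "ORADATE": "%{DAY} %{MONTH} %{MONTHDAY} %{TIME} %{YEAR}"
          }
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "Wed Aug 26 16:29:35 2015\nStarting background process CJQ0"
      }
    }
  ]
}
```

The built-in patterns are somewhat looser than the hand-written one (MONTHDAY also accepts single-digit days, for example), so it is worth running a few real alert log lines through _simulate before relying on it.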

