Grok pattern to insert log files into Elasticsearch through Filebeat


(dimitri) #1

Hi everybody, I'm new to the Elasticsearch community and I would like your help with something I'm struggling with.
My goal is to send a large quantity of log files to Elasticsearch using Filebeat.
To do that, I need to parse the data using ingest nodes with the Grok processor. Without that, my logs are not usable, because each line ends up entirely in the same "message" field. Unfortunately, I have some issues with the grok regex and I can't find the problem, as it's the first time I've worked with it.
My logs look like this:

2016-09-01T10:58:41+02:00 INFO (6): 	165.225.76.76	entreprise1	email1@gmail.com	POST	/application/controller/action	Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko	{"getid":"1"}	86rkt2dqsdze5if1bqldfl1
2016-09-01T10:58:41+02:00 INFO (6): 	165.225.76.76	entreprise2	email2@gmail.com	POST	/application/controller/action	Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko	{"getid":"2"}	86rkt2rgdgdfgdfgeqldfl1
2016-09-01T10:58:41+02:00 INFO (6): 	165.225.76.76	entreprise3	email3@gmail.com	POST	/application/controller/action	Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko	{"getid":"2"}

So the fields are separated by tabs, and they are:
date, ip, company_name, email, method (POST, GET), url, browser, json_request, optional_code
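
As a rough illustration of that layout, here is a minimal Python sketch that splits one sample line on tabs and names the pieces. It only shows how the fields line up; it is not grok and not part of the pipeline. The function name `split_log_line` and the extra `level` key (for the "INFO (6):" text that follows the date) are made up for this example.

```python
# Field names follow the list above, in order, after the leading date/level part.
FIELDS = ["ip", "company_name", "email", "method", "url",
          "browser", "json_request", "optional_code"]


def split_log_line(line):
    """Split one tab-separated log line into a dict of named fields."""
    head, *rest = line.rstrip("\n").split("\t")
    # head looks like "2016-09-01T10:58:41+02:00 INFO (6): "
    date, level = head.split(" ", 1)
    record = {"date": date, "level": level.strip()}  # "level" is a hypothetical name
    # zip stops at the shorter sequence, so a missing optional_code is simply absent
    record.update(zip(FIELDS, rest))
    return record


# Sample lines copied from the logs above, with the tabs written explicitly.
with_code = ("2016-09-01T10:58:41+02:00 INFO (6): \t165.225.76.76\tentreprise1\t"
             "email1@gmail.com\tPOST\t/application/controller/action\t"
             "Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko\t"
             "{\"getid\":\"1\"}\t86rkt2dqsdze5if1bqldfl1")
without_code = ("2016-09-01T10:58:41+02:00 INFO (6): \t165.225.76.76\tentreprise3\t"
                "email3@gmail.com\tPOST\t/application/controller/action\t"
                "Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko\t"
                "{\"getid\":\"2\"}")

record_with_code = split_log_line(with_code)
record_without_code = split_log_line(without_code)
```

The browser value contains spaces, which is why the split is done on tabs only.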

My ingest pipeline JSON looks like this:

PUT _ingest/pipeline/my_elastic_index

    {
      "description" : "Convert logs txt files",
      "processors" : [
        {
          "grok": {
            "field": "message",
            "patterns": ["%{TIMESTAMP_ISO8601:timestamp} %{IP:ip} %{WORD:company}% {EMAILADDRESS:email} %{WORD:method} %{URIPATH:page} %{WORD:browser} %{WORD:code}"]

          }
        },
        {
          "date" : {
            "field" : "timestamp",
            "formats" : ["yyyy-MM-ddTHH:mm:ss INFO(6):"]
          }
        }
      ],
      "on_failure" : [
        {
          "set" : {
            "field" : "error",
            "value" : " - Error processing message - "
          }
        }
      ]
    }

This does not work.

  1. How can I escape characters? For example, the "INFO (6):" at the end of the timestamp.
  2. Can I just use a space between fields?
  3. The code at the end of the line is not always present in the logs; can this be a problem?
  4. Do you have any idea why this configuration doesn't parse my log documents at all in Elasticsearch?
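
To make questions 1 to 3 concrete, here is a sketch of the same matching problem written as a plain Python regular expression. This is Python `re` syntax, not grok syntax, and it has not been checked against Elasticsearch. It only illustrates the three ideas: escaping the literal parentheses in "INFO (6):", matching on tabs rather than spaces, and making the trailing code optional. The name `LINE_RE` is made up for this example; the group names are taken from the pipeline above, except `request`, which is also made up.

```python
import re

LINE_RE = re.compile(
    r"^(?P<timestamp>\S+) INFO \(6\):\s*"  # literal parentheses escaped with a backslash
    r"(?P<ip>[^\t]+)\t"                    # each field runs up to the next tab
    r"(?P<company>[^\t]+)\t"
    r"(?P<email>[^\t]+)\t"
    r"(?P<method>[^\t]+)\t"
    r"(?P<page>[^\t]+)\t"
    r"(?P<browser>[^\t]+)\t"               # the browser string contains spaces
    r"(?P<request>[^\t]+)"
    r"(?:\t(?P<code>[^\t]+))?$"            # the trailing code may be absent
)

# Sample lines copied from the logs above, with the tabs written explicitly.
line_with_code = ("2016-09-01T10:58:41+02:00 INFO (6): \t165.225.76.76\tentreprise2\t"
                  "email2@gmail.com\tPOST\t/application/controller/action\t"
                  "Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko\t"
                  "{\"getid\":\"2\"}\t86rkt2rgdgdfgdfgeqldfl1")
line_without_code = ("2016-09-01T10:58:41+02:00 INFO (6): \t165.225.76.76\tentreprise3\t"
                     "email3@gmail.com\tPOST\t/application/controller/action\t"
                     "Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko\t"
                     "{\"getid\":\"2\"}")

match_with_code = LINE_RE.match(line_with_code)
match_without_code = LINE_RE.match(line_without_code)
```

Both sample lines match; for the line without a trailing code, the `code` group is simply empty (`None` in Python).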

Thanks a lot for your help, and excuse my English; I'm French.

