NGINX module grok error

Hi,
I'm trying to use the NGINX module on the output of my docker logs.
So far I have managed to get this as output:

{ 
  "@timestamp": "2017-05-17T13:12:48.507Z", 
  "beat": {
    "hostname": "86ba9026f4b1",
    "name": "86ba9026f4b1",
    "version": "5.4.0"
  },
  "input_type": "log",
  "log": "0.0.0.0- - [17/May/2017:13:12:43 +0000] \"GET /test HTTP/1.1\" 304 0 \"http://toto.com/\" \"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:53.0) Gecko/20100101 Firefox/53.0\"",
  "offset": 3146,
  "source": "/var/log/5558ee47831ba97a85166c64e4e5fd6b1afd7dcff62bf546e93f82b99ff43959-json.log",
  "stream": "stdout",
  "time": "2017-05-17T13:12:43.422536215Z",
  "type": "nginx_access"
}

But I keep getting the following message in Kibana:

{
  "@timestamp": "2017-05-17T13:12:48.507Z",
  "beat": {
    "hostname": "86ba9026f4b1",
    "name": "86ba9026f4b1",
    "version": "5.4.0"
  },
  "error": "field [message] not present as part of path [message]",
  "input_type": "log",
  "log": "0.0.0.0- - [17/May/2017:13:12:43 +0000] \"GET /test HTTP/1.1\" 304 0 \"http://toto.com/\" \"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:53.0) Gecko/20100101 Firefox/53.0\"",
  "offset": 3146,
  "source": "/var/log/5558ee47831ba97a85166c64e4e5fd6b1afd7dcff62bf546e93f82b99ff43959-json.log",
  "stream": "stdout",
  "time": "2017-05-17T13:12:43.422536215Z",
  "type": "nginx_access"
}

I don't know why it keeps trying to use the message field. Here is my grok config (in /module/nginx/access/ingest/default.json):

{
  "grok": {
    "field": "log",
    "trace_match": true,
    "patterns": [
      "%{IPORHOST:nginx.access.remote_ip} - %{DATA:nginx.access.user_name} \\[%{HTTPDATE:nginx.access.time}\\] \"%{WORD:nginx.access.method} %{DATA:nginx.access.url} HTTP/%{NUMBER:nginx.access.http_version}\" %{NUMBER:nginx.access.response_code} %{NUMBER:nginx.access.body_sent.bytes} \"%{DATA:nginx.access.referrer}\" \"%{DATA:nginx.access.agent}\""
    ],
    "ignore_missing": true
  }
}

(Note the "field": "log" setting.)

Any idea why it is still looking for that field?

How did you set up the module? Which exact version of Filebeat are you using? Can you also share your Filebeat config?

Here is the config:

Filebeat version: 5.4.0

filebeat.yml

filebeat.modules:
- module: nginx
  access:
    enabled: true
    var.paths:
      - /var/log/*.log
    prospector:
      document_type: nginx_access
      json.message_key: "log"
      json.keys_under_root: true

filebeat.prospectors:

output.elasticsearch:
  hosts: ["elasticsearch:9200"]

/module/nginx/access/ingest/default.json

{
  "description": "Pipeline for parsing Nginx access logs. Requires the geoip and user_agent plugins.",
  "processors": [
  {
    "grok": {
      "field": "log",
      "trace_match": true,
      "patterns": [
        "%{IPORHOST:nginx.access.remote_ip} - %{DATA:nginx.access.user_name} \\[%{HTTPDATE:nginx.access.time}\\] \"%{WORD:nginx.access.method} %{DATA:nginx.access.url} HTTP/%{NUMBER:nginx.access.http_version}\" %{NUMBER:nginx.access.response_code} %{NUMBER:nginx.access.body_sent.bytes} \"%{DATA:nginx.access.referrer}\" \"%{DATA:nginx.access.agent}\""
      ],
      "ignore_missing": true
    }
  },{
    "remove":{
      "field": "log"
    }
  }, {
    "rename": {
      "field": "@timestamp",
      "target_field": "read_timestamp"
    }
  }, {
    "date": {
      "field": "nginx.access.time",
      "target_field": "@timestamp",
      "formats": ["dd/MMM/YYYY:H:m:s Z"]
    }
  }, {
    "remove": {
      "field": "nginx.access.time"
    }
  }, {
    "user_agent": {
      "field": "nginx.access.agent",
      "target_field": "nginx.access.user_agent"
    }
  }, {
    "remove": {
      "field": "nginx.access.agent"
    }
  }, {
    "geoip": {
      "field": "nginx.access.remote_ip",
      "target_field": "nginx.access.geoip"
    }
  }],
  "on_failure" : [{
    "set" : {
      "field" : "error.message",
      "value" : "{{ _ingest.on_failure_message }}"
    }
  }]
}

manifest.yml

module_version: "1.0"

ingest_pipeline: ingest/default.json
prospector: config/nginx-access.yml

requires.processors:
- name: user_agent
  plugin: ingest-user-agent
- name: geoip
  plugin: ingest-geoip

Ok, I see you modified the nginx module to decode the JSON, because Docker outputs its logs as JSON. Could you share a few log lines that you get from Docker?

One more note: Did you remove the "old" ingest processor before you loaded your own one?

Here are some log outputs from docker:

{"log":"0.0.0.0 - - [19/May/2017:08:12:57 +0000] \"GET /v1/bots HTTP/1.1\" 304 0 \"http://localhost:8081/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36\"\n","stream":"stdout","time":"2017-05-19T08:12:57.073668657Z"}
{"log":"0.0.0.0 - - [19/May/2017:08:12:57 +0000] \"GET /v1/user/receipts HTTP/1.1\" 304 0 \"http://localhost:8081/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36\"\n","stream":"stdout","time":"2017-05-19T08:12:57.241466012Z"}

By "old ingest processor", are you referring to the default.json in /module/nginx/access/ingest? If yes, the file is overwritten when the Filebeat Docker container runs.

Ok, if it is overwritten, it should be fine. If it uses the pipeline you configured above, I'm also not sure why there is still a reference to message, as that field does not appear anywhere in your pipeline definition. Perhaps you could try the simulate API with your document? https://www.elastic.co/guide/en/elasticsearch/reference/master/simulate-pipeline-api.html
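
For example, something along these lines should show whether the grok pattern matches. It uses the grok processor from your pipeline and one of your docker log lines as the test document; the elasticsearch host is just a placeholder, adjust it to your setup:

curl -XPOST 'http://elasticsearch:9200/_ingest/pipeline/_simulate?pretty' -H 'Content-Type: application/json' -d '
{
  "pipeline": {
    "processors": [
      {
        "grok": {
          "field": "log",
          "patterns": [
            "%{IPORHOST:nginx.access.remote_ip} - %{DATA:nginx.access.user_name} \\[%{HTTPDATE:nginx.access.time}\\] \"%{WORD:nginx.access.method} %{DATA:nginx.access.url} HTTP/%{NUMBER:nginx.access.http_version}\" %{NUMBER:nginx.access.response_code} %{NUMBER:nginx.access.body_sent.bytes} \"%{DATA:nginx.access.referrer}\" \"%{DATA:nginx.access.agent}\""
          ],
          "ignore_missing": true
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "log": "0.0.0.0 - - [19/May/2017:08:12:57 +0000] \"GET /v1/bots HTTP/1.1\" 304 0 \"http://localhost:8081/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36\""
      }
    }
  ]
}'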

It works when I try it with the simulate API.
It seems that the NGINX module does not take the new default.json into account, even if I log into the Docker container, modify the file and launch it again. I have no idea why, though.
Maybe a bug in this module in Filebeat 5.4.0?

You mentioned in the beginning that you are overwriting the pipeline. How do you do that?

I run the docker command with -v /home/localadmin/test/default.json:/module/nginx/access/ingest/default.json. The file is indeed overwritten: when I check by running the container in interactive mode, I can see my new file.
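
For reference, the full command looks roughly like this (the image name and the remaining flags are just placeholders, not my exact command):

# sketch only: image name and remaining flags are placeholders
docker run -d \
  -v /home/localadmin/test/default.json:/module/nginx/access/ingest/default.json \
  docker.elastic.co/beats/filebeat:5.4.0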

Oh, I meant the ingest pipeline in Elasticsearch itself, because it's important that this is the one that gets updated.

You mean that I need to manually add the pipeline to Elasticsearch?

Or remove the old one and on startup the new one is added automatically.

Ok, so the old one was not overwritten by the new config. Thanks!

Yes, the pipelines are not updated if they have the same version. You can delete them by running something like this in the Console: DELETE _ingest/pipeline/filebeat-*-nginx*
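
If you prefer curl over the Console, something like this should do the same (the host is a placeholder):

# list the nginx pipelines that filebeat has loaded (the name contains the filebeat version)
curl 'http://elasticsearch:9200/_ingest/pipeline/filebeat-*-nginx*?pretty'

# delete them; filebeat adds the new one from your modified default.json on the next startup
curl -XDELETE 'http://elasticsearch:9200/_ingest/pipeline/filebeat-*-nginx*'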


That's what I did and it works (partially, since the mapping does not seem to be correct, but that's another issue).
