Help parsing custom nginx logs using Filebeat and Ingest Pipelines

Hi,

I am new to the ELK stack. I have a custom log format for my nginx access.log files and I need help parsing them using Filebeat and an ingest pipeline (Log Files -> Filebeat -> Ingest Pipeline -> Elasticsearch).

Here are a couple of sample log lines from my access.log to give you an idea of my custom log format:

192.168.0.1 - - [22/Dec/2023:02:54:23 +0000] "MGLNDD_192.168.0.1" 400 166 "-" "-" "-" "test.com" sn="test.com" rt=0.067 ua="-" us="-" ut="-" ul="-" cs=-

192.168.0.1 - - [22/Dec/2023:02:54:36 +0000] "GET /.env HTTP/1.1" 404 197 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36"

192.168.0.1 - - [22/Dec/2023:02:54:37 +0000] "POST / HTTP/1.1" 405 568 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36"

192.168.0.1 - - [22/Dec/2023:14:58:13 +0000] "GET / HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" "-" "192.168.0.1" sn="test.com" rt=0.000 ua="-" us="-" ut="-" ul="-" cs=-

192.168.0.1 - - [22/Dec/2023:14:58:13 +0000] "GET / HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" "-" "192.168.0.1" sn="test.com" rt=0.000 ua="-" us="-" ut="-" ul="-" cs=-

192.168.0.1 - - [22/Dec/2023:14:58:14 +0000] "GET / HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" "-" "192.168.0.1" sn="test.com" rt=0.000 ua="-" us="-" ut="-" ul="-" cs=-

192.168.0.1 - - [22/Dec/2023:14:58:14 +0000] "GET / HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" "-" "192.168.0.1" sn="test.com" rt=0.000 ua="-" us="-" ut="-" ul="-" cs=-

Hi @BDeveloper I will try to take a look a bit later today.

But the first thing I want you to do is follow the Filebeat Quickstart pretty much exactly. The example even uses nginx.

I followed the quick start, and the dashboards already get the initial information, etc.

I think you will find that will get us most of the way there ... then we can work on the custom parsing... perhaps I can take a look at that later today...

I will say that first line is pretty unusual: "MGLNDD_192.168.0.1". What field exactly do you want that to end up in?

Also, please tell me (not via the nginx config spec, as I don't read those): what are all the fields after the user agent?

...Safari/537.36" THESE FROM HERE >>>> "-" "192.168.0.1" sn="test.com" rt=0.000 ua="-" us="-" ut="-" ul="-" cs=-

Hi @stephenb,

I followed the Filebeat Quickstart and have the dashboards getting the initial information, etc as mentioned above.

Here is the custom log format with the fields:

'$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for" '
'"$host" sn="$server_name" '
'rt=$request_time '
'ua="$upstream_addr" us="$upstream_status" '
'ut="$upstream_response_time" ul="$upstream_response_length" '
'cs=$upstream_cache_status'
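
In nginx.conf this sits inside a log_format directive that is then applied with access_log, along these lines (the format name custom is just a placeholder):

```nginx
log_format custom '$remote_addr - $remote_user [$time_local] "$request" '
                  '$status $body_bytes_sent "$http_referer" '
                  '"$http_user_agent" "$http_x_forwarded_for" '
                  '"$host" sn="$server_name" '
                  'rt=$request_time '
                  'ua="$upstream_addr" us="$upstream_status" '
                  'ut="$upstream_response_time" ul="$upstream_response_length" '
                  'cs=$upstream_cache_status';

access_log /var/log/nginx/access.log custom;
```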

As you can see in the access.log examples here:

192.168.0.1 - - [26/Dec/2023:19:10:23 +0000] "GET / HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" "-" "192.168.0.2" sn="test.com" rt=0.000 ua="-" us="-" ut="-" ul="-" cs=-
192.168.0.1 - - [26/Dec/2023:19:10:23 +0000] "GET / HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" "-" "192.168.0.2" sn="test.com" rt=0.000 ua="-" us="-" ut="-" ul="-" cs=-
192.168.0.1 - - [26/Dec/2023:19:10:24 +0000] "GET / HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" "-" "192.168.0.2" sn="test.com" rt=0.000 ua="-" us="-" ut="-" ul="-" cs=-

What are the types and units for these fields?

I see that "$host" may be an IP address or a hostname. If that is in fact the case, then it can only be treated as a keyword, not an ip type, as you cannot mix types in the same field (without doing a lot more processing).

Why are some surrounded by quotes and others are not ... specifically the numerics like

'ut="$upstream_response_time" ul="$upstream_response_length"

While
'rt=$request_time '

Is not?

Can you fill in a full example (or several), or are all the extra fields always going to be "-"?

Hi @stephenb,

"$host" is just a Hostname not an IP address.

As for 'rt=$request_time ' not being surrounded by quotes while 'ut="$upstream_response_time"' and 'ul="$upstream_response_length"' are: the quotes are just for formatting the access.log. I think I just forgot to put them around '$request_time'.

No, not all of the extra fields are always going to be "-". The reason they are in most of my examples is that I have a simple index.html file that I am hosting on an nginx web server for testing purposes, with the hostname "test.com". Here are several other examples with more of the fields filled out so you can get a better idea of them.

192.168.0.1 - - [30/Oct/2023:13:17:22 +0000] "POST /test.com/home" 504 578 "https://test.com" "Mozilla/5.0 (X11; CrOS x86_64 14541.0.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36" "-" "test.com" sn="test.com" rt=59.869 ua="192.168.0.1:80" us="504" ut="60.545" ul="0" cs=-
192.168.0.1 - - [30/Oct/2023:13:17:22 +0000] "GET /secure/test.com/home" 504 578 "https://test.com/home" "Mozilla/5.0 (X11; CrOS x86_64 14541.0.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36" "-" "test.com" sn="test.com" rt=60.336 ua="192.168.0.1:80" us="504" ut="60.545" ul="0" cs=-
192.168.0.1 - - [30/Oct/2023:13:17:24 +0000] "GET /secure/test" 504 578 "https://test.com/secure/" "Mozilla/5.0 (X11; CrOS x86_64 14541.0.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36" "-" "test.com" sn="test.com" rt=60.163 ua="192.168.0.1:80" us="504" ut="60.545" ul="0" cs=-
192.168.0.1 - - [30/Oct/2023:13:17:24 +0000] "POST /secure/test HTTP/1.1" 504 578 "https://test.com/secure/home" "Mozilla/5.0 (X11; CrOS x86_64 14541.0.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36" "-" "test.com" sn="test.com" rt=60.428 ua="192.168.0.1:80" us="504" ut="60.545" ul="0" cs=-
192.168.0.1 - - [30/Oct/2023:13:17:25 +0000] "POST /secure/test HTTP/1.1" 504 578 "https://test.com/secure/" "Mozilla/5.0 (X11; CrOS x86_64 14541.0.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36" "-" "test.com" sn="test.com" rt=60.037 ua="192.168.0.1:80" us="504" ut="60.545" ul="0" cs=-
192.168.0.1 - - [30/Oct/2023:13:17:27 +0000] "POST /secure/test HTTP/1.1" 504 578 "https://test.com/secure/home" "Mozilla/5.0 (X11; CrOS x86_64 14541.0.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36" "-" "test.com" sn="test.com" rt=60.559 ua="192.168.0.1:80" us="504" ut="60.545" ul="0" cs=-
192.168.0.1 - - [17/Nov/2023:22:38:57 +0000] "POST /home/secure/test HTTP/1.1" 500 256 "https://test.com/home/secure/" "Mozilla/5.0 (X11; CrOS x86_64 14541.0.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36" "-" "test.com" sn="test.com" rt=0.003 ua="192.168.0.1:80" us="500" ut="0.000" ul="256" cs=-
192.168.0.1 - - [17/Nov/2023:22:38:57 +0000] "POST /home/secure/test HTTP/1.1" 500 256 "https://test.com/home/secure/" "Mozilla/5.0 (X11; CrOS x86_64 14541.0.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36" "-" "test.com" sn="test.com" rt=0.003 ua="192.168.0.1:80" us="500" ut="0.000" ul="256" cs=-
192.168.0.1 - - [17/Nov/2023:22:38:58 +0000] "POST /home/secure/test HTTP/1.1" 500 256 "https://test.com/home/secure/" "Mozilla/5.0 (X11; CrOS x86_64 14541.0.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36" "-" "test.com" sn="test.com" rt=0.003 ua="192.168.0.1:80" us="500" ut="0.000" ul="256" cs=-
192.168.0.1 - - [27/Dec/2023:15:30:45 +0000] "GET /test/home HTTP/1.1" 404 20148 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36" "-" "test.com" sn="test.com" rt=0.518 ua="192.168.0.1:443" us="404" ut="0.516" ul="20097" cs=-
192.168.0.1 - - [27/Dec/2023:15:37:04 +0000] "GET /test/home HTTP/1.1" 404 0 "-" "python-requests/2.31.0" "-" "test.com" sn="test.com" rt=0.001 ua="192.168.0.1:5000" us="404" ut="0.000" ul="0" cs=-
192.168.0.1 - - [27/Dec/2023:16:06:11 +0000] "GET /test/test.png HTTP/1.1" 404 5742 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3" "-" "test.com" sn="test.com" rt=0.355 ua="192.168.0.1:443" us="404" ut="0.356" ul="5723" cs=-

Hi @BDeveloper

I have an update / solution for you. You may need to adjust it; I picked some names for fields, etc., and there is no guarantee that every log line will parse / work.

We will start from where you left off with the Quickstart, with Filebeat and the nginx module set up.

Then the macro steps are:

  • Clone the Existing nginx Access Module Pipeline to become our custom pipeline
  • Add the new Grok Pattern Etc to support your custom format
  • Set the module to use our new custom pipeline
  • Run filebeat

Here are the files

The Custom Pipeline

The Log File

The nginx.yml

In the next Post (perhaps tomorrow) I will show you how to quickly build / test ingest pipeline / groks etc.

These are the steps... follow them very closely. I am doing them through the UI, but of course in reality I do it all through the API, with the links I put above.

Clone The Existing Ingest Pipeline, Name It
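
The UI clone is equivalent to a quick Dev Tools round trip: GET the module's pipeline and PUT its body back under the new name (the exact pipeline name depends on your Filebeat version, so treat this as a sketch):

```
GET _ingest/pipeline/filebeat-8.11.2-nginx-access-pipeline

PUT _ingest/pipeline/filebeat-8.11.2-nginx-access-pipeline-custom
{
  ... the "description" and "processors" from the GET response ...
}
```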

Add the new Grok Pattern.
IMPORTANT: Move it to the top so that it matches first (perhaps more on that later). Be careful with the cut and paste.
Save the Grok.

(%{NGINX_HOST} )?"?(?:%{NGINX_ADDRESS_LIST:nginx.access.remote_ip_list}|%{NOTSPACE:source.address}) - (-|%{DATA:user.name}) \[%{HTTPDATE:nginx.access.time}\] "%{DATA:nginx.access.info}" %{NUMBER:http.response.status_code:long} %{NUMBER:http.response.body.bytes:long} "(-|%{DATA:http.request.referrer})" "(-|%{DATA:user_agent.original})" "-" "(-|%{IPORHOST:nginx.access.host.name})" sn="(-|%{DATA:nginx.access.host.domain})" rt=(-|%{NUMBER:nginx.access.request_time:float}) ua="(-|%{DATA:nginx.access.upstream_addr})" us="(-|%{DATA:nginx.access.upstream_status})" ut="(-|%{NUMBER:nginx.access.upstream_response_time:float})" ul="(-|%{NUMBER:nginx.access.upstream_response_length:long})" cs=-

IMPORTANT Save the Pipeline

Now Modify the nginx.yml to use the new custom pipeline.

- module: nginx
  # Access logs
  access:
    enabled: true

    # Set the custom pipeline
    input.pipeline: filebeat-8.11.3-nginx-access-pipeline-custom

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: ["/Users/sbrown/workspace/sample-data/discuss/discuss-custom-nginx.log"]

Start filebeat

Check Discover

And here is how I create / test ingest pipeline / grok

I create a simple test pipeline based on the original pipeline, but focus on just the grok and _simulate it: a very quick dev / test cycle.

If you are going to do a lot of custom processing you should learn about grok and the patterns and some of the other tools to quickly build grok.

From here

Grok processor

Extracts structured fields out of a single text field within a document. You choose which field to extract matched fields from, as well as the grok pattern you expect will match. A grok pattern is like a regular expression that supports aliased expressions that can be reused.

This processor comes packaged with many reusable patterns.

If you need help building patterns to match your logs, you will find the Grok Debugger tool quite useful! The Grok Constructor is also a useful tool.
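
You can also sanity-check what the custom tail of the pattern should capture offline with a plain regex. This is a simplified sketch covering only the trailing sn= / rt= / ua= / us= / ut= / ul= / cs= fields (the real grok pattern also parses the standard combined-log prefix, and the group names here are just illustrative):

```python
import re

# Simplified regex for the custom tail of the log line -- a sketch of what
# the trailing grok fields capture. rt= is unquoted; the others are quoted.
TAIL = re.compile(
    r'sn="(?P<server_name>[^"]*)" '
    r'rt=(?P<request_time>-|[0-9.]+) '
    r'ua="(?P<upstream_addr>[^"]*)" '
    r'us="(?P<upstream_status>[^"]*)" '
    r'ut="(?P<upstream_response_time>[^"]*)" '
    r'ul="(?P<upstream_response_length>[^"]*)" '
    r'cs=(?P<upstream_cache_status>\S+)'
)

line = ('192.168.0.1 - - [17/Nov/2023:22:38:58 +0000] "POST /home/secure/test HTTP/1.1" '
        '500 256 "https://test.com/home/secure/" "Mozilla/5.0" "-" "test.com" '
        'sn="test.com" rt=0.003 ua="192.168.0.1:80" us="500" ut="0.000" ul="256" cs=-')

m = TAIL.search(line)
print(m.group("request_time"), m.group("upstream_status"), m.group("upstream_cache_status"))
```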

DELETE _ingest/pipeline/test-pipeline

PUT _ingest/pipeline/test-pipeline
{
  "description": "Pipeline for parsing Nginx access logs. Requires the geoip and user_agent plugins.",
  "processors": [
    {
      "set": {
        "field": "event.ingested",
        "value": "{{_ingest.timestamp}}"
      }
    },
    {
      "rename": {
        "field": "message",
        "target_field": "event.original"
      }
    },
    {
      "grok": {
        "pattern_definitions": {
          "NGINX_HOST": "(?:%{IP:destination.ip}|%{NGINX_NOTSEPARATOR:destination.domain})(:%{NUMBER:destination.port})?",
          "NGINX_NOTSEPARATOR": """[^	 ,:]+""",
          "NGINX_ADDRESS_LIST": """(?:%{IP}|%{WORD})("?,?\s*(?:%{IP}|%{WORD}))*"""
        },
        "ignore_missing": true,
        "field": "event.original",
        "patterns": [
         "(%{NGINX_HOST} )?\"?(?:%{NGINX_ADDRESS_LIST:nginx.access.remote_ip_list}|%{NOTSPACE:source.address}) - (-|%{DATA:user.name}) \\[%{HTTPDATE:nginx.access.time}\\] \"%{DATA:nginx.access.info}\" %{NUMBER:http.response.status_code:long} %{NUMBER:http.response.body.bytes:long} \"(-|%{DATA:http.request.referrer})\" \"(-|%{DATA:user_agent.original})\" \"-\" \"(-|%{IPORHOST:nginx.access.host.name})\" sn=\"(-|%{DATA:nginx.access.host.domain})\" rt=(-|%{NUMBER:nginx.access.request_time:float}) ua=\"(-|%{DATA:nginx.access.upstream_addr})\" us=\"(-|%{DATA:nginx.access.upstream_status})\" ut=\"(-|%{NUMBER:nginx.access.upstream_response_time:float})\" ul=\"(-|%{NUMBER:nginx.access.upstream_response_length:long})\" cs=-",
         "(%{NGINX_HOST} )?\"?(?:%{NGINX_ADDRESS_LIST:nginx.access.remote_ip_list}|%{NOTSPACE:source.address}) - (-|%{DATA:user.name}) \\[%{HTTPDATE:nginx.access.time}\\] \"%{DATA:nginx.access.info}\" %{NUMBER:http.response.status_code:long} %{NUMBER:http.response.body.bytes:long} \"(-|%{DATA:http.request.referrer})\" \"(-|%{DATA:user_agent.original})\" "

        ]
      }
    },
    {
      "grok": {
        "field": "nginx.access.info",
        "patterns": [
          "%{WORD:http.request.method} %{DATA:_tmp.url_orig} HTTP/%{NUMBER:http.version}",
          ""
        ],
        "ignore_missing": true
      }
    }
  ]
}


POST _ingest/pipeline/test-pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "message": """192.168.0.1 - - [17/Nov/2023:22:38:58 +0000] "POST /home/secure/test HTTP/1.1" 500 256 "https://test.com/home/secure/" "Mozilla/5.0 (X11; CrOS x86_64 14541.0.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36" "-" "test.com" sn="test.com" rt=0.003 ua="192.168.0.1:80" us="500" ut="0.000" ul="256" cs=-"""
      }
    }
  ]
}


POST _ingest/pipeline/test-pipeline/_simulate?verbose
{
  "docs": [
    {
      "_source": {
        "message": """192.168.0.1 - - [22/Dec/2023:02:54:23 +0000] "MGLNDD_192.168.0.1" 400 166 "-" "-" "-" "test.com" sn="test.com" rt=0.067 ua="-" us="-" ut="-" ul="-" cs=-"""
      }
    },
    {
      "_source": {
        "message": """192.168.0.2 - - [22/Dec/2023:14:58:14 +0000] "GET / HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" "-" "192.168.0.1" sn="test.com" rt=0.000 ua="-" us="-" ut="-" ul="-" cs=-"""
      }
    },
    {
      "_source": {
        "message": """192.168.0.3 - - [22/Dec/2023:14:58:14 +0000] "GET / HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" "-" "192.168.0.1" sn="test.com" rt=0.000 ua="-" us="-" ut="-" ul="-" cs=-"""
      }
    },
    {
      "_source": {
        "message": """192.168.0.1 - - [17/Nov/2023:22:38:58 +0000] "POST /home/secure/test HTTP/1.1" 500 256 "https://test.com/home/secure/" "Mozilla/5.0 (X11; CrOS x86_64 14541.0.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36" "-" "test.com" sn="test.com" rt=0.003 ua="192.168.0.1:80" us="500" ut="0.000" ul="256" cs=-"""
      }
    }
  ]
}

Try these and you will see...

Hi @stephenb,

I followed the steps that you explained above, but when I restarted filebeat and checked the status, it is now failing to start:

root@localhost:~# sudo systemctl restart filebeat
root@localhost:~# sudo systemctl status filebeat
× filebeat.service - Filebeat sends log files to Logstash or directly to Elasticsearch.
     Loaded: loaded (/lib/systemd/system/filebeat.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Thu 2023-12-28 16:46:24 UTC; 6s ago
       Docs: https://www.elastic.co/beats/filebeat
    Process: 476946 ExecStart=/usr/share/filebeat/bin/filebeat --environment systemd $BEAT_LOG_OPTS $BEAT_CONFIG_OPTS $BEAT_PATH_OPTS (code=exited, status=1/FAILURE)
   Main PID: 476946 (code=exited, status=1/FAILURE)
        CPU: 521ms

Dec 28 16:46:24 localhost systemd[1]: filebeat.service: Main process exited, code=exited, status=1/FAILURE
Dec 28 16:46:24 localhost systemd[1]: filebeat.service: Failed with result 'exit-code'.
Dec 28 16:46:24 localhost systemd[1]: filebeat.service: Scheduled restart job, restart counter is at 5.
Dec 28 16:46:24 localhost systemd[1]: Stopped Filebeat sends log files to Logstash or directly to Elasticsearch..
Dec 28 16:46:24 localhost systemd[1]: filebeat.service: Start request repeated too quickly.
Dec 28 16:46:24 localhost systemd[1]: filebeat.service: Failed with result 'exit-code'.
Dec 28 16:46:24 localhost systemd[1]: Failed to start Filebeat sends log files to Logstash or directly to Elasticsearch..

I followed the steps from the Filebeat Quickstart and everything was working well.

I cloned the existing ingest pipeline and named it:

I then added the new Grok Pattern and moved it to the top so that it matches first and I saved it:

Then I created the pipeline.

Next I headed to the nginx.yml file and edited it. This is what it looks like now:

# Module: nginx
# Docs: https://www.elastic.co/guide/en/beats/filebeat/8.11/filebeat-module-nginx.html

- module: nginx
  # Access logs
  access:
    enabled: true

  #set custom pipeline
  input.pipeline: filebeat-8.11.2-nginx-access-pipeline-custom

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
   var.paths: ["/var/log/custom-nginx.log"]


  # Error logs
  error:
    enabled: true

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
   var.paths: ["/var/log/nginx/error.log*"]

  # Ingress-nginx controller logs. This is disabled by default. It could be used in Kubernetes environments to parse ingress-nginx logs
  ingress_controller:
    enabled: false

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    #var.paths:

I then restarted filebeat and checked the status and now it is failing to start.

Thank you for working through this!

Check the filebeat logs...

/var/log/filebeat/*.log

@stephenb Here are the filebeat logs:

root@localhost:/var/log/filebeat# ls
filebeat-20231221-1.ndjson  filebeat-20231221-3.ndjson  filebeat-20231221-5.ndjson  filebeat-20231221-7.ndjson
filebeat-20231221-2.ndjson  filebeat-20231221-4.ndjson  filebeat-20231221-6.ndjson  filebeat-20231221-8.ndjson
root@localhost:/var/log/filebeat# sudo systemctl status filebeat
× filebeat.service - Filebeat sends log files to Logstash or directly to Elasticsearch.
     Loaded: loaded (/lib/systemd/system/filebeat.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Thu 2023-12-28 16:46:24 UTC; 29min ago
       Docs: https://www.elastic.co/beats/filebeat
    Process: 476946 ExecStart=/usr/share/filebeat/bin/filebeat --environment systemd $BEAT_LOG_OPTS $BEAT_CONFIG_OPTS $BEAT_PATH_OPTS (code=exited, status=1/FAILURE)
   Main PID: 476946 (code=exited, status=1/FAILURE)
        CPU: 521ms

Dec 28 16:46:24 localhost systemd[1]: filebeat.service: Main process exited, code=exited, status=1/FAILURE
Dec 28 16:46:24 localhost systemd[1]: filebeat.service: Failed with result 'exit-code'.
Dec 28 16:46:24 localhost systemd[1]: filebeat.service: Scheduled restart job, restart counter is at 5.
Dec 28 16:46:24 localhost systemd[1]: Stopped Filebeat sends log files to Logstash or directly to Elasticsearch..
Dec 28 16:46:24 localhost systemd[1]: filebeat.service: Start request repeated too quickly.
Dec 28 16:46:24 localhost systemd[1]: filebeat.service: Failed with result 'exit-code'.
Dec 28 16:46:24 localhost systemd[1]: Failed to start Filebeat sends log files to Logstash or directly to Elasticsearch..
root@localhost:/var/log/filebeat# tail filebeat-20231221-8.ndjson
{"log.level":"info","@timestamp":"2023-12-21T19:15:21.320Z","log.logger":"modules","log.origin":{"file.name":"fileset/pipelines.go","file.line":135},"message":"Elasticsearch pipeline loaded.","service.name":"filebeat","pipeline":"filebeat-8.11.2-panw-panos-threat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-12-21T19:15:21.323Z","log.logger":"modules","log.origin":{"file.name":"fileset/pipelines.go","file.line":135},"message":"Elasticsearch pipeline loaded.","service.name":"filebeat","pipeline":"filebeat-8.11.2-panw-panos-globalprotect","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-12-21T19:15:21.324Z","log.logger":"modules","log.origin":{"file.name":"fileset/pipelines.go","file.line":135},"message":"Elasticsearch pipeline loaded.","service.name":"filebeat","pipeline":"filebeat-8.11.2-panw-panos-userid","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-12-21T19:15:21.324Z","log.logger":"modules","log.origin":{"file.name":"fileset/pipelines.go","file.line":135},"message":"Elasticsearch pipeline loaded.","service.name":"filebeat","pipeline":"filebeat-8.11.2-panw-panos-hipmatch","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-12-21T19:15:21.325Z","log.logger":"modules","log.origin":{"file.name":"fileset/modules.go","file.line":135},"message":"Enabled modules/filesets: pensando (dfw)","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-12-21T19:15:21.325Z","log.logger":"esclientleg","log.origin":{"file.name":"eslegclient/connection.go","file.line":122},"message":"elasticsearch url: http://66.228.44.236:9200","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-12-21T19:15:21.327Z","log.logger":"esclientleg","log.origin":{"file.name":"eslegclient/connection.go","file.line":304},"message":"Attempting to connect to Elasticsearch version 8.11.2 (default)","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-12-21T19:15:21.329Z","log.logger":"modules","log.origin":{"file.name":"fileset/pipelines.go","file.line":135},"message":"Elasticsearch pipeline loaded.","service.name":"filebeat","pipeline":"filebeat-8.11.2-pensando-dfw-pipeline","ecs.version":"1.6.0"}
{"log.level":"error","@timestamp":"2023-12-21T19:15:21.329Z","log.origin":{"file.name":"cfgfile/reload.go","file.line":255},"message":"Error loading configuration files: 1 error: Unable to hash given config: missing field accessing '0.audit' (source:'/etc/filebeat/modules.d/gcp.yml.disabled')","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-12-21T19:15:21.329Z","log.logger":"load","log.origin":{"file.name":"cfgfile/list.go","file.line":188},"message":"Stopping 68 runners ...","service.name":"filebeat","ecs.version":"1.6.0"}
root@localhost:/var/log/filebeat#

Something else going on....

Look in your modules.d directory...

Do you have any stray files... do not leave any unaccounted files that end in .yml

Do you have anything else in your filebeat.yml file...

This looks unrelated to the nginx stuff...

Or this is not indented correctly. I don't think it is; it is not lined up.

@stephenb The only modules.d directory files that I have enabled are nginx.yml and system.yml.

I fixed the indentation in my nginx.yml file. It now looks like this:

# Module: nginx
# Docs: https://www.elastic.co/guide/en/beats/filebeat/8.11/filebeat-module-nginx.html

- module: nginx
  # Access logs
  access:
    enabled: true

  #set custom pipeline
  input.pipeline: filebeat-8.11.2-nginx-access-pipeline-custom

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: ["/var/log/custom-nginx.log"]


  # Error logs
  error:
    enabled: true

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: ["/var/log/nginx/error.log*"]

  # Ingress-nginx controller logs. This is disabled by default. It could be used in Kubernetes environments to parse ingress-nginx logs
  ingress_controller:
    enabled: false

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    #var.paths:

Here is what my filebeat.yml file looks like:

###################### Filebeat Configuration Example #########################

# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html

# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.

# ============================== Filebeat inputs ===============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input-specific configurations.

# filestream is an input for collecting log messages from files.
- type: filestream
  # Unique ID among all inputs, an ID is required.
  id: my-filestream-id
  # Change to true to enable this input configuration.
  enabled: true
  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /var/log/*.log
    - /var/log/nginx/*.log
 # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  # Line filtering happens after the parsers pipeline. If you would like to filter lines
  # before parsers, use include_message parser.
  #exclude_lines: ['^DBG']

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  # Line filtering happens after the parsers pipeline. If you would like to filter lines
  # before parsers, use include_message parser.
  #include_lines: ['^ERR', '^WARN']

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #prospector.scanner.exclude_files: ['.gz$']

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1

# ============================== Filebeat modules ==============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

  # Period on which files under path should be checked for changes
  #reload.period: 10s

# ======================= Elasticsearch template setting =======================

setup.template.settings:
  index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false

# ================================== General ===================================

# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:

# The tags of the shipper are included in their field with each
# transaction published.
#tags: ["service-X", "web-tier"]

# Optional fields that you can specify to add additional information to the
# output.
#fields:
#  env: staging

# ================================= Dashboards =================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here or by using the `setup` command.
#setup.dashboards.enabled: false

# The URL from where to download the dashboard archive. By default, this URL
# has a value that is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:

# =================================== Kibana ===================================

# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:

  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify and additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  host: "192.168.0.1:5601"

  # Kibana Space ID
  # ID of the Kibana Space into which the dashboards should be loaded. By default,
  # the Default Space will be used.
  #space.id:

# =============================== Elastic Cloud ================================

# These settings simplify using Filebeat with the Elastic Cloud (https://cloud.elastic.co/).

# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:

# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:

# ================================== Outputs ===================================

# Configure what output to use when sending the data collected by the beat.

# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
  # Array of hosts to connect to.
 hosts: ["192.168.0.1:9200"]

  # Protocol - either `http` (default) or `https`.
  #protocol: "https"

  # Authentication credentials - either API key or username/password.
  #api_key: "id:api_key"
  #username: "elastic"
  #password: "changeme"

# ------------------------------ Logstash Output -------------------------------
#output.logstash:
  # The Logstash hosts
 # hosts: ["192.168.0.1:5044"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"

# ================================= Processors =================================
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~

I restarted filebeat again and it is still failing to start.

It needs to be lined up under the fileset, at the same indent level as all the other first-level settings.

So it's indented too far. This is yml stuff you need to be very aware of. If you look at the copy I provided, you'll see how it should be indented.
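
For reference, here is the access section with everything lined up (pipeline name and paths as in the earlier posts):

```yaml
- module: nginx
  # Access logs
  access:
    enabled: true
    # input.pipeline and var.paths sit at the same indent level as `enabled`
    input.pipeline: filebeat-8.11.2-nginx-access-pipeline-custom
    var.paths: ["/var/log/custom-nginx.log"]
```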

Yep, I had an indentation error in the post above, which I have since fixed, but the actual file I provided is correct. You've got to get the indentation right; sorry, that was my fault on the long post.

@stephenb That was the problem, I fixed the indentation and now filebeat is running.

Now I am having one last problem. When I go to Discover, I am not seeing all of my nginx.access.* fields. I am only seeing one field:

Also, I am seeing this error message that says field [nginx] not present as part of path [nginx.access.time].

Which I think is why I am not seeing all of my fields.

Thank you for the help!

In the ingest pipeline did you move the new grok to the top / first pattern? If not that could be the cause...

You can use the code from the file and PUT in the ingest pipeline through the Dev Tools

Did you try the test file?

Did that work?

Are you seeing the correct number of events?

Is that error on every message? Or just a couple

What does the event.original contain for the one with that message? You're going to need to dig in deep...

Please provide a sample log or event.original for a document with that error

Not a screenshot... Screenshots are hard to read and cannot easily be debugged.

I also see you enabled the nginx error fileset; I / we have not done anything with that.

@stephenb yes I moved the new grok to the top pattern.

Yes, there is that same error message on every message. Here is what it looks like:

{
  "@timestamp": [
    "2023-12-29T16:51:38.603Z"
  ],
  "agent.ephemeral_id": [
    "0561886a-0b17-47fa-b204-f333c7117b0e"
  ],
  "agent.hostname": [
    "localhost"
  ],
  "agent.id": [
    "01b55c06-e36e-45d8-99fe-38eed50f1c0b"
  ],
  "agent.name": [
    "localhost"
  ],
  "agent.type": [
    "filebeat"
  ],
  "agent.version": [
    "8.11.2"
  ],
  "ecs.version": [
    "1.12.0"
  ],
  "error.message": [
    "field [nginx] not present as part of path [nginx.access.time]"
  ],
  "event.created": [
    "2023-12-29T16:51:38.603Z"
  ],
  "event.dataset": [
    "nginx.access"
  ],
  "event.ingested": [
    "2023-12-29T16:51:39.605Z"
  ],
  "event.module": [
    "nginx"
  ],
  "event.original": [
    "192.168.0.1 - - [29/Dec/2023:16:51:38 +0000] \"GET / HTTP/1.1\" 304 0 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36\" \"-\" \"192.168.0.1\" sn=\"test.com\" rt=0.000 ua=\"-\" us=\"-\" ut=\"-\" ul=\"-\" cs=-"
  ],
  "event.timezone": [
    "+00:00"
  ],
  "fileset.name": [
    "access"
  ],
  "host.architecture": [
    "x86_64"
  ],
  "host.containerized": [
    false
  ],
  "host.hostname": [
    "localhost"
  ],
  "host.id": [
    "864c2897c3f947e58f2e627f75857003"
  ],
  "host.ip": [
    "192.168.0.1",
    "2600:3c03::f03c:94ff:fe2c:258e",
    "fe80::f03c:94ff:fe2c:258e"
  ],
  "host.mac": [
    "F2-3C-65-2C-25-12E"
  ],
  "host.name": [
    "localhost"
  ],
  "host.os.codename": [
    "jammy"
  ],
  "host.os.family": [
    "debian"
  ],
  "host.os.kernel": [
    "5.15.0-83-generic"
  ],
  "host.os.name": [
    "Ubuntu"
  ],
  "host.os.name.text": [
    "Ubuntu"
  ],
  "host.os.platform": [
    "ubuntu"
  ],
  "host.os.type": [
    "linux"
  ],
  "host.os.version": [
    "22.04.3 LTS (Jammy Jellyfish)"
  ],
  "input.type": [
    "log"
  ],
  "log.file.path": [
    "/var/log/nginx/access.log"
  ],
  "log.offset": [
    64683
  ],
  "service.type": [
    "nginx"
  ],
  "source.address": [
    ""
  ],
  "_id": "sjx9towBpeQAX3u1E0zV",
  "_index": ".ds-filebeat-8.11.2-2023.12.08-000001",
  "_score": null
}

I am not able to see the correct number of fields or events.

When I go to my filebeat-8.11.2-nginx-access-pipeline-custom and look at the processors this is what it looks like:

[
  {
    "grok": {
      "field": "event.original",
      "patterns": [
        "(%{NGINX_HOST} )?\"?(?:%{NGINX_ADDRESS_LIST:nginx.access.remote_ip_list}|%{NOTSPACE:source.address}) - (-|%{DATA:user.name}) \\[%{HTTPDATE:nginx.access.time}\\] \"%{DATA:nginx.access.info}\" %{NUMBER:http.response.status_code:long} %{NUMBER:http.response.body.bytes:long} \"(-|%{DATA:http.request.referrer})\" \"(-|%{DATA:user_agent.original})\" \"-\" \"(-|%{IPORHOST:nginx.access.host.name})\" sn=\"(-|%{DATA:nginx.access.host.domain})\" rt=(-|%{NUMBER:nginx.access.request_time:float}) ua=\"(-|%{DATA:nginx.access.upstream_addr})\" us=\"(-|%{DATA:nginx.access.upstream_status})\" ut=\"(-|%{NUMBER:nginx.access.upstream_response_time:float})\" ul=\"(-|%{NUMBER:nginx.access.upstream_response_length:long})\" cs=-",
        "(%{NGINX_HOST} )?\"?(?:%{NGINX_ADDRESS_LIST:nginx.access.remote_ip_list}|%{NOTSPACE:source.address}) - (-|%{DATA:user.name}) \\[%{HTTPDATE:nginx.access.time}\\] \"%{DATA:nginx.access.info}\" %{NUMBER:http.response.status_code:long} %{NUMBER:http.response.body.bytes:long} \"(-|%{DATA:http.request.referrer})\" \"(-|%{DATA:user_agent.original})\""
      ],
      "pattern_definitions": {
        "NGINX_NOTSEPARATOR": "[^\t ,:]+",
        "NGINX_ADDRESS_LIST": "(?:%{IP}|%{WORD})(\"?,?\\s*(?:%{IP}|%{WORD}))*",
        "NGINX_HOST": "(?:%{IP:destination.ip}|%{NGINX_NOTSEPARATOR:destination.domain})(:%{NUMBER:destination.port})?"
      },
      "ignore_missing": true
    }
  },
  {
    "set": {
      "field": "event.ingested",
      "value": "{{_ingest.timestamp}}"
    }
  },
  {
    "rename": {
      "field": "message",
      "target_field": "event.original"
    }
  },
  {
    "grok": {
      "field": "nginx.access.info",
      "patterns": [
        "%{WORD:http.request.method} %{DATA:_tmp.url_orig} HTTP/%{NUMBER:http.version}",
        ""
      ],
      "ignore_missing": true
    }
  },
  {
    "uri_parts": {
      "field": "_tmp.url_orig",
      "ignore_failure": true
    }
  },
  {
    "set": {
      "field": "url.domain",
      "value": "{{destination.domain}}",
      "if": "ctx.url?.domain == null && ctx.destination?.domain != null"
    }
  },
  {
    "remove": {
      "field": [
        "nginx.access.info",
        "_tmp.url_orig"
      ],
      "ignore_missing": true
    }
  },
  {
    "split": {
      "separator": "\"?,?\\s+",
      "ignore_missing": true,
      "field": "nginx.access.remote_ip_list"
    }
  },
  {
    "split": {
      "ignore_missing": true,
      "field": "nginx.access.origin",
      "separator": "\"?,?\\s+"
    }
  },
  {
    "set": {
      "field": "source.address",
      "if": "ctx.source?.address == null",
      "value": ""
    }
  },
  {
    "script": {
      "source": "boolean isPrivate(def dot, def ip) {\n  try {\n    StringTokenizer tok = new StringTokenizer(ip, dot);\n    int firstByte = Integer.parseInt(tok.nextToken());\n    int secondByte = Integer.parseInt(tok.nextToken());\n    if (firstByte == 10) {\n      return true;\n    }\n    if (firstByte == 192 && secondByte == 168) {\n      return true;\n    }\n    if (firstByte == 172 && secondByte >= 16 && secondByte <= 31) {\n      return true;\n    }\n    if (firstByte == 127) {\n      return true;\n    }\n    return false;\n  }\n  catch (Exception e) {\n    return false;\n  }\n} try {\n  ctx.source.address = null;\n  if (ctx.nginx.access.remote_ip_list == null) {\n    return;\n  }\n  def found = false;\n  for (def item : ctx.nginx.access.remote_ip_list) {\n    if (!isPrivate(params.dot, item)) {\n      ctx.source.address = item;\n      found = true;\n      break;\n    }\n  }\n  if (!found) {\n    ctx.source.address = ctx.nginx.access.remote_ip_list[0];\n  }\n} catch (Exception e) {\n  ctx.source.address = null;\n}",
      "params": {
        "dot": "."
      },
      "if": "ctx.nginx?.access?.remote_ip_list != null && ctx.nginx.access.remote_ip_list.length > 0",
      "lang": "painless"
    }
  },
  {
    "remove": {
      "field": "source.address",
      "if": "ctx.source.address == null"
    }
  },
  {
    "grok": {
      "ignore_failure": true,
      "field": "source.address",
      "patterns": [
        "^%{IP:source.ip}$"
      ]
    }
  },
  {
    "set": {
      "copy_from": "@timestamp",
      "field": "event.created"
    }
  },
  {
    "date": {
      "field": "nginx.access.time",
      "target_field": "@timestamp",
      "formats": [
        "dd/MMM/yyyy:H:m:s Z"
      ],
      "on_failure": [
        {
          "append": {
            "value": "{{ _ingest.on_failure_message }}",
            "field": "error.message"
          }
        }
      ]
    }
  },
  {
    "remove": {
      "field": "nginx.access.time"
    }
  },
  {
    "user_agent": {
      "ignore_missing": true,
      "field": "user_agent.original"
    }
  },
  {
    "geoip": {
      "field": "source.ip",
      "target_field": "source.geo",
      "ignore_missing": true
    }
  },
  {
    "geoip": {
      "database_file": "GeoLite2-ASN.mmdb",
      "field": "source.ip",
      "target_field": "source.as",
      "properties": [
        "asn",
        "organization_name"
      ],
      "ignore_missing": true
    }
  },
  {
    "rename": {
      "field": "source.as.asn",
      "target_field": "source.as.number",
      "ignore_missing": true
    }
  },
  {
    "rename": {
      "target_field": "source.as.organization.name",
      "ignore_missing": true,
      "field": "source.as.organization_name"
    }
  },
  {
    "set": {
      "field": "event.kind",
      "value": "event"
    }
  },
  {
    "append": {
      "value": "web",
      "field": "event.category"
    }
  },
  {
    "append": {
      "field": "event.type",
      "value": "access"
    }
  },
  {
    "set": {
      "value": "success",
      "if": "ctx?.http?.response?.status_code != null && ctx.http.response.status_code < 400",
      "field": "event.outcome"
    }
  },
  {
    "set": {
      "if": "ctx?.http?.response?.status_code != null && ctx.http.response.status_code >= 400",
      "field": "event.outcome",
      "value": "failure"
    }
  },
  {
    "append": {
      "field": "related.ip",
      "value": "{{source.ip}}",
      "if": "ctx?.source?.ip != null"
    }
  },
  {
    "append": {
      "field": "related.ip",
      "value": "{{destination.ip}}",
      "if": "ctx?.destination?.ip != null"
    }
  },
  {
    "append": {
      "field": "related.user",
      "value": "{{user.name}}",
      "if": "ctx?.user?.name != null"
    }
  },
  {
    "script": {
      "lang": "painless",
      "description": "This script processor iterates over the whole document to remove fields with null values.",
      "source": "void handleMap(Map map) {\n  for (def x : map.values()) {\n    if (x instanceof Map) {\n        handleMap(x);\n    } else if (x instanceof List) {\n        handleList(x);\n    }\n  }\n  map.values().removeIf(v -> v == null);\n}\nvoid handleList(List list) {\n  for (def x : list) {\n      if (x instanceof Map) {\n          handleMap(x);\n      } else if (x instanceof List) {\n          handleList(x);\n      }\n  }\n}\nhandleMap(ctx);\n"
    }
  }
]

Failure Processors:

[
  {
    "set": {
      "value": "{{ _ingest.on_failure_message }}",
      "field": "error.message"
    }
  }
]
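As an aside, the painless script in the pipeline above that picks source.address from nginx.access.remote_ip_list (first public IP, else the first entry) can be sketched in Python with the stdlib ipaddress module. This is an approximation for local reasoning, not the pipeline's actual code:

```python
import ipaddress

def first_public_ip(remote_ip_list):
    """Approximate the pipeline's painless script: prefer the first
    address that is not private/loopback, else fall back to the first
    list entry. Unparseable tokens count as "not private", mirroring
    the script's catch block."""
    if not remote_ip_list:
        return None
    for item in remote_ip_list:
        try:
            private = ipaddress.ip_address(item).is_private
        except ValueError:
            private = False
        if not private:
            return item
    return remote_ip_list[0]

# e.g. first_public_ip(["10.0.0.1", "8.8.8.8"]) -> "8.8.8.8"
```

Note that `is_private` covers a few more ranges (e.g. link-local) than the painless script's explicit 10/8, 172.16/12, 192.168/16 and 127/8 checks.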

Could it have something to do with my nginx log format?

I have not used Dev Tools before, how would I go about using the test file?

As for having the nginx error fileset enabled: I went ahead and set it to false, but that was not what was causing the error.
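If it helps to sanity-check the log format locally, the custom tail of each line (sn=, rt=, ua=, us=, ut=, ul=, cs=) can be matched with a plain regex. This is only an illustration; the capture names below are not the ECS field names the real pipeline uses:

```python
import re

# Illustrative stand-in for the pipeline's grok: match just the custom
# tail of the access log line. Capture names are hypothetical.
TAIL = re.compile(
    r'sn="(?P<server_name>[^"]*)" rt=(?P<request_time>\S+) '
    r'ua="(?P<upstream_addr>[^"]*)" us="(?P<upstream_status>[^"]*)" '
    r'ut="(?P<upstream_time>[^"]*)" ul="(?P<upstream_length>[^"]*)" '
    r'cs=(?P<cache_status>\S+)'
)

line = ('192.168.0.1 - - [29/Dec/2023:16:51:38 +0000] "GET / HTTP/1.1" 304 0 '
        '"-" "Mozilla/5.0" "-" "192.168.0.1" sn="test.com" rt=0.000 ua="-" '
        'us="-" ut="-" ul="-" cs=-')

m = TAIL.search(line)
print(m.group("server_name"), m.group("request_time"))  # test.com 0.000
```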

Yes, something odd is going on; I will need to take a look later...

I can see the issue / bug ... I see the error with your pipeline but not with mine.

Sorry, busy day. I may not get back to this at all today...

Can you try going to the code I showed here?

Paste it carefully into Kibana - Dev Tools

Run the command, that will PUT my pipeline into elasticsearch.

Then set my pipeline name in nginx.yml. Don't worry about the 8.11.3 in the name; it won't matter.

Then try again...
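For reference, the Dev Tools command being described has this general shape (the description is illustrative and the processor array is elided here; use the full pipeline body from the earlier post):

```
PUT _ingest/pipeline/filebeat-8.11.3-nginx-access-pipeline-custom
{
  "description": "Custom nginx access pipeline",
  "processors": [ ... ]
}
```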

Kibana Dev Tool

Yours does not work (if you pasted the entire pipeline):

POST _ingest/pipeline/filebeat-8.11.2-nginx-access-pipeline-custom/_simulate
{
  "docs": [
    {
      "_source": {
        "@timestamp":"2023-12-29T18:19:51.218Z",
        "message": """192.168.0.1 - - [29/Dec/2023:16:51:38 +0000] "GET / HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" "-" "192.168.0.1" sn="test.com" rt=0.000 ua="-" us="-" ut="-" ul="-" cs=-"""
      }
    },
    {
      "_source": {
                "@timestamp":"2023-12-29T18:19:51.218Z",
        "message": """192.168.0.12 - - [17/Nov/2023:22:38:57 +0000] "POST /home/secure/test HTTP/1.1" 500 256 "https://test.com/home/secure/" "Mozilla/5.0 (X11; CrOS x86_64 14541.0.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36" "-" "test.com" sn="test.com" rt=0.003 ua="192.168.0.1:80" us="500" ut="0.000" ul="256" cs=-"""
      }
    }
    ]
}

# Result

{
  "docs": [
    {
      "error": {
        "root_cause": [
          {
            "type": "illegal_argument_exception",
            "reason": "field [nginx] not present as part of path [nginx.access.time]"
          }
        ],
        "type": "illegal_argument_exception",
        "reason": "field [nginx] not present as part of path [nginx.access.time]"
      }
    },
    {
      "error": {
        "root_cause": [
          {
            "type": "illegal_argument_exception",
            "reason": "field [nginx] not present as part of path [nginx.access.time]"
          }
        ],
        "type": "illegal_argument_exception",
        "reason": "field [nginx] not present as part of path [nginx.access.time]"
      }
    }
  ]
}
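An aside on the error above: the timestamp format itself is fine. The pipeline's dd/MMM/yyyy:H:m:s Z pattern parses these log timestamps, as the rough Python equivalent below shows; the "field [nginx] not present" failure instead suggests that grok never created nginx.access.time in the first place, so the later processors that read and remove it have nothing to work on.

```python
from datetime import datetime

# Java "dd/MMM/yyyy:H:m:s Z" roughly corresponds to this strptime format.
ts = datetime.strptime("29/Dec/2023:16:51:38 +0000", "%d/%b/%Y:%H:%M:%S %z")
print(ts.isoformat())  # 2023-12-29T16:51:38+00:00
```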

Mine, with the same input, works:

POST _ingest/pipeline/filebeat-8.11.3-nginx-access-pipeline-custom/_simulate
{
  "docs": [
    {
      "_source": {
        "@timestamp":"2023-12-29T18:19:51.218Z",
        "message": """192.168.0.1 - - [29/Dec/2023:16:51:38 +0000] "GET / HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" "-" "192.168.0.1" sn="test.com" rt=0.000 ua="-" us="-" ut="-" ul="-" cs=-"""
      }
    },
    {
      "_source": {
                "@timestamp":"2023-12-29T18:19:51.218Z",
        "message": """192.168.0.12 - - [17/Nov/2023:22:38:57 +0000] "POST /home/secure/test HTTP/1.1" 500 256 "https://test.com/home/secure/" "Mozilla/5.0 (X11; CrOS x86_64 14541.0.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36" "-" "test.com" sn="test.com" rt=0.003 ua="192.168.0.1:80" us="500" ut="0.000" ul="256" cs=-"""
      }
    }
    ]
}

# Results

{
  "docs": [
    {
      "doc": {
        "_index": "_index",
        "_version": "-3",
        "_id": "_id",
        "_source": {
          "@timestamp": "2023-12-29T16:51:38.000Z",
          "nginx": {
            "access": {
              "host": {
                "name": "192.168.0.1",
                "domain": "test.com"
              },
              "request_time": 0,
              "remote_ip_list": [
                "192.168.0.1"
              ]
            }
          },
          "_tmp": {},
          "related": {
            "ip": [
              "192.168.0.1"
            ]
          },
          "http": {
            "request": {
              "method": "GET"
            },
            "version": "1.1",
            "response": {
              "body": {
                "bytes": 0
              },
              "status_code": 304
            }
          },
          "source": {
            "address": "192.168.0.1",
            "ip": "192.168.0.1"
          },
          "event": {
            "ingested": "2023-12-29T18:25:42.443974940Z",
            "original": """192.168.0.1 - - [29/Dec/2023:16:51:38 +0000] "GET / HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" "-" "192.168.0.1" sn="test.com" rt=0.000 ua="-" us="-" ut="-" ul="-" cs=-""",
            "created": "2023-12-29T18:19:51.218Z",
            "kind": "event",
            "category": [
              "web"
            ],
            "type": [
              "access"
            ],
            "outcome": "success"
          },
          "user_agent": {
            "name": "Chrome",
            "original": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
            "os": {
              "name": "Mac OS X",
              "version": "10.15.7",
              "full": "Mac OS X 10.15.7"
            },
            "device": {
              "name": "Mac"
            },
            "version": "120.0.0.0"
          },
          "url": {
            "path": "/",
            "original": "/"
          }
        },
        "_ingest": {
          "timestamp": "2023-12-29T18:25:42.44397494Z"
        }
      }
    },
    {
      "doc": {
        "_index": "_index",
        "_version": "-3",
        "_id": "_id",
        "_source": {
          "@timestamp": "2023-11-17T22:38:57.000Z",
          "nginx": {
            "access": {
              "upstream_status": "500",
              "request_time": 0.003,
              "upstream_addr": "192.168.0.1:80",
              "upstream_response_length": 256,
              "host": {
                "name": "test.com",
                "domain": "test.com"
              },
              "upstream_response_time": 0,
              "remote_ip_list": [
                "192.168.0.12"
              ]
            }
          },
          "_tmp": {},
          "related": {
            "ip": [
              "192.168.0.12"
            ]
          },
          "http": {
            "request": {
              "method": "POST",
              "referrer": "https://test.com/home/secure/"
            },
            "version": "1.1",
            "response": {
              "body": {
                "bytes": 256
              },
              "status_code": 500
            }
          },
          "source": {
            "address": "192.168.0.12",
            "ip": "192.168.0.12"
          },
          "event": {
            "ingested": "2023-12-29T18:25:42.443995170Z",
            "original": """192.168.0.12 - - [17/Nov/2023:22:38:57 +0000] "POST /home/secure/test HTTP/1.1" 500 256 "https://test.com/home/secure/" "Mozilla/5.0 (X11; CrOS x86_64 14541.0.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36" "-" "test.com" sn="test.com" rt=0.003 ua="192.168.0.1:80" us="500" ut="0.000" ul="256" cs=-""",
            "created": "2023-12-29T18:19:51.218Z",
            "kind": "event",
            "category": [
              "web"
            ],
            "type": [
              "access"
            ],
            "outcome": "failure"
          },
          "user_agent": {
            "name": "Chrome",
            "original": "Mozilla/5.0 (X11; CrOS x86_64 14541.0.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36",
            "os": {
              "name": "Chrome OS",
              "version": "14541.0.0",
              "full": "Chrome OS 14541.0.0"
            },
            "device": {
              "name": "Other"
            },
            "version": "116.0.0.0"
          },
          "url": {
            "path": "/home/secure/test",
            "original": "/home/secure/test"
          }
        },
        "_ingest": {
          "timestamp": "2023-12-29T18:25:42.44399517Z"
        }
      }
    }
  ]
}

@stephenb I am not really sure what the problem was, but I followed your steps and pasted your code into Kibana Dev Tools to PUT your pipeline into Elasticsearch.

I then edited my nginx.yml and restarted Filebeat, and now I am getting all of my fields as expected when I test it.

{
  "docs": [
    {
      "doc": {
        "_index": "_index",
        "_version": "-3",
        "_id": "_id",
        "_source": {
          "@timestamp": "2023-12-29T16:51:38.000Z",
          "nginx": {
            "access": {
              "host": {
                "name": "192.168.0.1",
                "domain": "test.com"
              },
              "request_time": 0,
              "remote_ip_list": [
                "192.168.0.1"
              ]
            }
          },
          "_tmp": {},
          "related": {
            "ip": [
              "192.168.0.1"
            ]
          },
          "http": {
            "request": {
              "method": "GET"
            },
            "version": "1.1",
            "response": {
              "body": {
                "bytes": 0
              },
              "status_code": 304
            }
          },
          "source": {
            "address": "192.168.0.1",
            "ip": "192.168.0.1"
          },
          "event": {
            "ingested": "2023-12-29T19:06:52.039365664Z",
            "original": """192.168.0.1 - - [29/Dec/2023:16:51:38 +0000] "GET / HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" "-" "192.168.0.1" sn="test.com" rt=0.000 ua="-" us="-" ut="-" ul="-" cs=-""",
            "created": "2023-12-29T18:19:51.218Z",
            "kind": "event",
            "category": [
              "web"
            ],
            "type": [
              "access"
            ],
            "outcome": "success"
          },
          "user_agent": {
            "name": "Chrome",
            "original": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
            "os": {
              "name": "Mac OS X",
              "version": "10.15.7",
              "full": "Mac OS X 10.15.7"
            },
            "device": {
              "name": "Mac"
            },
            "version": "120.0.0.0"
          },
          "url": {
            "path": "/",
            "original": "/"
          }
        },
        "_ingest": {
          "timestamp": "2023-12-29T19:06:52.039365664Z"
        }
      }
    },
    {
      "doc": {
        "_index": "_index",
        "_version": "-3",
        "_id": "_id",
        "_source": {
          "@timestamp": "2023-11-17T22:38:57.000Z",
          "nginx": {
            "access": {
              "upstream_status": "500",
              "request_time": 0.003,
              "upstream_addr": "192.168.0.1:80",
              "upstream_response_length": 256,
              "host": {
                "name": "test.com",
                "domain": "test.com"
              },
              "upstream_response_time": 0,
              "remote_ip_list": [
                "192.168.0.12"
              ]
            }
          },
          "_tmp": {},
          "related": {
            "ip": [
              "192.168.0.12"
            ]
          },
          "http": {
            "request": {
              "method": "POST",
              "referrer": "https://test.com/home/secure/"
            },
            "version": "1.1",
            "response": {
              "body": {
                "bytes": 256
              },
              "status_code": 500
            }
          },
          "source": {
            "address": "192.168.0.12",
            "ip": "192.168.0.12"
          },
          "event": {
            "ingested": "2023-12-29T19:06:52.039382244Z",
            "original": """192.168.0.12 - - [17/Nov/2023:22:38:57 +0000] "POST /home/secure/test HTTP/1.1" 500 256 "https://test.com/home/secure/" "Mozilla/5.0 (X11; CrOS x86_64 14541.0.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36" "-" "test.com" sn="test.com" rt=0.003 ua="192.168.0.1:80" us="500" ut="0.000" ul="256" cs=-""",
            "created": "2023-12-29T18:19:51.218Z",
            "kind": "event",
            "category": [
              "web"
            ],
            "type": [
              "access"
            ],
            "outcome": "failure"
          },
          "user_agent": {
            "name": "Chrome",
            "original": "Mozilla/5.0 (X11; CrOS x86_64 14541.0.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36",
            "os": {
              "name": "Chrome OS",
              "version": "14541.0.0",
              "full": "Chrome OS 14541.0.0"
            },
            "device": {
              "name": "Other"
            },
            "version": "116.0.0.0"
          },
          "url": {
            "path": "/home/secure/test",
            "original": "/home/secure/test"
          }
        },
        "_ingest": {
          "timestamp": "2023-12-29T19:06:52.039382244Z"
        }
      }
    }
  ]
}

Thank you so much for your help!
