Pipeline Not Working: Filebeat -> Elasticsearch (5.0.0 Alpha 3)

I'm parsing IIS logs and am unable to get the 'message' field to parse in Elasticsearch. These are the relevant lines of my filebeat.yml file:

    output:
      ### Elasticsearch as output
      elasticsearch:
        hosts: ["elk-09:9200", "elk-10:9200"]
        parameters: {pipeline: filebeat_pipeline}

This is my pipeline, taken from elk-01:9200/_ingest/pipeline/filebeat_pipeline?pretty:

{
  "pipelines" : [ {
    "id" : "filebeat_pipeline",
    "config" : {
      "description" : "grok iis log messages",
      "processors" : [ {
        "grok" : {
          "field" : "message",
          "patterns" : [ "%{TIMESTAMP_ISO8601:Log_Timestamp} %{HOSTNAME:Hostname} %{IP:Source_IP} %{WORD:Request_Method} %{URIPATH:URI_Stem} %{NOTSPACE:URI_Query} %{NUMBER:Port} %{NOTSPACE:Username} %{IP:Clienthost} %{NOTSPACE:Browser_Request} %{NUMBER:Status} %{NUMBER:Subresponse} %{NUMBER:SC_Status} %{NUMBER:Request_Time_ms}" ]
        }
      } ]
    }
  } ]
}

If I pass a log line like this one:
2016-06-28 19:26:36 machine-name 8.8.8.8 POST /home.aspx - 80 - 8.8.8.9 Mozilla/4.0+(compatible;+Win32;+WinHttp.WinHttpRequest.5) 200 0 0 20
through curl with ?pipeline=filebeat_pipeline appended to the request, and then pull back the data, I get a properly parsed 'message' field. However, I don't get a properly parsed 'message' field through the normal log pushes; in Kibana I still see the whole log line in the 'message' field.
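For reference, the curl check I'm describing can be reproduced with the ingest simulate API, which runs the pipeline against a sample document without indexing anything. A sketch (elk-01 is one of my nodes; adjust the host to your cluster):

```shell
# Run the pipeline against a sample IIS log line; the response contains the
# grokked fields if the pattern matches, or an error if it doesn't.
curl -XPOST 'http://elk-01:9200/_ingest/pipeline/filebeat_pipeline/_simulate?pretty' -d '
{
  "docs": [
    {
      "_source": {
        "message": "2016-06-28 19:26:36 machine-name 8.8.8.8 POST /home.aspx - 80 - 8.8.8.9 Mozilla/4.0+(compatible;+Win32;+WinHttp.WinHttpRequest.5) 200 0 0 20"
      }
    }
  ]
}'
```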

I've run filebeat.exe with the -configtest and -e flags and no issues were reported. I'm not seeing anything in the logs on either the Filebeat side or the Elasticsearch side.

I've also refreshed the field list.

The only thing I haven't done is restarted the cluster, but according to the documentation pipelines are updated without having to restart services.
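For what it's worth, re-registering a pipeline is just a PUT against the ingest API and is picked up by subsequent requests without any restart. A sketch, reusing the definition from above (the grok pattern is truncated here for brevity; the full one is in my earlier post):

```shell
# Re-register the pipeline in place; the change is stored in cluster state
# and applies to the very next index/bulk request that names this pipeline.
curl -XPUT 'http://elk-01:9200/_ingest/pipeline/filebeat_pipeline' -d '
{
  "description": "grok iis log messages",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{TIMESTAMP_ISO8601:Log_Timestamp} %{HOSTNAME:Hostname} ..."]
      }
    }
  ]
}'
```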

Can you share one of the unprocessed JSON documents that are stored in Elasticsearch?

Somehow the indentation of your config looks off; it could also be a copy/paste error. Can you try the following config:

output.elasticsearch:
  hosts: ["elk-09:9200", "elk-10:9200"]
  parameters:
    pipeline: filebeat_pipeline

Config test does not detect if there are config options inside which should not be there.

Below is the unformatted JSON. I updated filebeat.yml and saw no change in behavior. Also, for what it's worth, I'm not rolling this update out across all my Filebeat servers, just a single one for testing; once I have working pipelines, I'll migrate the change across the board.

{
  "_index": "filebeat-2016.06.29",
  "_type": "log",
  "_id": "AVWb47nSIcnEGb44gJx0",
  "_score": null,
  "_source": {
    "@timestamp": "2016-06-29T11:18:22.402Z",
    "beat": {
      "hostname": "server-01",
      "name": "server-01"
    },
    "count": 1,
    "fields": null,
    "input_type": "log",
    "message": "2016-06-29 11:17:51 server-01 8.8.8.8 POST /home.html - 80 - 8.8.8.9 Mozilla/4.0+(compatible;+Win32;+WinHttp.WinHttpRequest.5) 200 0 0 127",
    "offset": 1453345,
    "source": "C:\\Windows\\System32\\LogFiles\\W3SVC1\\u_ex160629.log",
    "type": "log"
  },
  "fields": {
    "@timestamp": [
      1467199102402
    ]
  },
  "highlight": {
    "beat.name": [
      "@kibana-highlighted-field@QA@/kibana-highlighted-field@-@kibana-highlighted-field@server-01@/kibana-highlighted-field@-@kibana-highlighted-field@02@/kibana-highlighted-field@"
    ],
    "beat.hostname": [
      "@kibana-highlighted-field@QA@/kibana-highlighted-field@-@kibana-highlighted-field@server@/kibana-highlighted-field@-@kibana-highlighted-field@01@/kibana-highlighted-field@"
    ],
    "message": [
      "2016-06-29 11:17:51 @kibana-highlighted-field@QA@/kibana-highlighted-field@-@kibana-highlighted-field@server@/kibana-highlighted-field@-@kibana-highlighted-field@01@/kibana-highlighted-field@ 8.8.8.9 POST /home.html - 80 - 8.8.8.8 Mozilla/4.0+(compatible;+Win32;+WinHttp.WinHttpRequest.5) 200 0 0 127"
    ]
  },
  "sort": [
    1467199102402
  ]
}

Can you try to get a trace via tcpdump?

Are both Filebeat and Elasticsearch version 5.0.0-alpha3?

Yes, I'm using the 5.0.0 Alpha 3 release on both Filebeat and Elasticsearch.

Also, what specific information are you looking for from a tcpdump? I know messages are making it to the Elastic cluster, and running curl -GET localhost:9200/_ingest/pipeline/filebeat_pipeline on each server shows the proper pipeline output. Filebeat is specifically sending output to the 09 and 10 servers, and those servers are set to be ingest-only nodes: node.data and node.master are set to false.
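If it helps, this is how I'm verifying the node roles from the command line (a sketch; in the 5.x _cat API the node.role column shows "i" for ingest, "m" for master, "d" for data):

```shell
# List each node's name and roles; an ingest-only node should show just "i".
curl 'http://elk-09:9200/_cat/nodes?v&h=name,node.role'
```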

I'm interested in the HTTP Request line (if URL is generated correctly) + the elasticsearch response code.
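Something along these lines should surface both (a sketch; run it on the Filebeat host, with the interface and port adjusted to your setup):

```shell
# Print HTTP traffic to Elasticsearch in ASCII and keep only the request
# and status lines; look for "?pipeline=" on the POST /_bulk requests.
tcpdump -i any -A -s 0 'tcp port 9200' | grep --line-buffered -E 'POST /_bulk|HTTP/1\.1 [0-9]+'
```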

I'm getting a 200 response, and the URI is going to /_bulk. I guess that answers that!

So what's going wrong in my config that it's not going to _ingest, but _bulk?

ingest node doc says:

To use a pipeline, you simply specify the pipeline parameter on an index or bulk request to tell the ingest node which pipeline to use. For example:

That is, the bulk request should look something like:

_bulk?pipeline=filebeat_pipeline.

Is the parameter present in your trace?
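For comparison, a hand-made bulk request with the parameter set would look like this (index name and document are just examples; the pipeline id is the one from this thread):

```shell
# The ?pipeline= parameter routes every document in the bulk body through
# the pipeline. The body is NDJSON: an action line, then the source line.
curl -XPOST 'http://elk-10:9200/_bulk?pipeline=filebeat_pipeline' -d '
{"index":{"_index":"filebeat-2016.06.29","_type":"log"}}
{"message":"2016-06-28 19:26:36 machine-name 8.8.8.8 POST /home.aspx - 80 - 8.8.8.9 Mozilla/4.0+(compatible;+Win32;+WinHttp.WinHttpRequest.5) 200 0 0 20"}
'
```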

Here is some documentation on how to set up Filebeat with an ingest node.

Maybe there is a problem with indentation?

If I use curl from the command line, I see the pipeline working as expected and my log is parsed successfully, but it's not working via the filebeat.yml configuration.

The parameter isn't in my trace; the requests go to elk-10:9200/_bulk.

My configuration matches the configuration example in that link:

    output.elasticsearch:
      hosts: ["localhost:9200"]
      parameters: {pipeline: my_pipeline_id}

My configuration:

    output:
      ### Elasticsearch as output
      elasticsearch:
        hosts: ["elk-10:9200"]
        parameters: {pipeline: filebeat_pipeline}

Indentation was off in my initial post due to the forum formatting.

I tried adding the path in a number of places and got 404s wherever I put it. I tried setting the host to hosts: ["http://elk-10:9200/_ingest/pipeline/filebeat_pipeline"], and I also tried setting path: "/_ingest/pipeline/filebeat_pipeline". Both times I received 404s.

I have the same problem. I was able to tcpdump the requests from Filebeat and noticed that there's no pipeline parameter in them. That is, the request looks like

    POST /_bulk HTTP/1.1
    Host: 192.168.xx.xx:9200
    User-Agent: Go-http-client/1.1
    Content-Length: 2668
    Accept: application/json
    Accept-Encoding: gzip

instead of

    POST /_bulk?pipeline=mypipe HTTP/1.1
    Host: 192.168.xx.xx:9200
    User-Agent: Go-http-client/1.1
    Content-Length: 2668
    Accept: application/json
    Accept-Encoding: gzip

I tried different configuration options for the output.elasticsearch.parameters value in the Filebeat config, namely

    parameters: {pipeline: "mypipe"}

(as shown in https://www.elastic.co/guide/en/beats/filebeat/master/configuring-ingest-node.html), as well as

    parameters:
      pipeline: mypipe

or

    parameters:
      pipeline: "mypipe"

Hm, can you try with Filebeat versions 5.0.0-alpha2 and 5.0.0-alpha1?

alpha 2 download page: https://www.elastic.co/downloads/past-releases/filebeat-5-0-0-alpha2

alpha 1 download page: https://www.elastic.co/downloads/past-releases/filebeat-5-0-0-alpha1

If it works with either of those two but not with alpha3 or alpha4, please open a ticket on GitHub. This is the kind of issue the pioneer program is for.

Works fine with Alpha 2. I'll check out Alpha 4 and open up a ticket if it doesn't work.

Link to the github bug for those who are interested: https://github.com/elastic/beats/issues/1962

Thanks guys for the quick responses!

Thanks for testing. Very much appreciated.

@fixone @petercort Please see https://github.com/elastic/beats/issues/1962. We tried to reproduce the issue but didn't manage to.
