Filebeat fails when I configure an ingest pipeline in filebeat.yml

Hi. I have Filebeat working OK and shipping CSV logs to Elastic Cloud.
I have created a new ingest pipeline in Elasticsearch that takes the message and creates some fields:

[
  {
    "csv": {
      "field": "message",
      "target_fields": [
        "myapp-id",
        "myapp-url",
        "myapp-ip",
        "myapp-method",
        "myapp-request_params",
        "myapp-request_headers",
        "myapp-request_body",
        "myapp-response_code",
        "myapp-response_headers",
        "myapp-response_body",
        "myapp-created_at",
        "myapp-duration"
      ],
      "quote": "'",
      "ignore_missing": false
    }
  },
  {
    "date": {
      "field": "myapp-created_at",
      "formats": [
        "ISO8601"
      ]
    }
  },
  {
    "remove": {
      "field": "message"
    }
  }
]
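For reference, that is just the processor array; registered via the API under the name my-custom-pipeline (the name I use in filebeat.yml below), the full pipeline would look roughly like this. I actually created it through the GUI, so this is just a sketch of the equivalent request:

PUT _ingest/pipeline/my-custom-pipeline
{
  "description": "Parse myapp CSV log lines into fields",
  "processors": [
    {
      "csv": {
        "field": "message",
        "target_fields": [
          "myapp-id", "myapp-url", "myapp-ip", "myapp-method",
          "myapp-request_params", "myapp-request_headers", "myapp-request_body",
          "myapp-response_code", "myapp-response_headers", "myapp-response_body",
          "myapp-created_at", "myapp-duration"
        ],
        "quote": "'",
        "ignore_missing": false
      }
    },
    { "date": { "field": "myapp-created_at", "formats": [ "ISO8601" ] } },
    { "remove": { "field": "message" } }
  ]
}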

The pipeline has been tested with a sample document I got from the Filebeat console output.

Now, I want all messages from this filebeat to be processed with this pipeline, so I configured filebeat.yml:

- type: log
  enabled: true
  paths:
    - /mnt/myfile_log.csv*
  fields:
    app: midlayer-pro
  multiline.type: pattern
  multiline.pattern: '^\d,'
  multiline.negate: true
  multiline.match: after
  pipeline: "my-custom-pipeline"

When I add the "pipeline" line, Filebeat stops sending logs.
Am I missing something? I have read the documentation and examples and I can't find what is wrong. I have spent many hours testing without any progress.
Any help greatly appreciated. Thanks in advance.

Hi @feliperuiz. Welcome to the community, and thanks for providing this information and the configs.

A couple things to investigate

What are the errors from Filebeat when you start it with the pipeline?

And when you start Filebeat without the pipeline, is it inserting documents correctly (unparsed), with the multiline handling working and everything?

When you say ...

The pipeline has been tested with a sample document I got from the Filebeat console output.

How did you test this: with _simulate, or did you actually insert a few documents into the index using the pipeline?
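A _simulate run would look roughly like this (a sketch, assuming the pipeline name my-custom-pipeline from your config, with your real sample line in place of the placeholder):

POST _ingest/pipeline/my-custom-pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "your sample message here"
      }
    }
  ]
}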

Did you create a mapping?

Can you provide some sample documents?

Ohh and what version? :slight_smile:

Hi @stephenb . Thanks for taking the time for replying.
When I start Filebeat without the pipeline, it inserts documents OK, with the multiline handling working and joining the whole message into one document.
The error I can see in syslog when Filebeat fails is the following (I removed some text at the end because the messages are very long):

Nov 26 14:19:34 server1 filebeat[32191]: 2021-11-26T14:19:34.503Z#011WARN#011[elasticsearch]#011elasticsearch/client.go:408#011Cannot index event publisher.Event
{Content:beat.Event{Timestamp:time.Time{wall:0xc060591c601221da, ext:1052765954, loc:(*time.Location)(0x5631e6d56dc0)}, Meta:null, 
Fields:{"agent":{"ephemeral_id":"a5e855ec-640b-420a-99fe-4dba40928a15","hostname":"server1","id":"d208eb89-6da3-4f09-ba13-5be85c5c3744","name":"server1","type":"filebeat","version":"7.13.2"}...........

To test the pipeline, I ran Filebeat with console output enabled, copied some lines, and pasted them into the GUI (Kibana / Stack Management / Ingest Node Pipelines).

About the mapping... I didn't do that. Is it mandatory? If it is, how can I do it from the GUI?

Filebeat version 7.13.2
Elastic Cloud (I don't know how to see the version there)

Thanks

It could be several things...

Could you try this?

You will need a sample message like the one you cut and pasted, and the name of the index that you are writing to, e.g. filebeat-7.13.2-000001.

Go to Kibana -> Dev Tools

If you do not know the index you are writing to, run:

GET _cat/indices/file*?v

Then we will try to actually insert a document through the pipeline, writing it to that index.

POST your-filebeat-index/_doc/?pipeline=my-custom-pipeline
{
   "message": "your sample message here"
}

Let me know what the result is.

BTW, you can run this to get the version:

GET /

Can you provide a couple of sample documents?

You will want to create a mapping for production but we can get back to that in a bit...

OK, I just re-read this... I am a bit confused: why are you joining all the messages into one document? Please provide a sample of your "message" as it arrives at the pipeline... your messages should be 1-to-1 with the Elasticsearch documents that reach the pipeline: one message equals one Elasticsearch document... perhaps I am just misunderstanding.

This is the document I am using to test the pipeline:

[{"_index":"filebeat-7.13.2-2021.11.26-000253","_id":"id","_source":
{
  "@timestamp": "2021-11-26T12:37:50.258Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.13.2"
  },
  "ecs": {
    "version": "1.8.0"
  },
  "host": {
    "name": "server1",
    "mac": [
      "xx:xx:xx:xx:xx:xx",
      "xx:xx:xx:xx:xx:xx"
    ],
    "hostname": "server1",
    "architecture": "x86_64",
    "os": {
      "name": "Ubuntu",
      "kernel": "4.15.0-159-generic",
      "codename": "bionic",
      "type": "linux",
      "platform": "ubuntu",
      "version": "18.04.5 LTS (Bionic Beaver)",
      "family": "debian"
    },
    "id": "123123123123123123",
    "containerized": false,
    "ip": [
      "123.123.123.123",
      "1234::1234:1234:1234:1234"
    ]
  },
  "agent": {
    "hostname": "server1",
    "ephemeral_id": "abc1234-1234-1234-1234-abc123abc123",
    "id": "d208eb89-1234-1234-1234-abc123abc123",
    "name": "server1",
    "type": "filebeat",
    "version": "7.13.2"
  },
  "cloud": {
    "service": {
      "name": "virtualmachine"
    },
    "provider": "Provider",
    "instance": {
      "id": "123123123"
    },
    "region": "abc3"
  },
  "container": {
    "id": "myfile_log.csv"
  },
  "log": {
    "offset": 3481619131,
    "file": {
      "path": "/mnt/myfile_log.csv"
    },
    "flags": [
      "multiline"
    ]
  },
  "message": "3,https://mysite.com/folder/qs?parameter1=ABC&page=1,,GET,null,'{\"Host\":[\"www.mysite.com\"],\"Authorization\":[\"Bearer abc-fgh-IJ\"],\"Accept\":[\"application\\/app.version.v5+json\"],\"Content-Type\":[\"application\\/app.version.v5+json\"],\"User-Agent\":[\"Agent\\/1.6.0\"]}',,200,'{\"Server\":[\"nginx\"],\"Date\":[\"Fri, 26 Nov 2021 12:19:09 GMT\"],\"Content-Type\":[\"application\\/app.version.v5+json;charset=UTF-8\"],\"Transfer-Encoding\":[\"chunked\"],\"Connection\":[\"keep-alive\"],\"Vary\":[\"Origin\",\"Access-Control-Request-Method\",\"Access-Control-Request-Headers\",\"Origin,Access-Control-Request-Method,Access-Control-Request-Headers\"],\"X-RateLimit-Limit\":[\"25\"],\"X-RateLimit-Remaining\":[\"24\"],\"X-RateLimit-Reset\":[\"51\"],\"x-request-id\":[\"123-abc-123\"],\"x-rid\":[\"123-abc\"],\"warning\":[\"299 - \\u0022Test API\\u0022\"],\"x-envoy-upstream-service-time\":[\"27\"],\"Strict-Transport-Security\":[\"max-age=86400\"],\"X-Content-Type-Options\":[\"nosniff\"],\"X-Frame-Options\":[\"SAMEORIGIN\"],\"X-XSS-Protection\":[\"1; mode=block\"]}','{\n  \"orders\" : [ {\n    \"wharehouse\" : \"123\",\n    \"orderPlacedDateTime\" : \"2021-11-26T10:47:23+01:00\",\n    \"orderItems\" : [ {\n      \"orderItemId\" : \"123\",\n      \"ean\" : \"123\",\n      \"quantity\" : 1,\n      \"quantityShipped\" : 0,\n      \"quantityCancelled\" : 0\n    } ]\n  }, {\n    \"wharehouse\" : \"123\",\n    \"orderPlacedDateTime\" : \"2021-11-26T07:47:14+01:00\",\n    \"orderItems\" : [ {\n      \"orderItemId\" : \"1234\",\n      \"ean\" : \"1234\",\n      \"quantity\" : 1,\n      \"quantityShipped\" : 0,\n      \"quantityCancelled\" : 0\n    } ]\n  }, {\n    \"wharehouse\" : \"345\",\n    \"orderPlacedDateTime\" : \"2021-11-25T22:22:29+01:00\",\n    \"orderItems\" : [ {\n      \"orderItemId\" : \"345\",\n      \"ean\" : \"123\",\n      \"quantity\" : 1,\n      \"quantityShipped\" : 0,\n      \"quantityCancelled\" : 0\n    }, {\n      \"orderItemId\" : \"345\",\n      \"ean\" : \"345\",\n      \"quantity\" : 1,\n      \"quantityShipped\" : 0,\n      \"quantityCancelled\" : 0\n    } ]\n  } ]\n}',2021-11-26T12:19:09+00:00,88",
  "input": {
    "type": "log"
  }
}
}]

I created that document by running Filebeat with console output enabled and pasting one line in the middle of this wrapper:

[{"_index":"filebeat-7.13.2-2021.11.26-000253","_id":"id","_source":
<FILEBEAT OUTPUT>
}]

When I use it as a test document for the pipeline, the result is OK: the pipeline identifies the fields and creates them with the corresponding field names.

About this question: my log file is created by an application that writes several lines per log entry, so I have to join those lines into one big line and then ship it to Elasticsearch. That is working OK. I can see the documents in Kibana, where the "message" field contains all the information from those log lines.

POST your-filebeat-index/_doc/?pipeline=my-custom-pipeline
{
   "message": "your sample message here"
}

This didn't work. I tried with:

"message": "3,https://mysite.com/folder/qs?parameter1=ABC&page=1,,GET,null,'{\"Host\":[\"www.mysite.com\"],\"Authorization\":[\"Bearer abc-fgh-IJ\"],\"Accept\":[\"application\\/app.version.v5+json\"],\"Content-Type\":[\"application\\/app.version.v5+json\"],\"User-Agent\":[\"Agent\\/1.6.0\"]}',,200,'{\"Server\":[\"nginx\"],\"Date\":[\"Fri, 26 Nov 2021 12:19:09 GMT\"],\"Content-Type\":[\"application\\/app.version.v5+json;charset=UTF-8\"],\"Transfer-Encoding\":[\"chunked\"],\"Connection\":[\"keep-alive\"],\"Vary\":[\"Origin\",\"Access-Control-Request-Method\",\"Access-Control-Request-Headers\",\"Origin,Access-Control-Request-Method,Access-Control-Request-Headers\"],\"X-RateLimit-Limit\":[\"25\"],\"X-RateLimit-Remaining\":[\"24\"],\"X-RateLimit-Reset\":[\"51\"],\"x-request-id\":[\"123-abc-123\"],\"x-rid\":[\"123-abc\"],\"warning\":[\"299 - \\u0022Test API\\u0022\"],\"x-envoy-upstream-service-time\":[\"27\"],\"Strict-Transport-Security\":[\"max-age=86400\"],\"X-Content-Type-Options\":[\"nosniff\"],\"X-Frame-Options\":[\"SAMEORIGIN\"],\"X-XSS-Protection\":[\"1; mode=block\"]}','{\n \"orders\" : [ {\n \"wharehouse\" : \"123\",\n \"orderPlacedDateTime\" : \"2021-11-26T10:47:23+01:00\",\n \"orderItems\" : [ {\n \"orderItemId\" : \"123\",\n \"ean\" : \"123\",\n \"quantity\" : 1,\n \"quantityShipped\" : 0,\n \"quantityCancelled\" : 0\n } ]\n }, {\n \"wharehouse\" : \"123\",\n \"orderPlacedDateTime\" : \"2021-11-26T07:47:14+01:00\",\n \"orderItems\" : [ {\n \"orderItemId\" : \"1234\",\n \"ean\" : \"1234\",\n \"quantity\" : 1,\n \"quantityShipped\" : 0,\n \"quantityCancelled\" : 0\n } ]\n }, {\n \"wharehouse\" : \"345\",\n \"orderPlacedDateTime\" : \"2021-11-25T22:22:29+01:00\",\n \"orderItems\" : [ {\n \"orderItemId\" : \"345\",\n \"ean\" : \"123\",\n \"quantity\" : 1,\n \"quantityShipped\" : 0,\n \"quantityCancelled\" : 0\n }, {\n \"orderItemId\" : \"345\",\n \"ean\" : \"345\",\n \"quantity\" : 1,\n \"quantityShipped\" : 0,\n \"quantityCancelled\" : 0\n } ]\n } ]\n}',2021-11-26T12:19:09+00:00,88",

Thanks.

EDIT:
This is the error I get when I try it:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "parse_exception",
        "reason" : "Failed to parse content to map"
      }
    ],
    "type" : "parse_exception",
    "reason" : "Failed to parse content to map",
    "caused_by" : {
      "type" : "json_parse_exception",
      "reason" : "Unexpected character ('}' (code 125)): was expecting double-quote to start field name\n at [Source: (byte[])\"{\r\n  \"message\": \"3,https://mysite.com/folder/qs?parameter1=ABC&page=1,,GET,null,'{\\\"Host\\\":[\\\"www.mysite.com\\\"],\\\"Authorization\\\":[\\\"Bearer abc-fgh-IJ\\\"],\\\"Accept\\\":[\\\"application\\\\/app.version.v5+json\\\"],\\\"Content-Type\\\":[\\\"application\\\\/app.version.v5+json\\\"],\\\"User-Agent\\\":[\\\"Agent\\\\/1.6.0\\\"]}',,200,'{\\\"Server\\\":[\\\"nginx\\\"],\\\"Date\\\":[\\\"Fri, 26 Nov 2021 12:19:09 GMT\\\"],\\\"Content-Type\\\":[\\\"application\\\\/app.version.v5+json;charset=UTF-8\\\"],\\\"Transfer-Encoding\\\":[\\\"chunked\\\"],\\\"Connection\\\":[\\\"k\"[truncated 1673 bytes]; line: 3, column: 2]"
    }
  },
  "status" : 400
}

It's strange that it works when I test it manually, but it doesn't when I do it via the API.

That is a complex message; you need to be really careful when cutting and pasting.

Please try exactly this and let me know what you get... do not format it, do not change anything, just cut and paste it into Dev Tools and run it:

POST filebeat-7.13.2-2021.11.26-000253/_doc/?pipeline=my-custom-pipeline
{
  "@timestamp": "2021-11-26T12:37:50.258Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.13.2"
  },
  "ecs": {
    "version": "1.8.0"
  },
  "host": {
    "name": "server1",
    "mac": [
      "xx:xx:xx:xx:xx:xx",
      "xx:xx:xx:xx:xx:xx"
    ],
    "hostname": "server1",
    "architecture": "x86_64",
    "os": {
      "name": "Ubuntu",
      "kernel": "4.15.0-159-generic",
      "codename": "bionic",
      "type": "linux",
      "platform": "ubuntu",
      "version": "18.04.5 LTS (Bionic Beaver)",
      "family": "debian"
    },
    "id": "123123123123123123543345643",
    "containerized": false,
    "ip": [
      "123.123.123.123",
      "1234::1234:1234:1234:1234"
    ]
  },
  "agent": {
    "hostname": "server1",
    "ephemeral_id": "abc1234-1234-1234-1234-abc123abc123",
    "id": "d208eb89-1234-1234-1234-abc123abc123",
    "name": "server1",
    "type": "filebeat",
    "version": "7.13.2"
  },
  "cloud": {
    "service": {
      "name": "virtualmachine"
    },
    "provider": "Provider",
    "instance": {
      "id": "123123123"
    },
    "region": "abc3"
  },
  "container": {
    "id": "myfile_log.csv"
  },
  "log": {
    "offset": 3481619131,
    "file": {
      "path": "/mnt/myfile_log.csv"
    },
    "flags": [
      "multiline"
    ]
  },
  "message": "3,https://mysite.com/folder/qs?parameter1=ABC&page=1,,GET,null,'{\"Host\":[\"www.mysite.com\"],\"Authorization\":[\"Bearer abc-fgh-IJ\"],\"Accept\":[\"application\\/app.version.v5+json\"],\"Content-Type\":[\"application\\/app.version.v5+json\"],\"User-Agent\":[\"Agent\\/1.6.0\"]}',,200,'{\"Server\":[\"nginx\"],\"Date\":[\"Fri, 26 Nov 2021 12:19:09 GMT\"],\"Content-Type\":[\"application\\/app.version.v5+json;charset=UTF-8\"],\"Transfer-Encoding\":[\"chunked\"],\"Connection\":[\"keep-alive\"],\"Vary\":[\"Origin\",\"Access-Control-Request-Method\",\"Access-Control-Request-Headers\",\"Origin,Access-Control-Request-Method,Access-Control-Request-Headers\"],\"X-RateLimit-Limit\":[\"25\"],\"X-RateLimit-Remaining\":[\"24\"],\"X-RateLimit-Reset\":[\"51\"],\"x-request-id\":[\"123-abc-123\"],\"x-rid\":[\"123-abc\"],\"warning\":[\"299 - \\u0022Test API\\u0022\"],\"x-envoy-upstream-service-time\":[\"27\"],\"Strict-Transport-Security\":[\"max-age=86400\"],\"X-Content-Type-Options\":[\"nosniff\"],\"X-Frame-Options\":[\"SAMEORIGIN\"],\"X-XSS-Protection\":[\"1; mode=block\"]}','{\n  \"orders\" : [ {\n    \"wharehouse\" : \"123\",\n    \"orderPlacedDateTime\" : \"2021-11-26T10:47:23+01:00\",\n    \"orderItems\" : [ {\n      \"orderItemId\" : \"123\",\n      \"ean\" : \"123\",\n      \"quantity\" : 1,\n      \"quantityShipped\" : 0,\n      \"quantityCancelled\" : 0\n    } ]\n  }, {\n    \"wharehouse\" : \"123\",\n    \"orderPlacedDateTime\" : \"2021-11-26T07:47:14+01:00\",\n    \"orderItems\" : [ {\n      \"orderItemId\" : \"1234\",\n      \"ean\" : \"1234\",\n      \"quantity\" : 1,\n      \"quantityShipped\" : 0,\n      \"quantityCancelled\" : 0\n    } ]\n  }, {\n    \"wharehouse\" : \"345\",\n    \"orderPlacedDateTime\" : \"2021-11-25T22:22:29+01:00\",\n    \"orderItems\" : [ {\n      \"orderItemId\" : \"345\",\n      \"ean\" : \"123\",\n      \"quantity\" : 1,\n      \"quantityShipped\" : 0,\n      \"quantityCancelled\" : 0\n    }, {\n      \"orderItemId\" : \"345\",\n      \"ean\" : \"345\",\n      \"quantity\" : 1,\n      \"quantityShipped\" : 0,\n      \"quantityCancelled\" : 0\n    } ]\n  } ]\n}',2021-11-26T12:19:09+00:00,88",
  "input": {
    "type": "log"
  }
}

That request succeeded. This is the response:

{
  "_index" : "filebeat-7.13.2-2021.11.26-000253",
  "_type" : "_doc",
  "_id" : "OXy8XX0BOBeEDuA2WSXx",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 3,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 1284932,
  "_primary_term" : 1
}

OK, so that is good... (and bad, because I am not sure why it is not working from Filebeat).

Do this:

GET filebeat-7.13.2-2021.11.26-000253/_doc/OXy8XX0BOBeEDuA2WSXx

Does it look like what you expect / want?

Indeed, that is what I want to accomplish!

Now the missing part is how to achieve the same thing while sending documents from Filebeat.

Ok goooood.....

So please post your entire filebeat.yml

Also, how do you actually know you're not processing the logs? For example, you could be processing most of them, and then some messages that do not get parsed correctly could be failing to be ingested / indexed.

Are you sure none of the messages are getting ingested or is it just some that are failing?

None of them are received. I am checking in Kibana Discover and I can't see any new ones. If I remove the pipeline line, documents start arriving again with the message field unparsed.
And remember that I start seeing those messages in syslog, which seem to be one per log line processed.

Here is my filebeat.yml config:

filebeat.inputs:
- type: log
  enabled: false
  paths:
    - /var/log/*.log
- type: filestream
  enabled: false
  paths:
    - /var/log/*.log
- type: log
  enabled: true
  paths:
    - /mnt/myfile_log.csv*
  fields:
    app: midlayer-pro
  multiline.type: pattern
  multiline.pattern: '^\d,'
  multiline.negate: true
  multiline.match: after
  pipeline: "my-custom-pipeline"
- type: log
  enabled: true
  paths:
    - /mnt/my_otherfile_log.csv*
  fields:
    app: midlayer-pro
    event.dataset:
- type: log
  enabled: true
  paths:
    - /mnt/my_other_other_file_log.csv*
  fields:
    app: midlayer-pro
    event.dataset:

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

setup.template.settings:
  index.number_of_shards: 1

setup.kibana:
cloud.id: "myapp-logs:my_id"
cloud.auth: "elastic:secret"

output.elasticsearch:
  hosts: ["localhost:9200"]

processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~

Are you sure you're looking at the proper time range? You're overriding the timestamp in the pipeline, so if it's in the past you'll have to look further back.

You could temporarily take out the date processor in your pipeline and see if you're getting the documents.
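One quick way to check, independent of the Discover time picker, is to search for any document the pipeline has already parsed; a sketch using the myapp-id field your pipeline creates:

GET filebeat-*/_search
{
  "size": 1,
  "sort": [
    { "@timestamp": "desc" }
  ],
  "query": {
    "exists": { "field": "myapp-id" }
  }
}

If that returns a hit with an older @timestamp, the documents are getting in and it really is just the time range.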

Also, I see that you only have the pipeline on one of the inputs. Do you want it on all of them, or just the one that has it?

Also, how are you starting Filebeat? We can turn up the debugging if needed.

It's something simple at this point

I will check the timestamp. I hope that isn't the problem, or I will have wasted your time and mine.
The other inputs aren't being used (empty CSV files), but I included them for the sake of completeness.
I am starting Filebeat with systemctl, but I can run it by hand. Let me test the timestamp thing and I'll come back.

Perhaps it's the timezone...
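If it is, the date processor accepts an explicit timezone for timestamps that carry no offset; a sketch only, with the zone name as a placeholder you would swap for yours:

{
  "date": {
    "field": "myapp-created_at",
    "formats": [ "ISO8601" ],
    "timezone": "Europe/Madrid"
  }
}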

Starting it by hand, you can add -d "*"; it will be very verbose...

Also, you are aware that Filebeat will only try to read a file once... it will not re-read it unless specifically told to.

And... there they are! :grimacing:
I can see the documents with a -9 hour difference...
When I saw the syslog errors I thought there was a problem. I will have to see how to avoid those syslog messages, because it's a 10 GB log file every day and it will eat all my disk.
Anyway, thanks @stephenb, you have been very helpful and I owe you a beer.

Good to know! Plus, hopefully you learned some skills along the way!!!

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.