Filebeat not uploading latest ingest grok pattern to elasticsearch

I'm trying to debug an issue with my grok pattern (specifically, it seems to stop parsing the field at a space, even though the pattern requires a } at the end), but no matter what I put in the JSON file for my ingest pipeline, Elasticsearch continues to use the previous definition.
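For context, the pipeline file is an ordinary ingest pipeline JSON with a grok processor, roughly of this shape (the description, field name, and pattern below are placeholders, not my real ones):

{
  "description": "illustrative pipeline only - real fields and pattern differ",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["^something \\{%{DATA:my.field}\\}$"]
      }
    }
  ]
}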

I've even tried changing the field names and making various other changes that should cause it to fail to parse the log lines properly, but it continues to ingest using the previous definition.
I even tried updating the module version number in the manifest.yml for the fileset I'm changing, but to no avail. I'm not even using the filebeat service - just running in immediate mode from the command line.

There's nothing in the filebeat logs that seems to shed any light on this, and a quick look over the source code doesn't reveal anything obvious.

Any clues as to what I might need to do to trigger filebeat to tell elasticsearch to update its pipeline definition? Can I give the pipeline a specific name?

To follow up: I was finally able to get it working by renaming the JSON file and changing manifest.yml to point to the new name.
Then I discovered there's an option "filebeat.overwrite_pipelines: true" for filebeat.yml, which also solves the problem.
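For anyone else who hits this, the filebeat.yml line is simply:

# re-upload module ingest pipelines even if a pipeline with the same name already exists
filebeat.overwrite_pipelines: true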
But it strikes me as very unexpected behaviour that if I just change the file that defines the pipeline, it doesn't get updated. Surely it should do some sort of hash check to see if the definition needs updating?

Hi,

You haven't given a lot of detail about which file you're updating specifically and why you think filebeat should detect that you changed it. Unless I'm mistaken, you're updating a file that filebeat really doesn't expect you to modify, in the "module" folder?
The expectation is that "modules.d" is user configuration, but the "module" folder is essentially baked in at compile time. Furthermore, the fields created by an ingest pipeline are defined in the fields.yml file in the root folder of filebeat, and that is also not updated if you change a pipeline in a module by hand. So if you changed a field name or added a new field, it would not be added to the index template that is generated from the contents of fields.yml.
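To make that concrete, fields.yml is a list of per-module field declarations, something roughly like this (the module and field names here are placeholders, not real ones):

- key: mymodule
  title: "My Module"
  description: Placeholder entry; a real module declares its own fields here.
  fields:
    - name: mymodule.myfileset.status
      type: keyword
      description: Placeholder field.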
Feel free to point me to some documentation that you think instructs you to change the files in that directory.

I think the option to overwrite pipelines on every connection to Elasticsearch is there for setups that can suddenly start targeting a new Elasticsearch cluster that doesn't have the pipeline yet: disaster recovery, blue-green deployments, cut-over via DNS update, an extremely dynamic environment where the ES cluster goes away and comes back brand new, etc.
And not because it's expected that you will modify files under the "module" directory.

Creating a new filebeat module

ref

The ingest pipelines used to parse log lines are set up automatically the first time you run the module, assuming the Elasticsearch output is enabled.

ref.

To load them "manually/on demand" you could use the setup command.
ref.
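For example, something like this (the module name here is just a placeholder):

filebeat setup --pipelines --modules nginx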

In short, maybe a dev from Elastic will be able to chime in, but I think you're editing a file under the module directory that the software considers static. The Filebeat documentation doesn't say you can add modules under the "module" directory OR modify them; on the contrary, it says you need to do that before compiling filebeat.
I don't think you should do that at all, even if you think you've found a hack that makes it work with the overwrite option. If you change the fields, you'll have another problem with the index template. Also, have you checked what happens when you update to the next patch-level version? Will your change be removed by the upgrade (e.g. an RPM package update via YUM or similar)? That would also throw a curveball.

I hope this helps, let me know if you think I'm wrong,

Martin

The file I'm editing is under modules.d, i.e. a custom module.
The only file causing me an issue is the JSON file pointed to by the ingest_pipeline setting. Everything else gets picked up automatically (including, as I said, even that JSON file, provided I rename it). I'm not modifying field names, just the grok patterns that parse each entry into those fields.
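For reference, the fileset's manifest.yml points at the pipeline roughly like this (the paths and names are placeholders, not the real ones):

module_version: "1.0"

var:
  - name: paths
    default:
      - /var/log/myapp/*.log

ingest_pipeline: ingest/pipeline.json
input: config/myfileset.yml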
Anyway, I think I found the answer: you're supposed to run filebeat setup --pipelines --modules system if you've changed them.

I'm curious to understand what you're doing, to see if it could be useful to me too.
You're editing a JSON file under modules.d that is a pipeline definition file?
No such file is there normally, so you put it there yourself?
Is there also a module by that name in the module directory?

By custom module, what do you mean? You're customizing a module that came with filebeat... changing the config of the module? Or you created an entirely new module and put all of its files under the modules.d directory?

Can you share the files and their location, the hierarchy?
I didn't know you could put pipeline definition files under modules.d, I feel I'm missing something interesting.

Martin

Sorry, I was wrong: the file I was modifying was under /usr/share/filebeat/module/{my-module}/{fileset}/.
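To answer the earlier question about the hierarchy, the layout is the standard fileset structure, roughly (names are placeholders):

/usr/share/filebeat/module/{my-module}/
    {fileset}/
        manifest.yml
        config/{fileset}.yml
        ingest/pipeline.json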
Interestingly, when the server is first spun up with the files under this directory, it just runs

systemctl enable filebeat
/bin/filebeat modules enable nginx
systemctl restart filebeat

But nginx is NOT the name of our custom module (and in fact on this particular server, we don't even use the nginx module).
I'm not sure whether that is enough to allow any existing pipelines to get updated, but I've already spent more time on this than I can justify. filebeat.overwrite_pipelines: true seems to work just fine and hasn't caused any performance or other issues that I'm aware of.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.