I'm trying to get acquainted with the ELK platform and trying to understand how the different modules interact with each other.
Besides the (internal) IP addresses, I would like the result of a DNS lookup to be included in the documents that end up in ES. The logs I am referring to are the ones from Zeek, which are shipped to ES by Filebeat using its Zeek module.
I found some documentation on processors that can be used with Filebeat. There is even a dedicated processor for DNS lookups that seems to do exactly what I am looking for.
I've added the required YAML entries to filebeat.yml, but nothing happens.
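For reference, a global DNS processor entry in filebeat.yml typically looks something like the sketch below. It is only an illustration: the field names (`source.ip`, `destination.ip` and their `domain` targets) are assumptions based on ECS naming and are not necessarily the ones tried here.

```yaml
# Top level of filebeat.yml (not nested under an input or output).
processors:
  - dns:
      # Reverse lookup: resolve an IP address to a hostname.
      type: reverse
      # Map of "field holding the IP" -> "field to write the hostname to".
      # These field names are assumed for illustration.
      fields:
        source.ip: source.domain
        destination.ip: destination.domain
```

Indentation matters here: `dns` is a list item under `processors`, and its options are indented beneath it.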
My question is whether I'm on the right path: can processors in the Filebeat configuration be used together with a Filebeat module (in this case the Zeek module), or will this not work because all processing is done by the module, bypassing the standard pipeline?
If this is not going to work, are there any other suggestions for enriching these log records with DNS lookups?
The document you linked suggests doing the enrichment on an ES ingest node, right?
How would I set up Filebeat to achieve this? Is this something I need to configure in the standard ES config?
I can't seem to find where (on the Filebeat machine) I can change the standard processing done by the Filebeat Zeek module.
After your reference to "Ingest node" I noticed the "Ingest Node Pipelines" menu option under Stack Management. Within this menu there is a separate "Ingest Node Pipeline" for each Zeek logfile. Hmm, is that where the enrichment and field-mapping magic happens? Is all the processing done on the Elastic node itself? If so, it makes sense to assign dedicated ingest nodes, as stated in the document.
I'm going to try whether making adjustments in the ingest pipeline is perhaps the way to do this.
These are obviously newbie questions, and that is exactly what I am. Please point me in the right direction so I can avoid making all the beginner mistakes.
Can you please format the config you pasted? The forum supports markdown. With YAML it's very important to see whether the configuration is correctly formatted, and you'd be surprised how often it's just an indentation issue.
Another beginner mistake. A second attempt at posting my YAML is below.
I dug a little further into the configuration. I believe that the predefined modules do not look at the main /etc/filebeat/filebeat.yml file.
I found that for each logfile within a module (in this case Zeek) a separate configuration file exists, such as connection.yml. This YAML file describes how the fields are parsed and renamed by Filebeat.
The second stage takes place in Elasticsearch, in the Ingest Node Pipeline. This pipeline is created by Filebeat during setup, based on a template located in /usr/share/zeek//ingest/pipeline.yml. In this second stage, enrichment with GeoIP (among other things) is done.
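As an illustration of what this second stage looks like, a GeoIP step in an ingest pipeline definition written in YAML might resemble the sketch below. The description text and field names are assumptions based on ECS naming; the pipeline actually shipped with the module may differ.

```yaml
# Sketch of an ingest pipeline definition (YAML form, as used in module templates).
description: Example pipeline that adds GeoIP data (illustrative only)
processors:
  - geoip:
      # Field containing the IP address to look up (assumed name).
      field: source.ip
      # Field that receives the geo information.
      target_field: source.geo
      # Do not fail the document if the field is absent.
      ignore_missing: true
on_failure:
  - set:
      field: error.message
      value: "{{ _ingest.on_failure_message }}"
```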
My conclusion is that the enrichment with DNS reverse lookups should be added in the 2nd stage (but could in theory also be done during the first stage by Filebeat).
I noticed that the Ingest Pipeline in Elasticsearch is defined in JSON, which makes editing a little more difficult (at least for me).
My question remains: what is the recommended way to make this adjustment? There seem to be multiple scenarios:
Scenario 1: Let the enrichment occur at Filebeat (stage 1) by adding the additional processor to connection.yml.
Scenario 2: Let the enrichment occur at Elastic (stage 2) by adding the additional processor in YAML to the ingest/pipeline.yml file on the Filebeat machine and re-running the setup.
Scenario 3: Let the enrichment occur at Elastic (stage 2) by adding the additional processor in JSON to the "Ingest Node Pipeline".
If I make the adjustment as described in scenario 1 or 2, what happens when Filebeat gets updated? Will this overwrite my customizations?
As mentioned in my previous post, I thought there were three possible scenarios.
It turned out that scenarios 2 and 3 will not work, since the DNS processor only exists for Filebeat and not for Elasticsearch.
I did manage to get the first scenario up and running by adding the DNS processor entries to the processor section of the module-specific connection.yml.
Path: /usr/share/filebeat/module/zeek/connection/config/connection.yml
This is the yaml I've added in the processor section to get it to work:
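A `dns` processor entry of this kind can be sketched roughly as follows. This is an illustration based on the options documented for the Filebeat `dns` processor, not necessarily the exact configuration used here; the field names, cache sizes, timeout and tag name are all assumptions.

```yaml
# Goes inside the existing "processors:" list of the fileset config.
processors:
  - dns:
      type: reverse          # resolve IP addresses to hostnames
      action: append         # keep the IP field and write the name to a new field
      fields:
        # "field with the IP": "field for the resolved name" (assumed names)
        source.ip: source.domain
        destination.ip: destination.domain
      success_cache:
        capacity.initial: 1000
        capacity.max: 10000
        min_ttl: 1m
      failure_cache:
        capacity.initial: 1000
        capacity.max: 10000
        ttl: 1m
      timeout: 500ms
      # Tag events whose lookup failed so they can be found later.
      tag_on_failure: [_dns_reverse_lookup_failed]
```

The caches are worth setting for busy Zeek sensors, since the same internal addresses tend to recur and each uncached lookup adds latency to event processing.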