Filebeat fingerprint doesn't understand columns from CSV files

Hi,

I have a Filebeat instance that sends events from CSV files to an ES cluster.
The ingest pipeline has the relevant mappings and everything works fine.

I'm trying to improve the pipeline to deduplicate events by adding a fingerprint to my processors:

processors:
  - fingerprint:
      fields: ["messageId"]
      target_field: "@metadata._id"

The ingest pipeline name is also declared in the yml config file.
For info, my CSV files do not have a header row.

Now when I launch the ingestion I get the following in the logs:

ERROR [publisher] pipeline/client.go:106 Failed to publish event: failed to compute fingerprint: failed to find field [messageId] in event: key not found

I guess this is because Filebeat doesn't know which CSV column the messageId field is in.
I can't find how to declare my CSV structure in the filebeat.yml file. Should I just add a fields config declaring my CSV columns in the right order? Will it understand the order of the columns? I'm not fully sure what to do here.

I don't think the decode_csv_fields processor works the way you think.

It will output the values as an array of strings

Perhaps dissect processor would be a better fit.


If you are doing the csv parsing in Ingest Node then those fields will not exist when Filebeat is processing the event. You could use the Elasticsearch fingerprint processor instead.
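A minimal sketch of that approach — the pipeline name `my-csv-pipeline` and the column layout (`messageId` as the first column, a hypothetical `payload` as the second) are assumptions, since the original post doesn't show the actual pipeline:

```json
PUT _ingest/pipeline/my-csv-pipeline
{
  "processors": [
    {
      "csv": {
        "field": "message",
        "target_fields": ["messageId", "payload"]
      }
    },
    {
      "fingerprint": {
        "fields": ["messageId"],
        "target_field": "_id"
      }
    }
  ]
}
```

Because the `csv` processor runs before `fingerprint` inside the same pipeline, `messageId` exists by the time the fingerprint is computed, and writing it to `_id` makes re-ingested duplicates overwrite each other instead of creating new documents.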

Or move the csv parsing over to Filebeat with decode_csv_fields and extract_array. Then apply fingerprint afterward.
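A sketch of the Filebeat-side variant, assuming `messageId` is the first column of the CSV line (adjust the `extract_array` index to match your actual column order):

```yaml
processors:
  # Parse the raw CSV line into an array of strings.
  - decode_csv_fields:
      fields:
        message: decoded_csv
      separator: ","
  # Pull individual columns out of the array by index.
  - extract_array:
      field: decoded_csv
      mappings:
        messageId: 0   # assumption: messageId is the first column
  # Now the field exists, so the fingerprint can be computed.
  - fingerprint:
      fields: ["messageId"]
      target_field: "@metadata._id"
```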


Ahh, thanks @andrewkroh. I was wondering how to use decode_csv_fields since it just returns an array, but then you use extract_array to get to the actual field. Good to know!


Indeed, using the fingerprint processor on the ES ingest node itself will definitely do the trick, thanks for the tip!
I think it would cost less than running a processor in Filebeat.