The ingest pipeline name is also declared in the yml config file.
For info, my CSV files do not have a header row.
Now when I launch the ingest I get the following in the logs:
ERROR [publisher] pipeline/client.go:106 Failed to publish event: failed to compute fingerprint: failed to find field [messageId] in event: key not found
I guess this is because Filebeat doesn't know which CSV column corresponds to the messageId field.
I can't find how to declare my CSV structure in the filebeat.yml file. Should I just add a fields config listing my CSV columns in the right order? Would Filebeat understand the column order from that? I'm not sure what to do here.
If you are doing the csv parsing in Ingest Node then those fields will not exist when Filebeat is processing the event. You could use the Elasticsearch fingerprint processor instead.
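Something along these lines (an untested sketch) should work: an ingest pipeline that parses the line with the csv processor first and only then computes the fingerprint, so messageId exists by the time the fingerprint processor runs. The pipeline name my-csv-pipeline and the second column name someOtherColumn are placeholders, and the fingerprint ingest processor needs a reasonably recent Elasticsearch version.

```
PUT _ingest/pipeline/my-csv-pipeline
{
  "processors": [
    {
      "csv": {
        "field": "message",
        "target_fields": ["messageId", "someOtherColumn"],
        "separator": ","
      }
    },
    {
      "fingerprint": {
        "fields": ["messageId"],
        "target_field": "fingerprint"
      }
    }
  ]
}
```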
Or move the csv parsing over to Filebeat with decode_csv_fields and extract_array. Then apply fingerprint afterward.
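In filebeat.yml that could look roughly like this (again a sketch, not tested): the file path, the temporary csv_columns field name, the column indices in the mappings, and someOtherColumn are all assumptions you would adjust to your actual files, and you would keep whatever target_field your fingerprint processor already uses.

```yaml
filebeat.inputs:
  - type: log
    paths:
      - /path/to/your/*.csv        # adjust to your actual CSV files

processors:
  # 1) Split the raw line into an array of column values.
  - decode_csv_fields:
      fields:
        message: csv_columns       # arbitrary temporary field name
      separator: ","

  # 2) Map array positions to named fields. The indices here are a guess;
  #    use whatever position messageId really has in your files.
  - extract_array:
      field: csv_columns
      mappings:
        messageId: 0
        someOtherColumn: 1

  # 3) messageId now exists on the event, so fingerprint can find it.
  - fingerprint:
      fields: ["messageId"]
```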
Ahh thanks @andrewkroh! I was wondering how to use decode_csv_fields, since it only returns an array; now I see you then use extract_array to get at the actual fields. Good to know!
Indeed, running the fingerprint processor on the Elasticsearch side will definitely work and do the trick, thanks for the tip!
I think it will also cost less than running the processor in Filebeat.