I am setting up a log file processor using the ELK stack. I need a unique ID that I can use as the document_id. The app access log can contain duplicate lines, so the file name plus the line number would be unique and could serve as the document_id. How do I get the file name and the line number of the file being processed in Logstash?
You know that ES will assign a unique document ID automatically if you don't supply one, right?
The path field contains the path to the input file, but there's no field for the line number or the file offset. If all you need is something unique and ES's automatic assignment of document IDs won't do, I suggest you look at the fingerprint filter.
I am using that, but if another line is identical then that line is filtered out, which is not what I want because it makes it look as if fewer requests were processed. I suppose there's no other way if there's no line number; I will have to preprocess the logs to add the line numbers.
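A minimal sketch of that preprocessing step in Python, assuming each log line is paired with an ID derived from the file path and its 1-based line number. The function name number_lines and the sample path are hypothetical, and SHA-1 is used here only to get a compact, stable ID:

```python
import hashlib
from typing import Iterable, Iterator, Tuple


def number_lines(path: str, lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (doc_id, line) pairs, where doc_id depends on path and line number only."""
    for line_no, line in enumerate(lines, start=1):
        # Identical line contents still get different IDs because line_no differs.
        doc_id = hashlib.sha1(f"{path}:{line_no}".encode("utf-8")).hexdigest()
        yield doc_id, line.rstrip("\n")


if __name__ == "__main__":
    # Hypothetical sample data: two identical requests in the same file.
    sample = ["GET / 200\n", "GET / 200\n"]
    for doc_id, line in number_lines("/var/log/app/access.log", sample):
        print(doc_id, line)
```

Because the ID depends only on the path and the line number, reprocessing the same file produces the same IDs again, so re-indexing overwrites documents instead of duplicating them.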
Unfortunately, the Logstash file input plugin does not provide a line number or file offset for each event. Filebeat, however, does report the file offset of each line, which makes it possible to distinguish between identical lines in the file.
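To illustrate why path plus offset is enough, here is a rough Python sketch that derives an ID from the file path and the byte offset at which each line starts. It is not Filebeat's implementation: the function name ids_from_offsets is hypothetical, and using the start-of-line offset is an assumption made for the example. Check your Filebeat version for the exact field names it ships.

```python
import hashlib
import io
from typing import Iterator, Tuple


def ids_from_offsets(path: str, data: bytes) -> Iterator[Tuple[str, int, bytes]]:
    """Yield (doc_id, offset, line) for each line in data, tracking byte offsets."""
    offset = 0
    for raw in io.BytesIO(data):  # iterates line by line, keeping the trailing b"\n"
        # Assumption: the ID is a hash of "path:start_offset".
        doc_id = hashlib.sha1(f"{path}:{offset}".encode("utf-8")).hexdigest()
        yield doc_id, offset, raw.rstrip(b"\r\n")
        offset += len(raw)  # the next line starts right after this one


if __name__ == "__main__":
    # Hypothetical sample data: the first two lines are identical.
    data = b"GET / 200\nGET / 200\nGET /x 404\n"
    for doc_id, offset, line in ids_from_offsets("/var/log/app/access.log", data):
        print(offset, doc_id, line.decode("utf-8"))
```

Two identical lines sit at different offsets, so they hash to different IDs. The same idea carries over to Logstash by fingerprinting the path and offset fields rather than the message.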