Hi,
I'm trying to parse CSV files with Logstash. Some of the files have a column called time, while others have two columns called time_from and time_to (and all of them have several other columns besides). Ideally I'd like to have a single pipeline that parses all of these files and autogenerates the field names from each file's header. I've tried both the CSV filter and the CSV codec and found the following behaviors:
With the CSV filter, if Logstash stops in the middle of a file and is restarted, it treats the first unread line of the file as the header row and uses its values as the field names. That's obviously not the behavior I want.
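For reference, this is roughly the filter setup I've been testing (the paths are just placeholders for my actual input files and sincedb location):

```
input {
  file {
    path => "/data/csv/*.csv"          # placeholder for my CSV directory
    start_position => "beginning"
    sincedb_path => "/tmp/csv.sincedb" # placeholder
  }
}

filter {
  csv {
    autodetect_column_names => true  # take field names from the first line seen
    skip_header => true              # drop the header line itself
  }
}
```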
The behavior of the CSV codec is problematic in other ways: it takes the field names from the header of the first file only, and then generates a document for every subsequent line from all the files, including the header lines of the other files. This is also not what I want.
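The codec variant I tried looks roughly like this (again, the path is just a placeholder):

```
input {
  file {
    path => "/data/csv/*.csv"          # placeholder for my CSV directory
    start_position => "beginning"
    codec => csv {
      autodetect_column_names => true  # only the very first line ever seen becomes the header
    }
  }
}
```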
What's the best way to solve this? Am I using the codec or the filter plugin incorrectly, or is there another option I could try?
Thanks