I've configured Logstash to read multiple CSV files, each with a different set of columns (different column names). It reads the first one successfully, but it never reads the column names for any CSV after the first one!
autodetect_column_names uses the very first event that logstash sees to determine the column names and never changes them. You could count columns or else do a conditional based on the filename.
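As a rough sketch of the column-counting idea (not a tested pipeline): a ruby filter can store a naive column count in @metadata, and a conditional can then pick the csv filter with the matching column list. The column names and counts below are made up for illustration, and counting commas will miscount any line that has a quoted comma inside a field.

```
filter {
  # Naive column count: number of commas + 1 (assumes no quoted commas in the data)
  ruby {
    code => 'event.set("[@metadata][column_count]", event.get("message").to_s.count(",") + 1)'
  }

  if [@metadata][column_count] == 3 {
    # Hypothetical three-column format
    csv { columns => ["id", "name", "email"] }
  } else if [@metadata][column_count] == 5 {
    # Hypothetical five-column format
    csv { columns => ["id", "date", "product", "quantity", "price"] }
  }
}
```

This only helps when the formats differ in their number of columns. If two formats have the same count, the condition has to be based on something else, such as the file name.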
Thanks for the reply, but what would I do with the number of fields, or with a conditional on the file name in the path field?
Sorry, I didn't catch your drift!
The listener might receive files at any time, with the same number of columns but different names, so I can't use a hard-coded if/else condition like the one in the link you put in your reply.
If you want to ingest files with different formats (different sets of columns) the processing has to be conditional on which format a given event has. If you do not want to use if/else then ingest files one at a time.
Can you give me an example of this part? The files won't be ingested at the same time: I might have one file to ingest now and three files after X minutes, and so on.
That's the post I linked to. In that case the conditional was based on the number of columns, but it could also be based on a regexp match to the filename.
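For the filename variant, something along these lines might work. It is only a sketch: the file name patterns and column names are invented for the example, and it assumes the file input puts the file name in the path field (with ECS compatibility enabled it would be [log][file][path] instead).

```
filter {
  # Pick the column list from the name of the file the event came from
  if [path] =~ /orders/ {
    # Hypothetical columns for files whose name contains "orders"
    csv { columns => ["order_id", "customer", "amount"] }
  } else if [path] =~ /users/ {
    # Hypothetical columns for files whose name contains "users"
    csv { columns => ["user_id", "name", "email"] }
  } else {
    # Unknown format: tag the event so it can be found and inspected later
    mutate { add_tag => ["unknown_csv_format"] }
  }
}
```

Since the choice is made per event, it does not matter when each file arrives or how many files arrive together, but every format still needs its own branch.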
If you want to use autodetect_column_names then you would have to run logstash one time for each file, which is going to be very expensive for small files (it takes logstash about 45 seconds of CPU to start up on my system).