I've been trying to figure out how to solve this: I have an Amazon S3 bucket with a lot of files and file types, and I only care about the XML files. I can't figure out how to have Logstash process only the XML files. It keeps trying to process every file, and because of that the images keep causing charset errors.
prefix can't work, since the only thing the files have in common is the .xml suffix. I tried using exclude_pattern, but looking at the logs I didn't see it output anything to stdout. Am I right in thinking exclude_pattern applies to the filename? It says "key" in the docs and I wasn't sure if that was the same thing.
I'm thinking the issue might be the regex is wrong as well. I tried yours and it isn't processing anything either.
Isn't matching file types / extensions something that most people would need to do with Logstash? I feel like it would be very useful to have a built in way to say "only process files with this or that type" on the level of a codec or something.
It's not an unreasonable request; feel free to file a GitHub issue.
Looks like the site you used uses JavaScript regex, but the S3 input plugin requires Ruby regex. So complicated, and still not working. It's odd that I can't find anyone else who has had this issue.
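For anyone testing patterns outside Logstash, here is a minimal sketch in plain Ruby of one way to express "exclude everything that is not .xml". It assumes a negative lookahead anchored with `\A` and `\z`, which are Ruby's whole-string anchors (JavaScript regex has no equivalent, and Ruby's `^` and `$` match at line boundaries). The constant, the helper name, and the sample keys are made up for illustration; this only checks the regex itself and does not run the S3 input plugin.

```ruby
# Assumed pattern: matches (and would therefore exclude) any key
# that does NOT end in ".xml". Case-sensitive as written.
EXCLUDE_NON_XML = '\A(?!.*\.xml\z)'

# Hypothetical helper: returns the keys the exclude pattern leaves alone.
def keys_to_process(keys, exclude_pattern = EXCLUDE_NON_XML)
  regex = Regexp.new(exclude_pattern)
  # =~ returns a match position (truthy, even 0) or nil, so reject drops matches.
  keys.reject { |key| key =~ regex }
end

# Made-up sample keys, for illustration only.
sample_keys = ["data/feed.xml", "images/logo.png", "data/feed.xml.bak", "notes.txt"]

puts keys_to_process(sample_keys).inspect
```

If the pattern behaves as expected here, the same string is what would go into exclude_pattern, though quoting and escaping inside the Logstash config file may need adjusting.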
Also, I am somewhat confused about how the exclude_pattern regex field is supposed to work. For example, if I have a regex that matches part of the filename but not the entire thing, will that file be excluded? Or does it have to match the entire filename to take effect?
I don't think there are any implicit anchors, i.e. if the given expression matches anywhere in the filename string, that file is excluded. So partial match, if you will.
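To make the partial-match behaviour concrete, here is a small Ruby sketch. It assumes the check is an unanchored `=~` against the key, as described above; `excluded?` is a hypothetical helper written for this illustration, not the plugin's actual code, and the key is a made-up example.

```ruby
# Hypothetical helper mirroring an unanchored check against the object key.
def excluded?(key, exclude_pattern)
  # =~ returns nil when nothing matches, otherwise the match position.
  !(key =~ Regexp.new(exclude_pattern)).nil?
end

key = "logs/2016/report.xml"  # made-up key for illustration

# An unanchored pattern only needs to match somewhere in the key.
puts excluded?(key, 'report')

# With explicit anchors, the pattern has to cover the whole key.
puts excluded?(key, '\Areport\z')

# Anchoring just the end is enough to test a suffix.
puts excluded?(key, '\.xml\z')
```

Under that assumption, a pattern has to carry its own `\A` / `\z` anchors whenever a whole-key or suffix-only match is what you want.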