I have a few questions about the filestream input and how it differs from the log input.
Do I only need an id when using multiple filebeat inputs in a single yml, or always? Currently I'm not using any ids, but I'm curious whether I may run into trouble.
We recommend always using an id, but current releases will assign a default for single inputs.
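As a minimal sketch of what that looks like with two inputs (the ids and paths below are placeholders, not taken from your config):

```yaml
filebeat.inputs:
  - type: filestream
    id: my-application-logs        # placeholder; must be unique per filestream input
    paths:
      - /var/log/my-application/*.log   # placeholder path
  - type: filestream
    id: other-service-logs         # placeholder; a second, distinct id
    paths:
      - /var/log/other-service/*.log    # placeholder path
```

Filebeat uses the id to keep track of each input's file state, so it is best to pick stable ids and not rename them later.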
Yes, but prospector and scanner should each be followed by a colon (:).
In filestream this parameter is a list, so you should instead use exclude_files: ["my-application[1-2]{1}.log", "some-other-pattern..."]. The exclusions are regular expressions, and any file that matches one of them will not be ingested.
(see previous example)
No. If you switch to the filestream input, then any instances of the old scan_frequency parameter should be replaced with prospector.scanner.check_interval.
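Putting the exclusion list and the scan interval together, a filestream input could look roughly like this. The id, path and the 10s value are illustrative assumptions; the exclude pattern is the one from the answer above:

```yaml
filebeat.inputs:
  - type: filestream
    id: my-application-logs             # placeholder id
    paths:
      - /var/log/my-application/*       # placeholder path
    prospector:
      scanner:
        # replaces scan_frequency from the log input; 10s is only an example value
        check_interval: 10s
        # a list of regular expressions; matching files are skipped
        exclude_files: ["my-application[1-2]{1}.log"]
```

Note that the dot in the pattern is unescaped, so it matches any character, and the pattern is not anchored, so it can match anywhere in the file path.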
paths specifies where the input should look for possible files. If you want to ingest all files matching those paths, then there's no need to do anything else. If you want to only ingest some of those files, then adding a regular expression to include_files will only ingest files that are in one of the configured paths and match the given regular expression.
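For example, a sketch that looks at everything in one directory but only ingests files ending in .log (the id, path and pattern are assumptions for illustration):

```yaml
filebeat.inputs:
  - type: filestream
    id: my-application-logs          # placeholder id
    paths:
      - /var/log/my-application/*    # placeholder; candidate files come from here
    prospector:
      scanner:
        # only candidates matching one of these regular expressions are ingested
        include_files: ['\.log$']
```

Leaving include_files out entirely would ingest every file matched by paths.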
Files named xyz.log are collected, but xyz.json isn't. When removing the exclude_files, everything including xyz.json is collected. I could just specify the path like *.log and *.json, but I really would like to know what's going on with the exclude_files?