Usage of filestream

Hello,

I have a few questions about the filestream input and its differences from the log input.

  1. Do I only need an id when using multiple Filebeat inputs in a single .yml, or always? Currently I'm not using any ids, but I'm curious whether I may run into trouble.

  2. Can prospector options be nested, e.g. like this:

prospector
  scanner
    exclude_files: ...
    check_interval: ...
  3. How does exclude_files work? In the migrating-to-filestream docs (Step 2: Exclude all processed files | Filebeat Reference [8.5] | Elastic) they have:
  paths:
    - /var/log/my-application*.json
  prospector.scanner.exclude_files: my-application[1-2]{1}.log

Does this mean that my-application*.log is excluded from the path /var/log/, or where does the exclusion happen?

  4. How are multiple excluded files separated? I'd assume it's ['file1_pattern', 'file2_pattern']?

  5. I'm using scan_frequency with type filestream, so according to the renaming table (Step 3: Use new option names | Filebeat Reference [8.5] | Elastic) it shouldn't work. Do both options still work anyway, or do I have to rename the option in every .yml?

  6. What is the difference between e.g. paths: /var/log/*.log and include_files: /var/log/*.log? When would I use one over the other?

Thanks in advance,
Ossenfeld

Anyone on question 1 at least? :slight_smile:

@faec Could you help with these questions please? Thanks!!

Might it help if I open a thread for every question?

  1. We recommend always using an id, but current releases will assign a default for single inputs.
  2. Yes, but prospector and scanner should be followed by a colon :
  3. In filestream this parameter is a list, so you should instead use exclude_files: ["my-application[1-2]{1}.log", "some-other-pattern..."]. The exclusions are regular expressions, and any file that matches such a regular expression will not be ingested.
  4. (see previous example)
  5. No, if you switch to the filestream input then any instances of the old scan_frequency parameter should be replaced with prospector.scanner.check_interval
  6. paths specifies where the input should look for possible files. If you want to ingest all files matching those paths, then there's no need to do anything else. If you want to only ingest some of those files, then adding a regular expression to include_files will only ingest files that are in one of the configured paths and match the given regular expression.
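Putting the answers above together, a filestream input using the new option names might look like the sketch below. The id, paths, and patterns here are illustrative placeholders, not recommendations:

```yaml
- type: filestream
  id: my-filestream-id                    # answer 1: always set an explicit id
  paths:
    - /var/log/my-app/*                   # answer 6: where the input looks for files
  prospector:                             # answer 2: nested options, each level followed by a colon
    scanner:
      check_interval: 10s                 # answer 5: replaces the old scan_frequency
      exclude_files: ['\.gz$', '\.tmp$']  # answers 3/4: a list of regular expressions
      include_files: ['\.log$']           # answer 6: of the files found under paths, ingest only matches
```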

exclude_files seems to be buggy, I think. I used the following two Filebeat configurations:

# vim: ft=yaml

- type: filestream
  id: json-collector
  paths:
    - /var/log/parser-testing/*
  fields:
    parser.test: "json_only"
  fields_under_root: true
  ignore_older: 30m
  close.on_state_change.inactive: 5m
  prospector:
    scanner:
      check_interval: 1s
      exclude_files: [".*.log"]
  parsers:
    - ndjson:
        keys_under_root: true
        expand_keys: true
        add_error_key: true

and

---
# vim: ft=yaml

- type: filestream
  id: log-collector
  paths:
    - /var/log/parser-testing/*
  fields:
    parser.test: "json_excluded"
  fields_under_root: true
  ignore_older: 30m
  close.on_state_change.inactive: 5m
  prospector:
    scanner:
      check_interval: 1s
      exclude_files: [".*.json"]
  parsers:
    - multiline:
        type: pattern
        pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
        match: after
        negate: true

Files named xyz.log are collected, but xyz.json isn't. When I remove exclude_files, everything, including xyz.json, is collected. I could just specify the paths as *.log and *.json, but I'd really like to know what's going on with exclude_files.
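One detail worth ruling out (just an observation, not a confirmed cause of the behavior above): exclude_files patterns are regular expressions, so an unescaped `.` matches any character, and an unanchored pattern matches anywhere in the name. A stricter form would escape the dot and anchor the pattern, e.g.:

```yaml
exclude_files: ['\.log$']   # literal dot, anchored to the end of the file name
```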

Any ideas?

Maybe one last bump :smiley: