Filebeat: recursively fetch all files in all subdirectories of a directory


(Stecino) #1

I thought I read somewhere that Filebeat 5.0.0 alpha had addressed this, but now that I have set it up, it turns out it doesn't.


(Stecino) #2

Since I only needed to go one directory down, appending /*/*.log did the trick.
But it would be nice for future Filebeat releases to have recursive support. The Splunk forwarder, for example, handles this well :)


(ruflin) #3

I remember there was some discussion about this feature, but I don't think it is stated anywhere that it is part of 5.0. We currently rely on the glob implementation from Golang, which has the above limitation. A recursive approach can have the effect that a very large number of files are crawled by one prospector; splitting it up into multiple prospectors gives us the ability to somehow bring this under control (no implementation here yet either).
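
To make the limitation concrete, here is a minimal sketch of how Go's `filepath.Glob` treats `*` and `**`. It only exercises the standard library, not Filebeat itself; the file names and the `globRel`/`demoGlob` helpers are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
)

// Patterns tried against the demo tree, written with forward slashes.
var demoPatterns = []string{"*.log", "*/*.log", "**/*.log", "*/*/*.log"}

// globRel runs filepath.Glob for pattern under root and returns the
// matches relative to root, slash-separated and sorted.
func globRel(root, pattern string) ([]string, error) {
	matches, err := filepath.Glob(filepath.Join(root, pattern))
	if err != nil {
		return nil, err
	}
	rel := make([]string, 0, len(matches))
	for _, m := range matches {
		r, err := filepath.Rel(root, m)
		if err != nil {
			return nil, err
		}
		rel = append(rel, filepath.ToSlash(r))
	}
	sort.Strings(rel)
	return rel, nil
}

// demoGlob builds a throwaway tree (hypothetical file names) in a temp
// directory and returns the matches for each pattern in demoPatterns.
func demoGlob() (map[string][]string, error) {
	root, err := os.MkdirTemp("", "globdemo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)

	for _, f := range []string{"top.log", "a/one.log", "a/b/two.log"} {
		p := filepath.Join(root, filepath.FromSlash(f))
		if err := os.MkdirAll(filepath.Dir(p), 0o755); err != nil {
			return nil, err
		}
		if err := os.WriteFile(p, []byte("x\n"), 0o644); err != nil {
			return nil, err
		}
	}

	results := map[string][]string{}
	for _, pat := range demoPatterns {
		m, err := globRel(root, filepath.FromSlash(pat))
		if err != nil {
			return nil, err
		}
		results[pat] = m
	}
	return results, nil
}

func main() {
	results, err := demoGlob()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, pat := range demoPatterns {
		fmt.Printf("%-10s -> %v\n", pat, results[pat])
	}
}
```

In this sketch `**/*.log` returns the same files as `*/*.log`: each `*` stops at a path separator, so `**` behaves like a single `*` and never descends more than one level. A file two levels down is only reached by spelling out both levels, as in `*/*/*.log`.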

Could you share some more details on the structure of your log files and why you would need the recursive file fetching? If adding a feature, it is always good to understand the use case.


(Amit Bhatnagar) #4

Hi Ruflin,
Greetings of the day...

I am Amit, and I am currently doing a POC with Filebeat and the ELK stack to explore its features and make sure it suits our requirements before we implement it with our products.

I have the same requirement that Stecino described above. I understand that the current version of Filebeat does not support this feature (fetching all files recursively from all sub-folders of a folder), but is there any workaround available right now? If so, please let us know.

I will try to explain this in as much detail as I can...

We have more than 100 web-based applications (the number keeps growing), which generate two types of logs: 'IIS logs' and 'application logs'. These applications are hosted in more than one environment, such as PROD, DEV, SIT and UAT. The predefined paths for both kinds of logs in each environment are:

E:/LogMonitor/[Environment]/Application/[APP_NAME_YYMMDD.txt
E:/LogMonitor/[Environment]/IIS/W3SVC[APP_Pool_ID].log // APP_Pool_ID is dynamic and is generated when a new application is hosted in IIS.

We ensure that our system never generates duplicate files in the two locations mentioned above, so I want to define the prospectors in the Filebeat yml file like below:

E:/LogMonitor//.txt
E:/LogMonitor//.log

Note: I tried this with a single prospector and with two prospectors, but it did not work. When I then searched the documentation, I learned that this feature is not supported yet.

I want to achieve this for two reasons:

  1. I don't want to go to my production support team every time to define a new prospector for a newly hosted application. In our case that consumes plenty of time, because approvals are needed from multiple channels.
  2. There is a chance that I will have to host applications in a new pre-production environment, e.g. a new OAT environment. In that case a prospector would have to be added manually.

Please suggest.

BR//
Amit


(Monica Sarbu) #5

If you know all the available environments ([Environment]), a solution I can think of is to create a prospector for each environment. Please correct me if I didn't understand correctly.

1st prospector:

  • E:/LogMonitor/PROD/Application/*.txt
  • E:/LogMonitor/PROD/IIS/*.log

2nd prospector:

  • E:/LogMonitor/DEV/Application/*.txt
  • E:/LogMonitor/DEV/IIS/*.log

3rd prospector:

  • E:/LogMonitor/SIT/Application/*.txt
  • E:/LogMonitor/SIT/IIS/*.log

4th prospector:

  • E:/LogMonitor/UAT/Application/*.txt
  • E:/LogMonitor/UAT/IIS/*.log

(Amit Bhatnagar) #6

Hi Monica,

Thanks for your input, but this solution does not suit my requirements. Each IIS folder actually has more than 100 sub-folders, and the number keeps increasing, because every time we host a new application it creates a new folder under the IIS folder. This is true for all environments.

This is why I am looking for a workaround that can read all log files from all sub-folders with a single prospector.

In my case it should be like below:

D:/LogFiles/Dev/IIS//.log
Or
D:/LogFiles//.log. (Preferred option)

Many Thanks.

Amit.


(ruflin) #7

I haven't tried it, but I actually thought the following should work for you:

E:/LogMonitor/*/Application/*.txt
E:/LogMonitor/*/IIS/W3SVC*.log

What is not supported is defining E:/LogMonitor/** and having it crawl all subdirectories.
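
As a rough illustration of why the two wildcard patterns above need no changes when a new environment or app pool appears, the sketch below checks some sample paths against them. It uses Go's `path.Match` as a stand-in so the forward-slash paths behave the same on any OS; it is not Filebeat's own code. The sample file names (`OAT`, `Billing_160801.txt`, `W3SVC12.log`, the `sub` folder) and the `covered` helper are hypothetical.

```go
package main

import (
	"fmt"
	"path"
)

// patterns are the two wildcard paths suggested above.
var patterns = []string{
	"E:/LogMonitor/*/Application/*.txt",
	"E:/LogMonitor/*/IIS/W3SVC*.log",
}

// covered reports whether any pattern matches the slash-separated file path.
// Each "*" matches within a single path segment only.
func covered(pats []string, file string) (bool, error) {
	for _, p := range pats {
		ok, err := path.Match(p, file)
		if err != nil {
			return false, err
		}
		if ok {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	samples := []string{
		"E:/LogMonitor/PROD/IIS/W3SVC12.log",               // existing environment
		"E:/LogMonitor/OAT/Application/Billing_160801.txt", // hypothetical new environment
		"E:/LogMonitor/PROD/IIS/sub/W3SVC12.log",           // one level deeper
	}
	for _, s := range samples {
		ok, err := covered(patterns, s)
		if err != nil {
			fmt.Println("bad pattern:", err)
			return
		}
		fmt.Printf("%-52s covered=%v\n", s, ok)
	}
}
```

The first two samples are covered because `*` stands in for the environment name and for the varying part of the file name. The third is not, since no `*` crosses a `/`; a deeper level needs its own `*` segment in the pattern.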


(Amit Bhatnagar) #8

Many thanks, Ruflin. The wildcard solution works for me. I was not aware of this awesome workaround.

Thanks a ton again.


(sruthi) #9

Hi all,

I am using Filebeat to read logs from my PC and send them to Logstash.
My folders and sub-folders are nested in no fixed way. There is no specific structure to the sub-directories that contain the logs.
Can you suggest a way to use Filebeat to handle this situation?

Folder1\Folder2\log1.txt
Folder1\Folder2\Folder3\Folder4\log1.txt

So I cannot explicitly specify the pattern of sub-folders. Note also that every folder here contains logs that have to be read, in addition to those in its sub-folders. Filebeat fails to do that using **.

Any help is appreciated.


(ruflin) #10

This is currently not supported. Have a look at https://github.com/elastic/beats/issues/2084


(sruthi) #11

Any workaround to get this done?? Or is there any other option or tool?


(ruflin) #12

Perhaps you can tackle this from the other side. Why does your application create log files in random folder structures?


(sruthi) #13

There are different ways to pull the logs from the application,
so it creates different file structures.
We cannot handle this from the file structure side. Is there any other option that can help us?
Is this even going to be considered as a requirement for a future Filebeat release?
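
One possible workaround sketch for a layout like the one above, done outside Filebeat: walk the tree once and emit a single-level glob pattern for every directory that directly contains a matching file. The resulting lines could then be pasted into a prospector's paths list. This is not a Filebeat feature; the `patternsFor` and `demoPatternList` helpers are hypothetical, and the list would have to be regenerated whenever new directories appear.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
)

// patternsFor walks root and returns one pattern (directory + filePattern)
// for each directory that directly contains a file matching filePattern.
func patternsFor(root, filePattern string) ([]string, error) {
	seen := map[string]bool{}
	err := filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			return nil
		}
		ok, err := filepath.Match(filePattern, d.Name())
		if err != nil {
			return err
		}
		if ok {
			seen[filepath.Join(filepath.Dir(p), filePattern)] = true
		}
		return nil
	})
	if err != nil {
		return nil, err
	}
	out := make([]string, 0, len(seen))
	for k := range seen {
		out = append(out, k)
	}
	sort.Strings(out)
	return out, nil
}

// demoPatternList recreates the two example paths in a temp directory and
// returns the generated patterns relative to it, slash-separated.
func demoPatternList() ([]string, error) {
	root, err := os.MkdirTemp("", "walkdemo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)

	for _, f := range []string{
		"Folder1/Folder2/log1.txt",
		"Folder1/Folder2/Folder3/Folder4/log1.txt",
	} {
		p := filepath.Join(root, filepath.FromSlash(f))
		if err := os.MkdirAll(filepath.Dir(p), 0o755); err != nil {
			return nil, err
		}
		if err := os.WriteFile(p, []byte("x\n"), 0o644); err != nil {
			return nil, err
		}
	}

	pats, err := patternsFor(root, "*.txt")
	if err != nil {
		return nil, err
	}
	rel := make([]string, 0, len(pats))
	for _, p := range pats {
		r, err := filepath.Rel(root, p)
		if err != nil {
			return nil, err
		}
		rel = append(rel, filepath.ToSlash(r))
	}
	return rel, nil
}

func main() {
	pats, err := demoPatternList()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, p := range pats {
		fmt.Println(p)
	}
}
```

Emitting one pattern per directory keeps every pattern within what a non-recursive glob can match, at the cost of an extra generation step.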

