How Filebeat reads the content of a file

sixsenseninja · January 28, 2022, 9:27am

Hi all,

We export a CSV with a cron job and put it in a folder. Filebeat reads and processes the file so it can be visualized in Kibana. (Filebeat and the folder are on the same host.)

The CSV is very simple: it has two lines, a header line and a data line.

The first time the file is exported, Filebeat processes it and we can see the data in Discover. For subsequent jobs, however, Filebeat seems to ignore it, although the data inside the CSV changes every 15 minutes or so.

So I have been testing the following two actions, both of which triggered Filebeat to process the CSV:

Editing the content (for example, adding a number) and saving it under the same name.
Copying the file and renaming the copy.

As a temporary workaround, the export now saves the file under a name containing the current date and time. This triggers Filebeat to process it because it sees a new file.
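The timestamped-name workaround described above could be sketched roughly as follows. This is only an illustration: the function name `export_with_timestamp`, the file name pattern, and the `.tmp` suffix are assumptions, not anything taken from the actual cron job. It also assumes the Filebeat path pattern matches `*.csv` only, so the temporary file is not picked up while it is still being written.

```python
import os
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional


def export_with_timestamp(src: Path, watch_dir: Path,
                          now: Optional[datetime] = None) -> Path:
    """Copy src into watch_dir under a unique, timestamped name (hypothetical helper)."""
    now = now or datetime.now()
    # e.g. report.csv -> report-20220128-092700.csv
    target = watch_dir / f"{src.stem}-{now:%Y%m%d-%H%M%S}{src.suffix}"
    # Copy under a temporary name first, then rename, so a reader
    # never sees a half-written file under the final name.
    tmp = target.with_name(target.name + ".tmp")
    shutil.copyfile(src, tmp)
    os.replace(tmp, target)
    return target
```

Because every export gets a new name, each file looks new to the reader. The trade-off is that old files accumulate in the folder and need to be cleaned up separately.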

On the other hand, the same Filebeat also reads JSON files from another directory, and there it does recognize file changes and processes the JSON file. The only difference between the two is that the CSV is always about 4 KB in size, while the JSON size varies from a few MB to a GB.

This leads to a question: under what conditions does Filebeat process a file? Initially I thought it was based on the file's timestamp, but that does not seem to be the case; the size might affect how Filebeat processes the file.

marc.guasch · January 28, 2022, 11:03am

Hello @sixsenseninja !

Would it be possible to check your Filebeat configuration? Also, are the mechanisms used to update the JSON and CSV files the same? Do they get appended to, truncated, ...?

Some more information will help to figure out what might be going on.

Thanks!

sixsenseninja · January 28, 2022, 2:44pm

The Filebeat YAML configuration is identical for both (only the paths differ for JSON and CSV).

As for the mechanism, both the JSON and the CSV are exported from a remote server to these folders (a CSV folder and a JSON folder).

Both use a similar technique. The data is processed on the remote machine, and the output is sent to the Filebeat server using scp. The existing file is overwritten by the new file every 15 minutes. The JSON file might have 10 or more lines depending on the volume of data, but the CSV always has two lines (a header line and a data line).

sixsenseninja · February 9, 2022, 2:53am

I managed to solve this issue by resetting the state in data/registry and having Filebeat scan the file again.
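One plausible explanation for the behaviour in this thread, offered as an assumption rather than a confirmed diagnosis: Filebeat remembers a read offset per file in its registry, and a file overwritten in place with the same size looks like a file with nothing new in it. The toy model below is not Filebeat's code; `FileState` and `bytes_to_read` are hypothetical names used only to illustrate offset-based tracking, and it assumes the overwrite keeps the same file identity.

```python
from dataclasses import dataclass


@dataclass
class FileState:
    """Hypothetical stand-in for one registry entry: where reading stopped."""
    offset: int = 0


def bytes_to_read(state: FileState, current_size: int) -> int:
    """Return how many bytes a simplified reader would pick up on this scan."""
    if current_size < state.offset:
        # File shrank: treat it as truncated and start again from the beginning.
        state.offset = 0
    unread = current_size - state.offset
    state.offset = current_size
    return unread


state = FileState()
first = bytes_to_read(state, 4096)    # first export: everything is read
second = bytes_to_read(state, 4096)   # overwritten, same size: nothing looks new
state = FileState()                   # resetting the stored state
third = bytes_to_read(state, 4096)    # the file is read from the start again
```

Under this model, a JSON file whose size changes on every export is noticed, while a CSV that stays the same size is not. That would be consistent with both the renaming workaround and the registry reset helping.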

system · March 9, 2022, 4:53am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
