How Filebeat reads the content of a file

Hi all,

We are exporting a CSV using a cron job and putting it inside a folder. Filebeat then reads and processes the file for visualization in Kibana. (Filebeat and the folder are on the same host.)

The CSV has very simple content: just two lines, a header line and a second line with the data.

The first time the file is exported, Filebeat processes it and we can see the data in the Discover dashboard. However, for subsequent jobs, Filebeat seems to 'ignore' it, although the data inside the CSV keeps changing every 15 minutes or so.

So I have been testing the following; both triggered Filebeat to process the CSV:

  1. Editing the content (e.g. adding some number) and saving it under the same name, and
  2. Copying the file and renaming it to a different name.

So as a temporary fix for now, before the export, the file is saved under the current date and time and then exported. This triggers Filebeat to process it because it sees a new file.
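
For illustration, the temporary fix looks roughly like this (paths and filenames are placeholders, not the real ones from my setup):

```bash
#!/usr/bin/env bash
# Sketch of the timestamped-export workaround: every run writes a new
# filename, so Filebeat sees a brand-new file and processes it.
SRC=/tmp/export.csv    # file produced by the export job (placeholder)
DEST=/data/csv         # directory watched by Filebeat (placeholder)

cp "$SRC" "$DEST/export_$(date +%Y%m%d_%H%M%S).csv"
```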

On the other hand, the same Filebeat also reads another directory containing JSON files, and there it does recognize the file changes and processes the JSON file. The only difference between the CSV and the JSON is that the CSV is constantly 4 KB in size, while the JSON size is always random: it can be anywhere from a few MB to a GB.

This brings up a question: in what situations will Filebeat process the file? Initially I thought it was based on the file's timestamp, but that seems not to be the case; the size might affect how Filebeat processes the file.

Hello @sixsenseninja!

Would it be possible to check your Filebeat configuration? Also, are the mechanisms used to update the JSON and CSV files the same? Do they get appended to, truncated, ...?
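
To clarify what I mean by those terms, here are hypothetical shell commands (the paths are placeholders, not yours):

```bash
# "Appended to": new lines are added at the end; the file grows and the
# inode stays the same.
echo "new,data" >> /path/to/file.csv

# "Truncated": the file is emptied and rewritten from offset 0, normally
# still under the same inode.
echo "new,data" > /path/to/file.csv

# "Replaced": a new file is written elsewhere and moved over the old
# one, which gives the watched path a brand-new inode.
echo "new,data" > /tmp/file.tmp && mv /tmp/file.tmp /path/to/file.csv
```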

Some more information will help to figure out what might be going on.

Thanks!

From the Filebeat YAML configuration, the two are identical (just that the paths are different for JSON and CSV).

As for the mechanism, both the JSON and the CSV are exported from a remote server to these folders (the csv and json folders).

Both use a similar technique: the data is processed on the remote PC, and the output is sent to the Filebeat server using scp. The existing file is overwritten by the new file every 15 minutes. The JSON file might have 10 or more lines depending on the volume of data, but the CSV always has two lines (a header line and a data line).
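
For anyone hitting the same thing: scp rewrites an existing destination file in place, so its inode, which Filebeat's log input uses by default to identify files, stays the same. Since this CSV is always the same ~4 KB, the rewrite can leave both the inode and the size unchanged, which plausibly makes Filebeat think it has already read the whole file. A hypothetical way to check this on the Filebeat host (placeholder path):

```bash
# With GNU stat, %i prints the inode number and %s the size in bytes.
stat -c 'inode=%i size=%s' /data/csv/export.csv   # before the 15-minute job
# ... remote job runs: scp export.csv filebeat-host:/data/csv/export.csv
stat -c 'inode=%i size=%s' /data/csv/export.csv   # after: same inode, same ~4k size
```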

I managed to solve this issue by resetting the state in data/registry and asking Filebeat to scan the files again.
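
In case it helps others, the reset was roughly along these lines, assuming a package (deb/rpm) install where Filebeat keeps its state under /var/lib/filebeat; adjust the path for your install:

```bash
# Stop Filebeat, wipe its registry (read offsets), and restart.
# Note: this makes Filebeat re-ingest every watched file from the start,
# so expect duplicate events downstream.
sudo systemctl stop filebeat
sudo rm -rf /var/lib/filebeat/registry
sudo systemctl start filebeat
```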
