New Feature: reading chunks of data in harvester

r-ms · January 6, 2018, 5:14pm

Task:
I want to use Filebeat to read and process log files that consist of fixed-size structures. In other words, each consecutive 120 bytes in such a file represents a new chunk of data.

I want to read these chunks and slice them into fields using processors.

Idea:
I want to develop a new reader, Chunk, and add it to the harvester's chain of readers:

limit -> (multiline -> timeout) -> strip_newline -> json -> encode -> (line XOR chunk) -> log_file

This reader will yield new chunks of fixed size and forward them to further steps.

What do you think of this idea? Is it the right approach to the initial task? Does anyone else need the ability to read fixed-size structures from log files?

steffens · January 8, 2018, 4:43pm

Which service are you trying to monitor?
Is this 'fixed' chunk all ASCII, or is there some binary in there as well?

I was hoping to - one day - make the reader chain configurable. We don't want full parsing support, but chunking and different line splitting/multiline strategies could be implemented and reused in filebeat modules more easily.

r-ms · January 8, 2018, 5:59pm

Which service are you trying to monitor?

These are SAP Security Audit files.

Is this 'fixed' chunk all ASCII, or is there some binary in there as well?

It's a UTF-16-encoded file. Every 200 characters in that file represent one structure, the first 20 bytes of which look like this:

As you can see, this is text, but not ASCII.

Do you think it's better not to reinvent the wheel and just wait for this feature?

steffens · January 9, 2018, 2:08pm

TBH, I don't think we're going to work on this feature anytime soon.

It's only the second time I've ever seen this use case.

Contributions are very welcome.

Are these files being rotated? I wonder if it would make sense to define a special prospector type instead of modifying the reader chain.

system · January 27, 2018, 5:14pm

This topic was automatically closed after 21 days. New replies are no longer allowed.
