Combining Multiple Records in Logstash

I am working on ingesting pricing logs. A sample of the data I am ingesting looks like this:

D|535|00846U101|256||100|0.0001 1 0.01
B|1|1523318400.0108|62.28|1|72.21|1|0|k||200000005449000
B|2|1523318400.0245|62.28|1|||0|k||200000016741000
R|3|1523318400.1268|53.69|1|t|65.69|2|t|200000118254000
B|4|1523318400.151|53.69|1|t|||t|200000141347000
B|5|1523358000.0683|49.72|2|k|68|1|k|70000043528000
D|838|1523358000.9202|47.85|1|75|1|0|p||70000881083000
B|1|1523318400.0108|62.28|1|72.21|1|0|k||200000005449000
R|2|1523318400.0245|62.28|1|||0|k||200000016741000
R|3|1523318400.1268|53.69|1|t|65.69|2|t|200000118254000
B|4|1523318400.151|53.69|1|t|||t|200000141347000
D|338|1523358000.9202|47.85|1|75|1|0|p||70000881083000
R|1|1523318400.0245|62.28|1|||0|k||200000016741000
R|2|1523318400.1268|53.69|1|t|65.69|2|t|200000118254000
B|3|1523318400.151|53.69|1|t|||t|200000141347000

The first column identifies the record type, and the 'D' lines provide more information about the records that follow. The second column is the sequence number, unless the first column is 'D', in which case it is a sort of identifier for that 'D' record. For records with 'R' or 'B' in the first column, the sequence number ascends until the next 'D' record appears. That 'D' essentially marks the start of a new log, and the sequence restarts for the 'R' and 'B' records after it.

What I need is to be able to attach (tag) the information from the 'D' record to all of the 'R' and 'B' logs below that 'D' until the next 'D,' then repeat this pattern for the entire set of data.

I can give more context if necessary. Thanks!

You might be able to do that using an aggregate filter, but I would go with a straight ruby filter.

    if [message] =~ /^D/ {
        ruby { code => '@@lastDRecord = event.get("message")' }
    } else {
        ruby { code => 'event.set("lastDRecord", @@lastDRecord)' }
    }

Note that you must set '--pipeline.workers 1' for this to work, since the class variable only tracks the right D record when a single worker processes the events in order. Also, you cannot set '--experimental-java-execution' since that will re-order the records and associate the wrong D record with each line. Is that a bug? I do not know.
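To make the carry-forward logic easier to follow outside Logstash, here is a minimal plain-Ruby sketch of what the two ruby filters do together. The helper name tag_with_last_d and the use of hashes in place of Logstash events are assumptions for illustration only; the local variable last_d stands in for @@lastDRecord.

```ruby
# Plain-Ruby stand-in for the two ruby filters above; no Logstash required.
# tag_with_last_d is a hypothetical helper name, not part of Logstash.
def tag_with_last_d(lines)
  last_d = nil                       # plays the role of @@lastDRecord
  lines.map do |line|
    if line.start_with?("D")
      last_d = line                  # remember the most recent D record
      { "message" => line }
    else
      # every non-D record gets a copy of the D record seen before it
      { "message" => line, "lastDRecord" => last_d }
    end
  end
end

sample = [
  "D|535|00846U101|256||100|0.0001 1 0.01",
  "B|1|1523318400.0108|62.28|1|72.21|1|0|k||200000005449000",
  "D|838|1523358000.9202|47.85|1|75|1|0|p||70000881083000",
  "R|3|1523318400.1268|53.69|1|t|65.69|2|t|200000118254000"
]

tagged = tag_with_last_d(sample)
tagged.each { |event| puts event.inspect }
```

The sketch only gives the right answer because the lines are visited strictly in file order, which is the same reason the filter needs a single pipeline worker.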

Will try this.

When I ran this in my config, it ingested, but instead of every record containing info from the respective 'D', each record was tagged with "_rubyexception".

Furthermore, this message appeared in repetition until the data began to go through:

[2018-07-31T16:05:13,637][ERROR][logstash.filters.ruby ] Ruby exception occurred: uninitialized class variable @@lastDRecord in LogStash::Filters::Ruby

How should I proceed? Thanks

I think I've got it to work. However, I would like it to add the columns from the previous D to the records below (not just the message).

Any suggestions?
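One possible approach, sketched here in plain Ruby, is to split the saved D record on the pipe character and store each column under its own name. The helper d_columns and the "d_colN" field names are placeholders I made up, since the thread never says what the D columns mean.

```ruby
# Hypothetical helper: turn a D record into generic column fields.
# The "d_colN" names are placeholders; the real column names are unknown.
def d_columns(d_message)
  fields = {}
  # limit -1 keeps empty columns, including trailing ones
  d_message.split("|", -1).each_with_index do |value, i|
    next if i.zero?                  # skip the leading "D" type marker
    fields["d_col#{i}"] = value
  end
  fields
end

cols = d_columns("D|535|00846U101|256||100|0.0001 1 0.01")
cols.each { |name, value| puts "#{name} = #{value.inspect}" }
```

Inside the ruby filter, the same split could presumably be inlined in the code string for the non-D branch, with each pair passed to event.set(name, value) instead of being collected in a hash.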

Is the first line of the file a D record?

You can suppress the exception by testing whether @@lastDRecord ever got set:

    ruby { code => 'if defined? @@lastDRecord; event.set("lastDRecord", @@lastDRecord); end' }
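For anyone unsure why the guard helps, this small plain-Ruby sketch shows how defined? behaves with a class variable. The Tagger class and its method names are hypothetical stand-ins; the separate Tagger instances are meant to resemble the separate ruby filter instances, which, judging by the error message, share the class LogStash::Filters::Ruby.

```ruby
# Hypothetical stand-in for the ruby filter class; not Logstash code.
class Tagger
  # Store the latest D record in a class variable shared by all instances.
  def remember(line)
    @@last_d = line
  end

  # Copy the D record onto the event only if one has been seen so far.
  def tag(event)
    event["lastDRecord"] = @@last_d if defined?(@@last_d)
    event
  end
end

# Before any D record is seen, the guard skips the assignment
# instead of raising "uninitialized class variable".
before = Tagger.new.tag({ "message" => "B|1" })

# A different instance sets the class variable...
Tagger.new.remember("D|535")

# ...and yet another instance can now read it.
after = Tagger.new.tag({ "message" => "B|2" })

puts before.inspect
puts after.inspect
```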

If you are getting that exception literally on every line then it suggests that this is not matching:

    if [message] =~ /^D/ {

Is your data in a field called message?
