Routing one input to different indexes

Hello all,

I don't know if this can be achieved with Logstash. I am just starting with Logstash.

This is the structure of each file that comes in the folder every day:

DATE: FEB 10, 2020

******* section 1 ********

COLUMN_1	COLUMN_2	COLUMN_3
12		mar 20 		3421			
15		ene 20 		1200			
40		mar 20 		2102
			
******* section 2 ********

COLUMN_1	COLUMN_2	COLUMN_3
17		ene 20 		3421			
18		feb 20 		1200			
20		mar 20 		2102

TOTAL              107,68  .00  7,830  68,123
SUBTOTAL           40,321

I need to route different parts (matches) of the same file to different indexes (csv files).

output 1 (routing events/lines to first_index.csv)

@timestamp = Extract the date of the file (top date in the content)
TOTAL = 107,68
SUBTOTAL = 40,321

output 2 (routing events/lines to second_index.csv)

@timestamp = Extract the date of the file (top date in the content)
COLUMN_1 = 17
COLUMN_2 = ene 20
COLUMN_3 = 3421

@timestamp = Extract the date of the file (top date in the content)
COLUMN_1 = 18
COLUMN_2 = feb 20
COLUMN_3 = 1200

How can I accomplish this?

Best regards

I've just rewritten my question. I hope it is clearer now.

Best regards.

I would ingest the entire file as a single event and then process it two different ways. With any recent release of Logstash you would do that using multiple pipelines and a forked-path pattern. However, I will use the old-school method.
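To get the whole file as one event, one option is a file input with a multiline codec that appends every line to the previous event until a new file's DATE header appears. This is a minimal sketch; the path is hypothetical and the `auto_flush_interval` just forces the last event out after the file ends:

```
input {
  file {
    path => "/var/reports/*.txt"    # hypothetical location of the daily files
    mode => "read"
    codec => multiline {
      # Any line that does NOT start with "DATE:" belongs to the previous event
      pattern => "^DATE:"
      negate => true
      what => "previous"
      auto_flush_interval => 2
    }
  }
}
```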

Note the use of literal newlines in patterns.

grok { match => { "message" => "^DATE: %{DATA:[@metadata][date]}
" } }
date { match => [ "[@metadata][date]", "MMM dd, yyyy" ] }
clone { clones => [ "section2" ] }
if [type] != "section2" {
    grok {
        match => { "message" => [ "TOTAL\s+(?<total>[0-9,]+)", "SUBTOTAL\s+(?<subtotal>[0-9,]+)" ] }
        break_on_match => false
        remove_field => [ "message" ]
    }
} else {
    # The (?m) enables multiline matches for the regexp
    mutate { gsub => [ "message", "(?m)^TOTAL.*", "", "message", "(?m).*section 2 \*+\n", "" ] }
    mutate { split => { "message" => "
" } }
    ruby {
        code => '
            events = []
            msg = event.get("message")
            msg.each_index { |x|
                if x == 0
                    @column_names = msg[x].split(/\t/)
                else
                    columns = msg[x].split(/\t\s+/)
                    events << { @column_names[0] => columns[0],
                                @column_names[1] => columns[1],
                                @column_names[2] => columns[2] }
                end
            }
        event.set("events", events)
        '
        remove_field => [ "message" ]
    }
    split { field => "events" }
}
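The ruby filter's table parsing can be exercised outside Logstash. A standalone sketch of the same logic, using sample rows from section 2 above (note the split on `/\t\s+/` may leave trailing spaces in some values):

```ruby
# First line holds the column names, remaining lines hold values,
# mirroring what the ruby filter does with the split "message" array.
lines = [
  "COLUMN_1\tCOLUMN_2\tCOLUMN_3",
  "17\t\tene 20 \t\t3421",
  "18\t\tfeb 20 \t\t1200"
]

column_names = lines.first.split(/\t/)
events = lines.drop(1).map do |line|
  columns = line.split(/\t\s+/)
  column_names.each_with_index.map { |name, i| [name, columns[i]] }.to_h
end
```

Each entry of `events` is a hash keyed by the header names, which is what the split filter then turns into individual events.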

You would use the same "if [type] !=" to decide which output to write to.
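That output routing might look like the following sketch (the file paths and the field lists passed to the csv output are assumptions based on the examples above):

```
output {
  if [type] != "section2" {
    csv {
      path => "/tmp/first_index.csv"      # hypothetical path
      fields => [ "total", "subtotal" ]
    }
  } else {
    csv {
      path => "/tmp/second_index.csv"     # hypothetical path
      fields => [ "[events][COLUMN_1]", "[events][COLUMN_2]", "[events][COLUMN_3]" ]
    }
  }
}
```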

If you want the columns at the top level then see this.
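One way to promote them (a sketch of the general idea, not the linked post): after the split filter, copy each sub-field of [events] to the top level and drop the wrapper.

```
ruby {
  code => '
    # Copy every key under [events] to the top level of the event
    event.get("events").each { |k, v| event.set(k, v) }
  '
  remove_field => [ "events" ]
}
```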

Thank you so much Badger; it's a bit complex for a newbie.

Do you need to define [type] in input?

clone { clones => [ "section2" ] }
if [type] != "section2" {

The use of [type] and clones this way is a bit confusing for me. What exactly is it doing?

Best regards

The clone filter creates one or more clones of an event. In this example it creates one clone and sets the type to "section2".

If you used

clone { clones => [ "section1", "section2" ] }

you would get 3 events, and the clone filter would set [type] to "section1" on one of them, and "section2" on the other. The clone filter would not set [type] on the third one.
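The three-way version could then be routed with two conditionals, for example:

```
clone { clones => [ "section1", "section2" ] }

if [type] == "section1" {
  # filters for the first clone
} else if [type] == "section2" {
  # filters for the second clone
} else {
  # the original event, on which [type] is unset
}
```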

Thank you so much Badger. So you apply filters based on the clones' [type] labels and route each labeled event through the corresponding output index.

This line is not mapping the date into @timestamp:

grok { match => { "message" => "^DATE: %{DATA:[@metadata][date]}" } }

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.