How to split an event into multiple events?

I'm trying to parse ZooKeeper wchc output using Logstash:

0x167892507c74d32
/zookeeper/cluster_name/stores/31/alive
/zookeeper/cluster_name/stores/24/alive
/zookeeper/cluster_name/stores/32/alive

For each session ID (0x167892507c74d32), multiple watches (/zookeeper/cluster_name/stores/xx/xxxx) are listed below it, each starting with a few spaces. I want to split this single event into multiple events, each with the following fields:

session_id="0x167892507c74d32" watch="/zookeeper/cluster_name/stores/24/alive"
session_id="0x167892507c74d32" watch="/zookeeper/cluster_name/stores/31/alive"
session_id="0x167892507c74d32" watch="/zookeeper/cluster_name/stores/32/alive"

I'm aware there is a "split" plugin in Logstash which could help, but I don't know how to use it for this. Could anyone please offer a simple example? Any suggestion would be greatly appreciated!

I don't think split is the filter you should be looking to use here.
You may have to use an aggregate or multiline plugin instead, one that uses the session ID as the start event; you can then set a timeout for the last event.
The end result would be a single event, but it would contain the session ID and all of the corresponding watches for that session ID.
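
The multiline approach suggested above could look roughly like the following. This is a minimal sketch, assuming the wchc output is read from stdin, that every watch line begins with whitespace, and that session ID lines do not. The stdin input and the auto_flush_interval value are illustrative assumptions, not something taken from this thread.

    input {
      stdin {
        codec => multiline {
          # Lines that start with whitespace are watch paths...
          pattern => "^\s"
          # ...so append them to the preceding session ID line.
          what => "previous"
          # Assumed value: flush the last pending event after 2 seconds without new lines.
          auto_flush_interval => 2
        }
      }
    }

With a codec like this, each session ID and its watches should arrive in the filter stage as one event whose message field contains the lines joined by "\n".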

Thank you for your reply!
After some digging in the documentation, I finally found a solution. Here is how I did it:

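    grok {
      # Capture the first line as session_id and the indented lines after it as watch.
      match => {
        "message" => "%{DATA:session_id}\n(?<watch>^(\s+.*)+$)"
      }
    }
    mutate {
      # Replace newlines with commas first, then remove all remaining whitespace.
      gsub  => [
        "watch", "\n", ",",
        "watch", "\s", ""
      ]
      # Turn the comma-separated string into an array.
      split => [ "watch", "," ]
    }
    split {
      # Emit one event per element of the watch array.
      field => "watch"
    }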

After the grok filter, the event looks like this:

session_id=0x167892507c74d32
watch="/zookeeper/cluster_name/stores/31/alive     \n/zookeeper/cluster_name/stores/24/alive     \n/zookeeper/cluster_name/stores/32/alive"

Then I used mutate's gsub to replace "\n" with "," (in preparation for the later mutate split) and to remove all whitespace from the "watch" field. After the mutate split, the event looks like this:

session_id=0x167892507c74d32
watch=["/zookeeper/cluster_name/stores/31/alive","/zookeeper/cluster_name/stores/24/alive","/zookeeper/cluster_name/stores/32/alive"]

Now I can use the split filter plugin to split this single event into multiple events:

event1:
session_id="0x167892507c74d32" watch="/zookeeper/cluster_name/stores/24/alive"

event2:
session_id="0x167892507c74d32" watch="/zookeeper/cluster_name/stores/31/alive"

event3:
session_id="0x167892507c74d32" watch="/zookeeper/cluster_name/stores/32/alive"
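
To check the result locally, one option is to print each event after the split filter. This is only a sketch: the stdout output with the rubydebug codec is an assumption for debugging purposes and is not part of the pipeline described above.

    output {
      # Print every event in a readable form, one block per event produced by split.
      stdout { codec => rubydebug }
    }

Each printed event should carry the same session_id and a single watch value.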