Create arrays based on certain text in a field

kopacko · August 28, 2020, 6:57pm

I am parsing chat messages and I want to create three arrays for tag cloud purposes.

One for normal chat words, one for user tags, and one for chat emotes.

For example, take this chat message:
WE LOVE YOU @CHANDLER, YOUR GREAT AT EVERYTHING!! <3 <3 VirtualHug VirtualHug :) :WE LOVE YOU CHANDLER, YOUR GREAT AT EVERYTHING!! [emote=<3] <3 [emote=VirtualHug] VirtualHug [emote=:)] :)

Based on this message, I'd like to pull out the following:

user tag(s) : @CHANDLER
emotes : [emote=<3], [emote=VirtualHug], [emote=:)]

I have tried various examples I found on these forums, but none of them work and they always bring all of Logstash down.

Badger · August 28, 2020, 7:29pm

I explained how to do that in a reply to another of your posts. Did you have an issue with that solution? If so, what is the issue?

kopacko · August 28, 2020, 7:47pm

Can you better explain how to use this code?
    ruby { code => 'event.set("matches", event.get("message").scan(/@\w+/))' }

I am reading a field called channel_message. I want to create the fields tag_cloud_emote, tag_cloud_chat, and tag_cloud_tags.

Badger · August 28, 2020, 8:41pm

If you have two things you want to scan for I would slightly change that code:

    ruby {
        code => '
            msg = event.get("channel_message")
            if msg
                event.set("tag_cloud_tags", msg.scan(/@\w+/))
                event.set("tag_cloud_emote", msg.scan(/\[emote=[^]]*\]/))
            end
        '
    }

which will get you

    "tag_cloud_tags" => [
        [0] "@CHANDLER"
    ],
    "tag_cloud_emote" => [
        [0] "[emote=<3]",
        [1] "[emote=VirtualHug]",
        [2] "[emote=:)]"
    ],

Just insert that ruby filter into the filter section of your configuration file.
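If it helps to see what the two scan calls do before putting them into Logstash, here is a minimal sketch in plain Ruby that can be run on its own. The message string is a shortened, made-up variant of the one in the first post, and the variable names are only for illustration. The emote pattern escapes the `]` inside the character class, which should match the same text as `[^]]`.

```ruby
# Hypothetical sample message, shortened from the one in the first post.
msg = "WE LOVE YOU @CHANDLER!! [emote=<3] <3 [emote=VirtualHug] VirtualHug [emote=:)] :)"

# String#scan returns an array holding every match of the pattern.
tags   = msg.scan(/@\w+/)              # user tags such as @CHANDLER
emotes = msg.scan(/\[emote=[^\]]*\]/)  # every [emote=...] token

p tags
p emotes
```

Inside the ruby filter, `msg` comes from `event.get(...)` and the arrays are written back with `event.set(...)`, as in the filter above.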

kopacko · August 28, 2020, 8:57pm

Thank you sir.

How do I account for just the chat words (that are not tags or emotes) ?

Badger · August 28, 2020, 9:07pm

You could try something like

    mutate { add_field => { "tag_cloud_chat" => "%{channel_message}" } }
    mutate {
        gsub => [
            "tag_cloud_chat", "\[emote=[^]]*\]", "",
            "tag_cloud_chat", "@\w+", "",
            "tag_cloud_chat", "  ", " "
        ]
    }

Removing the extra spaces with a third gsub is just easier than trying to add spaces to the other gsubs and handling corner cases where those spaces do not exist.
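For comparison, the same idea can be sketched in plain Ruby. This is not the mutate approach above, just an illustration of the steps: strip the emotes, strip the tags, then split on whitespace. `chat_words` is a hypothetical helper name, and `split(" ")` is swapped in for the third gsub because it already ignores runs of spaces.

```ruby
# Hypothetical helper: returns the words of a message that are
# neither [emote=...] tokens nor @user tags.
def chat_words(msg)
  msg.gsub(/\[emote=[^\]]*\]/, "")  # drop emote tokens
     .gsub(/@\w+/, "")              # drop user tags
     .split(" ")                    # split on whitespace, ignoring runs of spaces
end

words = chat_words("WE LOVE YOU @CHANDLER, YOUR GREAT AT EVERYTHING!! [emote=<3] [emote=VirtualHug]")
p words
```

Note that the comma that followed the tag is left behind as its own token, so punctuation may need a further cleanup step depending on what the tag cloud should show.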

kopacko · September 3, 2020, 3:08pm

Badger, this appears to be working great.

How can I, while in the Ruby filter, remove the [emote=(capture_group)] from around the capture group?

kopacko · September 3, 2020, 3:34pm

Also, here is the current set of code I am using in Logstash. I had some field changes per other app requirements.

    ##########
    # PARSE TAG CLOUD [CHAT]
    ##########
    if [tag][cloud][chat] =~ /^.+$/ {
      mutate { gsub => [ "[tag][cloud][chat]", "(\[emote=\S+\])", "" ] }
      mutate { gsub => [ "[tag][cloud][chat]", "(@\S+)", "" ] }
      mutate { split => { "[tag][cloud][chat]" => " " } }
    }

    ##########
    # PARSE TAG CLOUD [EMOTE]
    ##########
    if [channel][msg][text] =~ /(?i)\[emote=\S+\]/ {
      ruby {
        code => '
          msg = event.get("[channel][msg][text]")
          if msg
            event.set("[tag][cloud][emote]", msg.scan(/\[emote=[^]]*\]/))
          end
        '
      }
      if [tag][cloud][emote] =~ /(?i)\[emote=\S+\]/ {
        mutate { gsub => [ "[tag][cloud][emote]", "\[emote=(\S+)\]", "\1" ] }
      }
    }

    ##########
    # PARSE TAG CLOUD [TAG]
    ##########
    if [channel][msg][text] =~ /(?i)@\w+/ {
      ruby {
        code => '
          msg = event.get("[channel][msg][text]")
          if msg
            event.set("[tag][cloud][tag]", msg.scan(/@\w+/))
          end
        '
      }
    }

What are your thoughts on this code?

Badger · September 3, 2020, 3:46pm

Change the second scan to be

    event.set("tag_cloud_emote", msg.scan(/\[emote=([^]]*)\]/).flatten)
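The reason for the `.flatten` is that `scan` behaves differently once the pattern contains a capture group: it returns one inner array per match, holding the captured text, instead of the whole matched string. A small stand-alone sketch, using a made-up message and variable names:

```ruby
# Hypothetical message containing two emote tokens.
msg = "[emote=<3] <3 [emote=VirtualHug] VirtualHug"

# No capture group: each element is the whole match.
whole = msg.scan(/\[emote=[^\]]*\]/)

# One capture group: each element is an array of that match's captures.
grouped = msg.scan(/\[emote=([^\]]*)\]/)

# flatten turns the array of one-element arrays into a plain array of strings.
names = grouped.flatten

p whole
p grouped
p names
```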

kopacko · September 3, 2020, 3:54pm

Thank you sir.

Just noticed something a little odd with the output in a Kibana visualization.

Tag             Count 
@TimTheTatman   255
@Asmongold      214
@timthetatman   105
@DrLupo         84
@NICKMERCS      75
@nickmercs      52
@drlupo         50

I guess I need to have the ruby filter force all matches to lowercase? If so, how?

Badger · September 3, 2020, 4:09pm

You can use mutate+lowercase to do it. If a field is an array it will iterate over the members.
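If you would rather do it inside the ruby filter, as asked, one possible alternative to the mutate option is to downcase each match right after the scan. This is only a sketch in plain Ruby with a made-up message; in the filter, the resulting array would be passed to `event.set`.

```ruby
# Hypothetical message with the same user tagged in different cases.
msg = "gg @TimTheTatman and @timthetatman and @NICKMERCS"

# Downcase every match so differently cased tags count as the same term.
tags = msg.scan(/@\w+/).map(&:downcase)

p tags
```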

kopacko · September 3, 2020, 4:29pm

That did the trick. Thank you again sir!

system · October 1, 2020, 4:29pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
