Logstash - Extracting substring from CSV column

AshishC · December 19, 2017, 4:21pm

Hi,

I have a CSV file with a column called "threadName" whose value varies from record to record. For example:

CME_MOC 15-1
CME_MOC 15-2
CME_MOC 15-3
PME_MOC 15-1
KME_MOC 15-2

I am sending the CSV records to Elasticsearch using the Logstash conf below:

csv {
  separator => ","
  columns => ["time", "elapsed", "threadName", "success", "IdleTime", "Connect"]
}

But I want to extract the "threadName" column and send only the substring below:

CME_MOC
CME_MOC
CME_MOC
PME_MOC
KME_MOC

Do I need to add a new field and use grok? How can I achieve this?

Many Thanks in advance
Ashish

Badger · December 19, 2017, 4:49pm

You could grok or dissect.

dissect {
      mapping => { "threadName" => "%{part1} %{part}" }
}

AshishC · December 19, 2017, 5:02pm

Thanks Badger. Quick question: where should I place the dissect filter?

filter {
  if ([message] =~ "responseCode") {
    drop { }
  } else {
    dissect {
      mapping => { "threadName" => "%{part1} %{part}" }
    }
    csv {
      separator => ","
      columns => ["time", "elapsed", "label", "responseCode", "responseMessage", "threadName",
                  "success", "bytes", "sentBytes", "grpThreads", "allThreads", "Latency",
                  "SampleCount", "ErrorCount", "Hostname", "IdleTime", "Connect"]
    }
  }
}

Badger · December 19, 2017, 6:34pm

The dissect{} has to come after the csv{}, otherwise the threadName field does not exist. Filters are executed in the order listed in the configuration.
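
A rough sketch of that ordering, reusing the column list from the config above. The target names part1 and part2 are arbitrary placeholders, not anything Logstash requires:

```
filter {
  if [message] =~ "responseCode" {
    # Skip the CSV header row
    drop { }
  } else {
    # 1. Parse the line first, so that the threadName field exists
    csv {
      separator => ","
      columns => ["time", "elapsed", "label", "responseCode", "responseMessage", "threadName",
                  "success", "bytes", "sentBytes", "grpThreads", "allThreads", "Latency",
                  "SampleCount", "ErrorCount", "Hostname", "IdleTime", "Connect"]
    }
    # 2. Then split threadName at the first space
    dissect {
      mapping => { "threadName" => "%{part1} %{part2}" }
    }
  }
}
```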

AshishC · December 19, 2017, 6:47pm

Hi Badger, thanks again. For now I am using the grok below:

grok {
  match => ["threadName", "%{USERNAME}"]
}

I will explore dissect further, but could you please have a quick glance and tell me whether the block below does the same thing as the grok?

dissect {
mapping => { "threadName" => "%{part1}" }
}

Thanks for your help today

AshishC · December 19, 2017, 6:57pm

I guess dissect will create a new field, whereas grok keeps the same field with the newly extracted value.

For example:
grok {
  match => ["threadName", "%{USERNAME}"]
}

Here the threadName field will have the new value, i.e. CME_MOC.

dissect {
mapping => { "threadName" => "%{part1}" }
}

But here CME_MOC will be stored in a new field named part1.

Am I right here?

Badger · December 19, 2017, 7:45pm

Don't guess, test it. Run Logstash with a config like this and then type something like "CMS_MOD 15-3" into stdin.

input { stdin {} }
output { stdout { codec => rubydebug } }

filter {
 # So we can inject stuff like "PME_MOC 15-1" on stdin instead of needing a csv
 mutate { "add_field" => { "threadName" => "%{message}" } }

 # Split into 2 fields with space as separator
 dissect { mapping => { "threadName" => "%{part1} %{part2}" } }

 # No separator, so it grabs the whole thing
 dissect { mapping => { "threadName" => "%{part3}" } }

 # Match the first [a-zA-Z0-9._-]+ in the field and throw it away
 grok { match => ["threadName", "%{USERNAME}"] }

 # Match the first [a-zA-Z0-9._-]+ in the field and put it in the username field
 grok { match => ["threadName", "%{USERNAME:username}"] }

 # Match the first [a-zA-Z0-9._-]+ in the field, anchored to optimize performance 
 grok { match => ["threadName", "^%{USERNAME:username2}"] }
}

If you save that as /tmp/test.conf then you can probably run logstash using

/usr/share/logstash/bin/logstash -f /tmp/test.conf --path.settings=/etc/logstash --path.data=/tmp
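
If the aim is for threadName itself to end up holding only the prefix, one possible approach is grok's overwrite option. This is a minimal sketch rather than a tested config, and it assumes the prefix never contains a space, so that USERNAME matches all of it:

```
filter {
  grok {
    # Capture the leading [a-zA-Z0-9._-]+ run into threadName itself
    match => { "threadName" => "^%{USERNAME:threadName}" }
    # Without overwrite, grok would append the capture to the existing
    # value instead of replacing it
    overwrite => ["threadName"]
  }
}
```

It can be dropped into the stdin test config above in place of the other grok filters to see what comes out.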

AshishC · December 19, 2017, 7:51pm

Sure Badger, you were very helpful. I really appreciate your time and the information you shared.
