I am now using logstash but looks like it is consuming high CPU usage.
Below is part of lines from my config.
input {
file {
path => ["/var/log/logstash/data/test/test1/*/always/*"]
start_position => "beginning"
codec => multiline {
pattern => "----------------------\+|Present Rot \[360deg\]|Total Rot \[360deg\] "
negate => false
what => previous
}
ignore_older => 31104000
tags => ["arm"]
}
}
output {
if "RIGHT HAND" in [tags] or "LEFT HAND" in [tags] {
elasticsearch {
hosts => ["127.0.0.1"]
index => "test-%{+YYYY.MM.dd}"
document_type => "arm"
}
stdout { codec => rubydebug }
}
}
filter {
# Set the date
if [message] =~ "\*\*\* \d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} \*\*\*" {
grok {
match => {
"message" => "\*\*\* (?<date>\d{4}-\d{2}-\d{2}) \d{2}:\d{2}:\d{2} \*\*\*"
}
}
ruby {
init => "
@@date = ''
"
code => "
@@date = event.get('date')
"
}
}
#########################################################
# Filter stanza for arm parsing
#########################################################
if "arm" in [tags] and "multiline" in [tags] {
# Set tags for right or left arm
if [message] =~ ".*RIGHT HAND.*" {
mutate {
add_tag => [ "RIGHT HAND" ]
}
} else if [message] =~ ".*LEFT HAND.*" {
mutate {
add_tag => [ "LEFT HAND" ]
}
}
if "RIGHT HAND" in [tags] or "LEFT HAND" in [tags] {
# Parsing comes first
grok {
# Parse the parts values
match => {
"message" => ".*\[(?<logged_time>\d{2}:\d{2}:\d{2})\.\d{6}\]\s+Total\s+Rot\s+\[360deg\]\s+\|\s+(?<f11_val>\d+)\|\s+(?<f12_val>\d+)\|\s+(?<f13_val>\d+)\|\s+(?<f21_val>\d+)\|\s+(?<f22_val>\d+)\|\s+(?<f23_val>\d+)\|\s+(?<f31_val>\d+)\|\s+(?<f32_val>\d+)\|\s+(?<f33_val>\d+)\|\s+(?<f41_val>\d+)\|\s+(?<f42_val>\d+)\|\s+(?<f51_val>\d+)\|\s+(?<f52_val>\d+).*"
}
# Create a new field to split
add_field => {
"arm_parts_and_val" => "f11_%{f11_val},f12_%{f12_val},f13_%{f13_val},f21_%{f21_val},f22_%{f22_val},f23_%{f23_val},f31_%{f31_val},f32_%{f32_val},f33_%{f33_val},f41_%{f41_val},f42_%{f42_val},f51_%{f51_val},f52_%{f52_val}"
}
# Remove unnecessary fields
remove_field => [
"message" ,
"f11_val" ,
"f12_val" ,
"f13_val" ,
"f21_val" ,
"f22_val" ,
"f23_val" ,
"f31_val" ,
"f32_val" ,
"f33_val" ,
"f41_val" ,
"f42_val" ,
"f51_val" ,
"f52_val"
]
}
# Next split the event
split {
# Tell the split plugin for which field to split
field => "arm_parts_and_val"
# Define the terminator . Comma in this case
terminator => ","
# Define a new field to set the new values
target => "new_arm_parts_and_val"
}
# Grok again to split the "new_arm_parts_and_val"
grok {
# Split "new_arm_parts_and_val" into parts name and value
match => {
"new_arm_parts_and_val" => "(?<parts_name>\w\d+)_(?<total_rot>\d+)"
}
# Remove rest of fields
remove_field => [
"arm_parts_and_val",
"new_arm_parts_and_val"
]
}
I am considering to tune the config file but not sure where to start from.
Is there a know issue that any of the plugin consumes high cpu usage?
HI,
First I would recommend doing a bit of profiling, to do it you could take a few thread dumps and see where the cost is, one very good article on this tools is http://www.ibm.com/developerworks/library/j-5things8/ for example.
However with your config, If I had to have one suspicion, here for high cpu performance would it be the multiline codec, so that would be my first hypothesis.
When you get this data if you share we can sort of the offender and sort out the problem more in deep.
PD: Knowing how your multiline files looks like would it also be necessary and helpful.
Is your log with or without the timestamp part? I don't see this in the pattern, so might be one reason to it might get nuts, will debug a bit more and report here if I find the reason.
After we discarded all other situations, so is definitely an issue with the filters. Can you share here a few thread dumps and/or the jstack output. We're looking to get an impression which threads are running, so we can spot the filter causing this high cpu usage. I see you use a lot of grok who can be one of the reasons of this behaviour, but we'll need this information to move forward.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.