Hi All,
I am trying to figure out the best way to do event classification.
In my case, below is a sample event:
Jul 21 01:19:57.58 172.20.20.100 date=2016-07-20 time=20:12:25 timezone="UTC" device_name="X2123" device_id=C032302-2323 log_id=98393783 log_type="Event" log_component="GUI" log_subtype="Admin" status="Successful" priority=Notice user_name="admin" src_ip=172.31.1.1 dmsg="Appliance Access Settings were changed by 'admin' from '10.0.0.26' using 'GUI'"
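Before classification, the key=value pairs are parsed into fields and each event starts out with an "unclassified" tag. My exact upstream filter is not the issue here, but a simplified sketch of that step, roughly along the lines of the kv filter, would be:

kv {
  source => "message"
  add_tag => ["unclassified"]
}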
I would like to classify events based on the content of dmsg. I have almost 1000 events, each with unique content in dmsg, and I have written a grok pattern for each such unique message. Below is one example:
PID107 Appliance Access Settings were changed by %{QSSTRING} from %{QSSTRING} using %{QSSTRING}
Another example:
PID144 Service %{QSSTRING} was started by %{QSSTRING} from %{QSSTRING} using %{QSSTRING}
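QSSTRING is a custom pattern in my patterns directory for a single-quoted string; its exact definition is not the problem here, but it is roughly along these lines:

QSSTRING '[^']*'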
I have created a configuration file like this:
if "unclassified" in [tags] and [dmsg] {
grok { patterns_dir => ["/etc/logstash-indexer/patterns"]
match => { "dmsg" => "%{PID107}" }
add_field => { "eventid" => "PID107" }
remove_tag => "unclassified"
}
}
if "unclassified" in [tags] and [dmsg] {
grok { patterns_dir => ["/etc/logstash-indexer/patterns"]
match => { "dmsg" => "%{PID144}" }
add_field => { "eventid" => "PID144" }
remove_tag => "unclassified"
}
}
if [eventid] {
  translate {
    field => "eventid"
    destination => "eventtype"
    dictionary_path => "/data/eventtype.csv"
    fallback => "unknown"
  }
}
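The eventtype.csv dictionary simply maps each eventid to an event type, one pair per line; the type names below are just illustrative:

PID107,admin_settings_change
PID144,service_start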
The above works perfectly fine. I have added the remaining grok pattern match filters (close to 1000 of them), each wrapped in the same conditional, like:
grok {
  patterns_dir => ["/etc/logstash-indexer/patterns"]
  match => { "dmsg" => "%{PID1000}" }
  add_field => { "eventid" => "PID1000" }
  remove_tag => ["unclassified"]
}
I am stuck here!
Logstash takes 5-6 minutes to load the configuration and then fails with this error:
/opt/logstash/bin/logstash -f /etc/logstash-indexer/conf.d/ -t
java.lang.OutOfMemoryError: Java heap space
Dumping heap to /opt/logstash/heapdump.hprof ...
Unable to create /opt/logstash/heapdump.hprof: File exists
Error: Your application used more memory than the safety cap of 1G.
Specify -J-Xmx####m to increase it (#### = cap size in MB).
Specify -w for full OutOfMemoryError stack trace
I can increase the memory, but is this really the best way to classify 1000+ events? Any suggestions for a better approach?
Thank you.
Jay