Using a hash table with a filter plugin

Hi.

I have deployed a syslog plugin with a set of filters that tag each message with the vendor or device type based on regex patterns.
I am trying to avoid running the regexes for every message, so I am thinking of something like an in-memory hash table that stores an IP/hostname and the relevant tag. That way, each time a message is received, I first check the hash table, and only if the sender is not there do I go over the whole list of regexes. Once a match is found, I update the hash so that the next message from the same IP can be resolved with a simple hash lookup.

Does anyone know if this is possible, and how?

Sounds like the translate filter would be useful.

Magnus is dead on. I build a YAML file with a cron job that queries our inventory management system nightly. Within 3 minutes of the YAML file being updated, Logstash is dynamically aware of the new device. My config is:

  translate {
   field => "hostname"
   destination => "devtype"
   dictionary_path => "/opt/elk/logstash/translate/company.yaml"
  }
  if ("" in [devtype]) {
   grok {
    match => { "devtype" => "%{GREEDYDATA:devtype},%{GREEDYDATA:devloc}" }
    overwrite => [ "devtype" ]
   }
  } else {
   mutate {
    add_tag => "nodevtype"
   }
  }

cat /opt/elk/logstash/translate/company.yaml

"host-1": Cisco 3750,Office 1
"host-2": Cisco 3750,Office 2
"host-3": ASA Firewall,Office 3
"host-4": Wireless Controller,Office 1

So with this setup, if a device that is not in our inventory system sends us syslogs, it gets tagged "nodevtype" and I can see that in my Kibana dashboard.

HTH.

Hi.

Still on the same subject (syslog filter): I have a set of regexes per device model/vendor, and I am trying to think of a more efficient way to apply them. That is, have a hash with two fields: one is the device/vendor (Cisco PIX, Juniper, CheckPoint, Unix OS ...) and the second is the regex itself.
Now I want a loop to run over this hash and check each regex against a message. If there is a match, the message gets tagged with the first field's value.
Using a hash would let me control the order of the regexes very easily: more specific regexes and more common device types first.

Is this something for Ruby code?

A grok filter can include multiple expressions which will be tried in order (first match wins). However, this won't allow you to add a tag that indicates which expression matched. If you really need that you can use a series of grok filters, for example like this:

grok {
  ...
}
if "_grokparsefailure" in [tags] {
  grok {
    ...
  }
}
if "_grokparsefailure" in [tags] {
  grok {
    ...
  }
}
...

Thanks Magnus.

I was trying to avoid this structure of nested if/then/else and multiple groks. It makes it very difficult to change the order of the regex patterns compared to a hash table.
Seems like this is the only option.

I was trying to avoid this structure of nested if/then/else and multiple groks. It makes it very difficult to change the order of the regex patterns compared to a hash table.

You need to move around five lines instead of one line to change the order of matching—is that really "very difficult"?

Seems like this is the only option.

Yes, if you really need to set different tags depending on which expression matched.

Hi Magnus.

When we are dealing with dozens or maybe hundreds of grok filters (my case), it is very hard to maintain :(

Perhaps you should generate your configs then. Then you can store and maintain the expressions any way you like, and if you need to change how the filters look you can just change the script or template that produces the config.

I agree, but I am not sure how. I have to loop over the whole list of attribute-value pairs (vendor <> regex) until I find a match.
Is this a task for Ruby code, or can I do it using an existing filter?

No, generate a static configuration file based on configuration data in whichever representation you prefer. If you use a configuration management system like Chef, Ansible, or Puppet (and you should!) this should be trivial.
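
To make the "generate your config" idea concrete, here is a minimal sketch in Python of a script that turns an ordered list of (tag, grok expression) pairs into the chain of grok filters shown earlier in the thread. Everything in it is an assumption for illustration: the tags, the example expressions, the field names, and the function names are hypothetical. It also adds two things the thread did not spell out: an add_tag on each grok so you can see which expression matched, and a remove_tag of "_grokparsefailure" on the later groks so that the chain stops after the first match. For simplicity it refuses expressions that contain a double quote instead of trying to escape them.

```python
"""Generate a chain of Logstash grok filters from an ordered pattern list."""

# Hypothetical (tag, grok expression) pairs. List order is the match order,
# so reordering the matching means moving one line here.
VENDOR_PATTERNS = [
    ("cisco_asa", r"%ASA-%{INT:asa_severity}-%{INT:asa_msgid}: %{GREEDYDATA:asa_text}"),
    ("sshd", r"sshd\[%{POSINT:pid}\]: %{GREEDYDATA:sshd_text}"),
]


def grok_block(tag, pattern, first):
    """Return the config lines for one grok filter.

    Every block except the first is wrapped in the "_grokparsefailure"
    conditional and clears that tag when it matches.
    """
    if '"' in tag or '"' in pattern:
        # Assumption: quoting/escaping is out of scope for this sketch.
        raise ValueError("double quotes are not supported: %r" % (tag,))
    lines = [
        "grok {",
        f'  match => {{ "message" => "{pattern}" }}',
        f'  add_tag => ["{tag}"]',
    ]
    if not first:
        lines.append('  remove_tag => ["_grokparsefailure"]')
    lines.append("}")
    if first:
        return lines
    wrapped = ['if "_grokparsefailure" in [tags] {']
    wrapped += ["  " + line for line in lines]
    wrapped.append("}")
    return wrapped


def render_filter_section(patterns):
    """Render a complete filter { ... } section for the given pairs."""
    body = []
    for index, (tag, pattern) in enumerate(patterns):
        body.extend(grok_block(tag, pattern, first=(index == 0)))
    return "filter {\n" + "\n".join("  " + line for line in body) + "\n}\n"


if __name__ == "__main__":
    print(render_filter_section(VENDOR_PATTERNS), end="")
```

Redirect the output into a file in your Logstash config directory, or let your configuration management tool render it. The pattern list could just as well be loaded from YAML or a database; it is hard-coded here only to keep the sketch self-contained.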