Indexing multiple synonym values as keywords

(Emanuele Verga) #1

Hi all,

We are using Elasticsearch to index firewall logs from multiple vendors, so we end up with different keywords for the same value/event (e.g. permit and permitted, Deny and denied, and many other permutations).

I am now looking into the best way to unify those values across the board so that we can use them in dashboards and the like.
Synonyms look like the way to go (to me, although I'm open to suggestions).
The catch, though, is that we need the values to be keyword and not text, as we filter on them and would like to aggregate them.

I have tried to define a synonym filter in a normalizer, but I get an error saying that the filter is not supported (I suppose because, depending on configuration, it could return multiple values)
Is there any workaround?

I created a custom analyzer with a keyword tokenizer and a synonym filter, but that is not an option either because analyzers can only be used on text fields.
If at all possible I would prefer to avoid multi-fields because of the wasted disk space.

Am I focusing on synonyms too much? Is there an alternative/better solution?

Other options from my understanding are:

  • Replace the values in the fields in Logstash before ingestion (seems difficult to maintain, although I may be wrong; we already use Logstash for ingestion)
  • Create an Elasticsearch ingest pipeline using the set processor (although it doesn't allow any conditionals and we have multiple values for the same fields, so it's probably a no-go)
  • Create an Elasticsearch ingest pipeline using the script processor (I have never used Painless, so I'm not sure how much effort it would require)
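For reference, the script-processor option might look roughly like the sketch below. It builds the JSON body of an ingest pipeline in Python; the field name `action` and the synonym table are assumptions made up for illustration, and the Painless snippet has not been tested against a cluster. The synonym table is passed in as `params`, so the script itself stays the same when new vendor values are added.

```python
import json

# Hypothetical field name and synonym table; adjust to the real log schema.
FIELD = "action"
SYNONYMS = {
    "permit": ["permitted"],
    "deny": ["denied"],
}

# Painless source (untested sketch): lower-case the value and, if it is a
# known variant, overwrite the field with the canonical keyword.
PAINLESS_SOURCE = """
def v = ctx[params.field];
if (v != null) {
  def k = v.toString().toLowerCase();
  if (params.mapping.containsKey(k)) {
    ctx[params.field] = params.mapping[k];
  }
}
"""


def build_pipeline(field, synonyms):
    """Return an ingest pipeline body with a single script processor."""
    mapping = {}
    for canonical, variants in synonyms.items():
        # Map the canonical value to itself so "Deny" also becomes "deny".
        mapping[canonical.lower()] = canonical
        for variant in variants:
            mapping[variant.lower()] = canonical
    return {
        "description": "Normalize vendor-specific firewall action values",
        "processors": [
            {
                "script": {
                    "lang": "painless",
                    "source": PAINLESS_SOURCE,
                    "params": {"field": field, "mapping": mapping},
                }
            }
        ],
    }


pipeline = build_pipeline(FIELD, SYNONYMS)
# The printed JSON is what would be sent as the pipeline definition.
print(json.dumps(pipeline, indent=2))
```

Because the value is rewritten before indexing, the field can stay a plain keyword and no multi-field is needed.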

Any input is appreciated.

(Adrien Grand) #2

If you already use Logstash for ingestion, this looks like the natural place to do this to me. The title of the topic suggests you'd like to index multiple values, but I think it would be best to just settle on one (e.g. deny) and replace all synonyms (denied, etc.) with it.
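To make the "settle on one value" idea concrete, here is a minimal Python sketch of the replacement logic. It is not Logstash configuration (in Logstash a dictionary-based filter such as translate would presumably play this role); the function name `normalize_action` and the synonym table are hypothetical.

```python
# Hypothetical table: canonical value -> variants seen across vendors.
CANONICAL = {
    "permit": {"permitted"},
    "deny": {"denied"},
}

# Reverse lookup built once: lower-cased variant -> canonical value.
LOOKUP = {}
for canonical, variants in CANONICAL.items():
    LOOKUP[canonical] = canonical
    for variant in variants:
        LOOKUP[variant.lower()] = canonical


def normalize_action(value):
    """Return the canonical keyword, or the input unchanged if unknown."""
    return LOOKUP.get(value.strip().lower(), value)


for raw in ["permit", "Permitted", "Deny", "denied", "reset"]:
    print(raw, "->", normalize_action(raw))
```

Unknown values pass through unchanged, so a new vendor keyword shows up in the dashboards as-is until it is added to the table.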
