Model used in anomaly detection

Hi all, I have a question about enabling anomaly detection in Elasticsearch through the following menu path:
Analytics > Machine Learning > Anomaly Detection > Create job > Categorization
Which algorithm/model does the implementation use to categorize the messages (the data is log-based), and which algorithm/model does it use for anomaly detection when I select the categorization approach?


I need information at that level of detail, e.g. "Decision Tree for categorization and Feedforward Neural Networks for anomaly detection."
I tried looking at the code on GitHub but couldn't work it out with any confidence.

Thanks in advance

Please check this question. It should answer most of your questions.

The methodology for Categorization is described here: Elastic Machine Learning Tips and Tricks - Categorization - YouTube

Basically, it uses an approach of:

  1. removing "mutable" tokens/words from the text (IP addresses, hostnames) by focusing on dictionary words (this can be customized)
  2. using an algorithm similar to Levenshtein distance to determine whether a string of text is similar/close to strings seen before
  3. putting the current string into the same category if it is similar, and otherwise creating a new category
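
To make the three steps above concrete, here is a minimal Python sketch. It is not Elastic's actual implementation, only an illustration of the idea under some assumptions: "dictionary words" are approximated by keeping purely alphabetic tokens, similarity is a plain token-level Levenshtein distance normalized to the range 0..1, and the `threshold=0.7` cutoff as well as the function names (`tokenize`, `edit_distance`, `similarity`, `categorize`) are hypothetical choices made for this example.

```python
def tokenize(message):
    """Step 1 (assumed heuristic): keep only alphabetic tokens, which drops
    mutable values such as IP addresses, hostnames with digits, and IDs."""
    tokens = []
    for raw in message.split():
        token = raw.strip(",.:;()[]\"'")
        if token.isalpha():
            tokens.append(token.lower())
    return tokens


def edit_distance(a, b):
    """Step 2: classic Levenshtein distance, computed over token lists
    instead of characters."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Normalize the distance to 0..1, where 1.0 means identical."""
    longest = max(len(a), len(b), 1)
    return 1.0 - edit_distance(a, b) / longest


def categorize(messages, threshold=0.7):
    """Step 3: assign each message to the first category that is similar
    enough, otherwise open a new category. Returns one category index
    per message."""
    representatives = []  # token list of the first message in each category
    labels = []
    for message in messages:
        tokens = tokenize(message)
        for index, representative in enumerate(representatives):
            if similarity(tokens, representative) >= threshold:
                labels.append(index)
                break
        else:
            representatives.append(tokens)
            labels.append(len(representatives) - 1)
    return labels


logs = [
    "Connection from 10.0.0.5 to host-01 failed",
    "Connection from 192.168.1.9 to host-77 failed",
    "User admin logged in successfully",
    "User root logged in successfully",
]
print(categorize(logs))
```

In this sketch the two connection messages end up in one category because their mutable parts are dropped before comparison, and the two login messages differ by only one token, so they share a second category.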
