ML for unusual name detection


We are trying to find hostnames in the DHCP logs that are rogue, in the sense that the hostname does not follow our current naming standards.

I have been trying to figure out how to do that with ML.

I came up with the idea of collapsing each hostname into a pattern, so I preprocess all the names like this:

change all uppercase chars to 'A',
change all lowercase chars to 'a',
change all digits to '0',
change all special chars to '.'.

This gives a uniform pattern of the names.
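
As a minimal sketch, the preprocessing could look like this in Python. The function name and the example hostnames are made up for illustration, and "special char" is assumed to mean anything that is not a letter or a digit:

```python
def collapse_hostname(hostname: str) -> str:
    """Collapse a hostname into its character-class pattern."""
    out = []
    for ch in hostname:
        if ch.isdigit():
            out.append("0")   # any digit
        elif ch.isupper():
            out.append("A")   # any uppercase letter
        elif ch.islower():
            out.append("a")   # any lowercase letter
        else:
            out.append(".")   # assumption: everything else counts as "special"
    return "".join(out)


# Hypothetical hostnames, not taken from real DHCP logs
print(collapse_hostname("SRV-web01"))   # AAA.aaa00
print(collapse_hostname("laptop_17"))   # aaaaaa.00
```

Names that follow the same convention end up with the same pattern, even though the names themselves are all different.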

I then used the rare function in an ML job to run through all the DHCP leases, and I actually got some pretty decent results with this approach.
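
To sanity-check the patterns outside the ML job, a plain frequency count gives a rough idea of which ones are unusual. This is only an offline stand-in and not how the rare function works internally; the function name and the `max_count` threshold are assumptions for illustration:

```python
from collections import Counter


def rare_patterns(patterns, max_count=1):
    """Return the patterns that occur at most max_count times, sorted."""
    counts = Counter(patterns)
    return sorted(p for p, n in counts.items() if n <= max_count)


# Hypothetical collapsed patterns: three conforming names, one odd one
seen = ["AAA.aaa00", "AAA.aaa00", "AAA.aaa00", "aaaaaa.00"]
print(rare_patterns(seen))   # ['aaaaaa.00']
```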

But is this the correct way of detecting stuff like this, or is there a simpler way where I don't have to preprocess all the names myself?

Best regards

Your approach is clever and is likely the best way to do it.


Thanks Rich, appreciate it.

I am sure some more recipes for stuff like this would be very welcome in the community. We have many use cases where we would like to try ML, but we are quite unsure how to proceed.


If your hostnames follow a fixed pattern, could you not create a regex and use that in the ML job query? Would all the names that do not match not be rogue clients on the network?
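
A rough sketch of the regex idea in plain Python, outside of any ML job. The naming standard below (three uppercase letters, a dash, a lowercase role, two digits) is purely hypothetical, as are the hostnames:

```python
import re

# Hypothetical naming standard, e.g. "SRV-web01"; replace with your own
NAMING_STANDARD = re.compile(r"[A-Z]{3}-[a-z]+[0-9]{2}")


def nonconforming(hostnames, standard=NAMING_STANDARD):
    """Return the hostnames that do not fully match the naming standard."""
    return [h for h in hostnames if standard.fullmatch(h) is None]


print(nonconforming(["SRV-web01", "laptop_17", "SRV-db1"]))
# ['laptop_17', 'SRV-db1']
```

This only works when the standard is known and fixed. The pattern-plus-rare approach needs no such rule up front.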

I do like the use case.

