I have some data coming into Logstash that has an email address. What I want to do is create a new column with just the domain. This will allow us to better look for trends and issues by domain.
None of the existing Grok patterns seem to be of use, so I wrote my own. The problem is that I can't get a clean domain name. I can get "@gmail.com", but I can't get just "gmail.com".
A really simple regex pattern is: @(...). In other languages I can specify which captured group to return. That way I can ignore the @ character and return only what is inside the parentheses. However, in searching I haven't found a way to do that in Grok. It wants to return everything. I have also tried using a non-capturing group, (?:@)(...), but it returns the same result.
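To illustrate what I mean by picking a captured group, here is a minimal Python sketch. The `.+` inside the parentheses is only a stand-in for my real pattern, and the variable names are just for illustration.

```python
import re

email = "brandon@gmail.com"

# ".+" is an assumed placeholder for the real domain pattern.
match = re.search(r"@(.+)", email)

# Group 0 is the whole match, including the "@".
whole_match = match.group(0)

# Group 1 is only the text captured inside the parentheses.
domain = match.group(1)

print(whole_match)  # @gmail.com
print(domain)       # gmail.com
```

Asking for group 1 instead of group 0 is what drops the @, and that is the behavior I am trying to reproduce in Grok.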
Has anyone encountered this before, or does anyone know of a better way to write the regex so that selecting a group isn't needed?
Here is the conf file.
input {
  stdin {
    codec => json_lines
  }
}
filter {
  grok {
    match => ["Email", "(?<domain>(?:@)(...))"]
  }
}
output {
  file {
    path => "/etc/logstash/results.txt"
  }
}
I pass the following into Logstash:
{"Email":"brandon@gmail.com"}
The output file shows:
{"Email":"brandon@gmail.com","@version":"1","@timestamp":"2016-02-10T18:09:04.043Z","host":"brandon-VirtualBox","domain":"@gmail.com"}
So what is the best way to have it exclude the @ character?