Logstash filter to extract email and credit card

I am new to Logstash and want to extract email IDs and credit card numbers from log files. I am using Filebeat to ship the log files to Logstash and have set up a config file with grok patterns. However, I am not able to extract either the email ID or the credit card number. I get this error:

 "tags" => [
    [0] "beats_input_codec_plain_applied",
    [1] "_grokparsefailure"

Given below is my config file (the pattern works in the grokdebug app).

input {
      beats {
        port => 5044
        host => "0.0.0.0"
    }
}

filter {
    if [message] =~ "(?<ccNumber>\d{4}-\d{4}-\d{4})" {
        mutate { add_tag => ["ccNumber"] } 
    }
    else if [message] =~ "(?<emailID>[a-zA-Z0-9_.+=:-]+@[0-9A-Za-z][0-9A-Za-z-]{0,62}(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*)" {
        mutate { add_tag => ["emailID"] }
    } 
    else {
        grok {
            match => {
                "message" => '%{HTTPD_COMMONLOG} "%{GREEDYDATA:referrer} %{GREEDYDATA:agent}"'
            }
        }
    }

    mutate {
        convert => {
            "response" => "integer"
            "bytes" => "integer"
        }
    }
}

output {
    stdout {
        codec => rubydebug
    }

    file {
        path => "logs\output.txt"
    }
}

Can you provide a sample log message?

You are applying the HTTPD_COMMONLOG pattern to the message, and that pattern is probably not matching, which would cause the error.

Here is a sample log message:

109.169.248.247 - - [12/Dec/2015:18:25:11 +0100] "GET /administrator/ HTTP/1.1" 200 4263 "-" "Mozilla/5.0 (Windows NT 6.0; rv:34.0) Gecko/20100101 Firefox/34.0" "-"
177.201.52.125 - - [13/Dec/2015:14:45:49 +0100] "test@test.com"
177.201.52.125 - - [13/Dec/2015:14:45:49 +0100] "1234-5678-9876"

Hi,

Your grok pattern is not the right one for the request. Have a look at https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
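
For the three sample lines above, one possible approach is to give grok a list of patterns and let it try them in order. The sketch below is an assumption rather than a tested config: the field names clientip, ident, auth and timestamp are chosen for illustration, the credit card regex is the one from the question, and EMAILADDRESS and COMBINEDAPACHELOG are taken from the stock grok pattern set (the field names that COMBINEDAPACHELOG produces vary with the Logstash version and its ECS compatibility setting).

```logstash
filter {
    grok {
        match => {
            "message" => [
                # Short line whose quoted part is a card-like number (regex from the question)
                '%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "(?<ccNumber>\d{4}-\d{4}-\d{4})"',
                # Short line whose quoted part is an email address
                '%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "%{EMAILADDRESS:emailID}"',
                # Full Apache access log line
                '%{COMBINEDAPACHELOG}'
            ]
        }
    }
}
```

With the default break_on_match => true, grok stops at the first pattern that matches, so the two short forms are listed before the general Apache pattern. Events that match none of the three would still be tagged with _grokparsefailure.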

Hi,

Thanks for your response. I went through the documentation and tried several options, but none of them worked. What I want is to extract the email address and credit card number, along with the IP address and other details, from the Apache log. Can you please help? My filter is as follows (to extract all details if they exist in one line):

filter {
	grok {
		match => {
			"message" => [
				"APACHE LOG: '%{IP:ip_address} %{USER:identity1} %{USER:identity2} \[%{HTTPDATE:req_ts}\] \"%{WORD:http_verb} %{URIPATHPARAM:req_path} HTTP/%{NUMBER:http_version}\" %{INT:http_status:int} %{INT:num_bytes:int} (?<ccNumber>\d{4}-\d{4}-\d{4})'",
				"ccNumber: (?<ccNumber>\d{4}-\d{4}-\d{4})",
				"emailID: (?<emailID>[a-zA-Z0-9_.+=:-]+@[0-9A-Za-z][0-9A-Za-z-]{0,62}(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*)"
			]
		}
	}	
}

How do I get all the details from a single line, and if not all of them are present, how do I get just the credit card number, the email address, or the IP address? The filter below also did not work (it is meant to extract one detail at a time if it exists):

filter {
	if [message] =~ "(?<ccNumber>\d{4}-\d{4}-\d{4})" {
		grok {
			match => {
				"message" => '%{IP:ip_address} %{USER:identity1} %{USER:identity2} \[%{HTTPDATE:req_ts}\] \"%{WORD:http_verb} %{URIPATHPARAM:req_path} HTTP/%{NUMBER:http_version}\" %{INT:http_status:int} %{INT:num_bytes:int}'
			}
			add_tag => ["ccNumber"]
		}
	}
	else if [message] =~ "(?<emailID>[a-zA-Z0-9_.+=:-]+@[0-9A-Za-z][0-9A-Za-z-]{0,62}(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*)" {
		grok {
			match => {
				"message" => '%{IP:ip_address} %{USER:identity1} %{USER:identity2} \[%{HTTPDATE:req_ts}\] \"%{WORD:http_verb} %{URIPATHPARAM:req_path} HTTP/%{NUMBER:http_version}\" %{INT:http_status:int} %{INT:num_bytes:int}'
			}
			add_tag => ["emailID"]
		}
	} 
	else {
		mutate {
			add_tag => ["TEST123"] 
		}
	}
}

Hi,

I compared my grok patterns with those in the documentation several times and nothing seemed out of place. What troubled me was the output from the log files: they were encoded in a different format. I had been editing the log files in Notepad++, so I decided to process an unedited log file, which worked. Opening the log file in Notepad, making minor changes to it, and running it through grok now worked as well. Also, new lines should be added from the command line (i.e. outside an editor). I hope this helps someone with the same issue.

Thanks,
Ananth.
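
One way to check whether an editor changed the encoding of a log file is to look at the first bytes of each message as Logstash receives it. The following is a minimal debugging sketch using the ruby filter, not part of the original fix; the field name message_head_hex is made up for illustration.

```logstash
filter {
    ruby {
        # Store the first 16 bytes of the message as a hex string for inspection
        code => 'event.set("message_head_hex", event.get("message").to_s.unpack("H32").first)'
    }
}
```

In the rubydebug output, a value starting with efbbbf would point to a UTF-8 byte order mark at the start of the line, and 00 bytes between characters would point to UTF-16 content being read as plain text. If that is the case, the encoding would most likely need to be set on the Filebeat side or the file saved again without the extra bytes.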
