Logstash filter for FTP logs

Hello Team,

I need to set up Logstash filters to query data coming from FTP servers. I haven't worked with filters much and have only a rough idea, so I created one for the test logs below:

Thu Jul  4 06:01:45 2019 [pid 43249] [xyz] OK DOWNLOAD: Client \"x.x.x.x\", \"/commonupdater/sitestat.xml\", 118 bytes, 2.64Kbyte/sec","popId":"1","hostIpAddress":"x.x.x.x","host":"ftp-1-2","data_field":"raw","type":"ftp-log"
Thu Jul  4 06:20:14 2019 [pid 55668] [xyz] OK DOWNLOAD: Client \"x.x.x.x\", \"/commonupdater/sitestat.xml\", 118 bytes, 2.58Kbyte/sec","popId":"2","hostIpAddress":"x.x.x.x","host":"ftp-2-2","data_field":"raw","type":"ftp-log"
Thu Jul  4 06:20:13 2019 [pid 55666] [xyz] OK LOGIN: Client \"x.x.x.x\", anon password \"NcFTP@\"","popId":"3","hostIpAddress":"x.x.x.x","host":"ftp-2-3","data_field":"raw","type":"ftp-log"
Thu Jul  4 06:20:13 2019 [pid 55667] CONNECT: Client \"x.x.x.x\"","popId":"4","hostIpAddress":"x.x.x.x","host":"ftp-1-2","data_field":"raw","type":"ftp-log"
Thu Jul  4 06:20:11 2019 [pid 43201] CONNECT: Client \"x.x.x.x\"","popId":"5","hostIpAddress":"x.x.x.x","host":"ftp-2-4","data_field":"raw","type":"ftp-log"

From these log lines I need the following fields; the rest can be ignored:

  1. Status: OK DOWNLOAD, FAIL DOWNLOAD, CONNECT
  2. Client IP
  3. File Name: /commonupdater/sitestat.xml, etc.
  4. Size of file: In bytes
  5. Download rate: In bytes/sec

For this I created the following filter pattern:

filter {
  grok {
    match => {"message" => "%{MONTH} +%{MONTHDAY} %{TIME} %{YEAR} (\[%{GREEDYDATA:pidno}\] )?(\[%{WORD:comp}\] )?(%{WORD:status} )?(%{WORD:download}:)?(%{WORD:client} )?(\"%{IPV4:ipaddr}\", )?(\"%{GREEDYDATA:filename}\", )?(%{GREEDYDATA:size} )?(%{GREEDYDATA:speed} )?"}
  }
  mutate {
    remove_field => [ "pidno", "comp", "download", "client" ]
  }
}

Thanks in Advance

Do not bother naming fields if you are just going to drop them. Instead of %{GREEDYDATA:pidno} you can just use %{GREEDYDATA}.
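
To illustrate the difference, here is a minimal sketch. It uses %{INT} for the pid rather than %{GREEDYDATA}, which is my own substitution, and it only matches the pid part of the line. Grok stores only named captures by default, so the unnamed form leaves nothing to clean up afterwards.

    # Named capture: creates a pidno field that then has to be removed
    grok { match => { "message" => '\[pid %{INT:pidno}\]' } }
    mutate { remove_field => [ "pidno" ] }

    # Unnamed capture: matches the same text but adds no field to the event
    grok { match => { "message" => '\[pid %{INT}\]' } }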

Personally, I would use dissect to strip off the consistently formatted prefix of each line, then run grok against the rest.

    dissect { mapping => { "message" => "%{[@metadata][ts]} %{+[@metadata][ts]->} %{+[@metadata][ts]} %{+[@metadata][ts]} %{+[@metadata][ts]} [pid %{}] %{[@metadata][restOfLine]}" } }
    grok { match => { "[@metadata][restOfLine]" => '(\[%{WORD}\] )?%{DATA:operation}: Client "%{IPV4:ipaddr}",' } }
    if "DOWNLOAD" in [@metadata][restOfLine] {
        grok { match => { "[@metadata][restOfLine]" => 'Client "%{IPV4}", "(?<filepath>[^"]+)", %{INT:filesize:int} bytes, %{NUMBER:rate:float}Kbyte/sec' } }
    }
    date { match => [ "[@metadata][ts]", "EEE MMM dd HH:mm:ss YYYY" ] }
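
The original request asked for the download rate in bytes/sec, while the log reports it in Kbyte/sec. A possible follow-up step, as a sketch only: it assumes that 1 Kbyte means 1024 bytes in these logs (I have not confirmed that), and the field name rate_bytes_per_sec is made up for illustration.

    # Only runs when the DOWNLOAD grok above has set [rate]
    if [rate] {
        # ASSUMPTION: 1 Kbyte = 1024 bytes; rate_bytes_per_sec is a hypothetical field name
        ruby { code => 'event.set("rate_bytes_per_sec", event.get("rate") * 1024)' }
    }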

Thanks Badger,

Are there any docs or a site I can refer to in order to study this further?

There's an old post about it, but you can take it as a reference (here).

I tried that one, but I still get the same error.

What error? The set of filters I posted works against the set of data you posted.
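
One way to check this in isolation, as a minimal sketch: read lines from stdin and print the parsed events to the console instead of sending them to Elasticsearch. The metadata option is needed because [@metadata] fields are not printed by default. The file name test.conf in the comment is just an example.

    # Run with: bin/logstash -f test.conf, then paste a log line into the terminal
    input { stdin {} }
    filter {
        # paste the dissect, grok and date filters from the earlier reply here
    }
    output { stdout { codec => rubydebug { metadata => true } } }

If the fields look right here, the filters are fine and the problem is more likely in the input or output side of the real pipeline.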

After applying the filter you provided, I can't see any logs arriving in Elasticsearch, even though there are no error messages in the Logstash logs.
