Logstash multiline with incrementation


#1

Hi everyone,

I'm working with log files and have a small problem. Here is an example of my logs:

192.168.0.1(localhost) (user1)
file1

192.168.0.2(localhost) (user2)
file2
file3
file2 - read only

A new record starts with a line that begins with an IP address. For each record I want to keep the following information:

  • IP address
  • domain
  • user
  • file

The first record is easy because it has only one file, but for the second record I can only get the first file (file2). In the end, I'd like to have the following data in Elasticsearch:

192.168.0.1 localhost user1 file1
192.168.0.2 localhost user2 file2
192.168.0.2 localhost user2 file3

Among the lines that don't begin with an IP address, I don't want to keep those that contain a dash '-'. In my example, I don't want to keep: file2 - read only.

This is the configuration file that I use for Logstash:

input {
    file {
        type => "my_dashboard"
        path => "/path-to-my-data/*"
        start_position => "beginning"
        sincedb_path => "since_db"
        codec => plain { charset => "ANSI_X3.4-1968"}
    }
}
filter {
    multiline {
        pattern => "%{IP:IP}"
        what => "next"
    }
    grok {
        match => {"message" => "%{IP:IP}\(%{HOSTNAME:domain}\)%{SPACE}\(%{USERNAME:user}\)%{SPACE}%{NOTSPACE:file}"}
    }
    mutate {
        remove_field => ["host","@version","path"]
    }
}
output {
    if "_grokparsefailure" in [tags] {
        file {
            path => "./grokparsefailure.log"
        }
    }
    else if [type] == "my_dashboard" {
        elasticsearch {
            hosts => "192.168.0.1:9200"
            index => "user_data"
            document_type => "user"
        }
    }
}

Thank you in advance for your help :wink:


(Magnus Bäck) #2

You need to reverse the logic. Join each line with the previous one unless it starts with an IP address.

multiline {
  pattern => "^%{IP}"
  what => "previous"
  negate => true
}

#3

I tried your suggestion and got the following results:

192.168.0.1    localhost     user1     file1
192.168.0.2    localhost     user2     file2 file3 file2 - read only

The expected results are:

192.168.0.1    localhost     user1     file1
192.168.0.2    localhost     user2     file2
192.168.0.2    localhost     user2     file3

The problem is that I don't know in advance how many files will appear under each IP line. With my original configuration, I can only get the first file under each IP:

192.168.0.1    localhost     user1     file1
192.168.0.2    localhost     user2     file2

For each file line, I would like to add the IP, domain and user to the output sent to Elasticsearch.
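
One possible way to get one event per file line, offered as an untested sketch rather than a confirmed solution: keep the multiline filter from #2, capture everything after the header line into a single field, and let the split filter clone the event once per line of that field. The `(?m)` prefix (so that GREEDYDATA can span the newlines introduced by multiline) and the rule of dropping any value that contains a dash are assumptions based on the sample data above. The field name `file` is reused from the original grok pattern.

```
filter {
    # Join every line that does not start with an IP to the previous line (from #2).
    multiline {
        pattern => "^%{IP}"
        what => "previous"
        negate => true
    }
    # (?m) lets GREEDYDATA match across newlines, so "file" holds
    # all of the file lines that belong to the record.
    grok {
        match => {"message" => "(?m)^%{IP:IP}\(%{HOSTNAME:domain}\)%{SPACE}\(%{USERNAME:user}\)%{SPACE}%{GREEDYDATA:file}"}
    }
    # Clone the event once per line in "file" (the default terminator is a newline).
    # IP, domain and user are carried over to every clone.
    split {
        field => "file"
    }
    # Discard entries such as "file2 - read only".
    if [file] =~ /-/ {
        drop { }
    }
}
```

Note that this condition would also drop any file whose name contains a hyphen. If that matters, a stricter expression such as `/ - /` may fit better.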

