Different input logs with Logstash

I am trying to parse different logs with Logstash. Right now I have only one log file coming from multiple servers into Logstash, and parsing works OK. But I am planning to add multiple log types (syslog, Apache, Windows, nginx, etc.) and send them to different indexes on Elasticsearch.
Can someone suggest:

  1. How can I add different inputs and filters and send them to different indexes?
  2. Do I need to have different configs? How do I do that?
  3. How about the load on the Logstash cluster? What is recommended?

Below is my Logstash config:

input {
  beats {
    client_inactivity_timeout => 86400
    port => 5044
  }
}
filter {
  mutate {
    gsub => [
      "message", "\t", " ",
      "message", "\n", " "
    ]
  }
  grok {
    match => { "message" => "\[%{TIMESTAMP_ISO8601:timestamp_match}\]%{SPACE}\:\|\:%{SPACE}%{WORD:level}%{SPACE}\:\|\:%{SPACE}%{USERNAME:host_name}%{SPACE}\:\|\:%{SPACE}%{GREEDYDATA:coidkey}%{SPACE}\:\|\:%{SPACE}%{GREEDYDATA:clientinfo}%{SPACE}\:\|\:%{SPACE}(%{IP:clientip})?%{SPACE}\:\|\:%{SPACE}%{GREEDYDATA:Url}%{SPACE}\:\|\:%{SPACE}%{JAVACLASS:class}%{SPACE}\:\|\:%{SPACE}%{USER:ident}%{SPACE}%{GREEDYDATA:msg}" }
    remove_field => [ "ident", "offset", "name", "version", "host" ]
  }
}
output {
  stdout { codec => rubydebug }

  if "_grokparsefailure" in [tags] {
    # write events that didn't match to a file
    file { path => "/tmp/grok_failures.txt" }
  } else {
    elasticsearch {
      hosts => "dfsyselastic.df.jabodo.com:9200"
      user => "UN"
      password => "PW"
      index => "vicinio-%{+YYYY.MM.dd}"
      document_type => "log"
    }
  }
}

If you have a way to tell which kind of log message you are receiving, then you will be able to specify what the "type" is. Then in the filter section you'd be able to write something along the lines of:

filter {
  if [type] == "syslogs" {
    grok {}
  }
  if [type] == "apache" {
    grok {}
  }
  ..... etc .....
}

Then, for storing them in different indices:

output {
  stdout { codec => rubydebug }

  if "_grokparsefailure" in [tags] {
    # write events that didn't match to a file
    file { path => "/tmp/grok_failures.txt" }
  }
  if [type] == "syslogs" {
    elasticsearch {
      hosts => "dfsyselastic.df.jabodo.com:9200"
      user => "UN"
      password => "PW"
      index => "syslogs-vicinio-%{+YYYY.MM.dd}"
      document_type => "syslog"
    }
  }
  if [type] == "apache" {
    elasticsearch {
      hosts => "dfsyselastic.df.jabodo.com:9200"
      user => "UN"
      password => "PW"
      index => "apache-vicinio-%{+YYYY.MM.dd}"
      document_type => "apache"
    }
  }
}

As for the load on the Logstash cluster: how many logs are you expecting, and over what amount of time? (i.e. 100 logs per minute, hour, second, millisecond?)


Thanks, I will try it out.

Also I have:

    input {
      beats {
        client_inactivity_timeout => 86400
        port => 5044
      }
    }

Will it be able to identify the log type on the same port, or do I have to use different ports for different inputs? Quite new to this. :grin:

I do not think it will be able to identify it on the same port unless there is some way of determining where it was sent from. I have not used the Beats plugin before, but I do have experience with the tcp, udp, and file input plugins. For tcp and udp there is a field called host, which holds the IP address of the server that sent the message. Since each server has a unique, static IP address, I can determine where it came from and therefore which type it is. I am unsure if Beats has this field. If it does, then you do not need to have the

if [type] == "" .....

and you can say:

if [host] == "some indicator" {}

If there is no way to uniquely identify it using one port, then you can have each server send its logs to a different port, and you will be able to identify them based on which port they came in on.
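
For example (a sketch, untested; port 5045 and the type values are just placeholders), you could open one beats input per log type and set the input's `type` option, assuming the shipper does not already set a type of its own:

    input {
      beats {
        port => 5044
        type => "syslogs"    # events arriving on 5044 are tagged as syslogs
      }
      beats {
        port => 5045
        type => "apache"     # events arriving on 5045 are tagged as apache
      }
    }

Then the `if [type] == "..."` conditionals in the filter and output sections above would work unchanged.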

Make sense?

Thanks again. I will try a few things as per your advice.

Regarding the rate: the logs coming to the Logstash cluster will be approximately 90-100K per minute, or maybe more. I have 4 Logstash servers to distribute the load. Here is the Filebeat config:

#=========================== Filebeat prospectors =============================

filebeat.prospectors:

- input_type: log

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /archives/logs/tomcat7-8080/download.log
    - /archives/logs/tomcat7-8090/download.log
  tail_files: true
  multiline.pattern: '^\[[0-9]{4}-[0-9]{2}-[0-9]{2}'
  multiline.negate: true
  multiline.match: after

#================================ Outputs =====================================
#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
#  hosts: ["lvsyslogstash1.lv.jabodo.com:5044"]
  hosts: ["lvsyslogstash1.lv.jabodo.com:5044","lvsyslogstash2.lv.jabodo.com:5044","lvsyslogstash3.lv.jabodo.com:5044","lvsyslogstash4.lv.jabodo.com:5044"]
  loadbalance: true
  worker: 2
#  filebeat.publish_async: true

I do not think that there will be a problem with this load, especially if it is distributed evenly across the 4 Logstash servers. I am not 100% certain, though.
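
One more idea (a sketch, untested; the `log_type` field name and the paths are placeholders): instead of separate ports, Filebeat can attach a custom field per prospector, which you can then test in the Logstash conditionals:

    filebeat.prospectors:

    - input_type: log
      paths:
        - /var/log/syslog
      fields:
        log_type: syslogs
      fields_under_root: true    # put log_type at the top level of the event

    - input_type: log
      paths:
        - /var/log/apache2/access.log
      fields:
        log_type: apache
      fields_under_root: true

On the Logstash side you would then branch on `if [log_type] == "syslogs" { ... }` instead of `[type]`, and everything can keep arriving on the one beats port.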

Thanks again. I will try a few things.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.