What is the purpose of the type field in the Input section


(Gaurav Dalvi) #1

Hello All,

What's the use of "type =>" in the input section of Logstash? I will be using a grok filter in the filter section to parse incoming messages anyway.

Thanks,
gaurav


(Magnus Bäck) #2

The type option sets the value of the field with the same name. If you only ingest a single kind of log (and never will do anything else) you don't have to worry about it, but in all likelihood you'll eventually want to process different kinds of logs and then the type field will be a good way of distinguishing them.
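For example (a minimal sketch; the port number and type value here are placeholders):

input {
  tcp {
    port => 5000
    type => "syslog_A"
  }
}

Events received on this input then carry type => "syslog_A", which you can test in filter conditionals or query on in Kibana.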


(Gaurav Dalvi) #3

Thanks.

The problem is:
I have a data center where I have, for example, 3 networking providers:
A, B, C. All of them have slightly similar syslog formats (some of them have extra spaces and some don't; some of them include the program name and severity, and some don't).

How can I set the type field in the input section prior to seeing the actual syslog message?

Ideally it should be like:
input {
  type => A
  type => B
  type => C
  ...
}

That's why I don't understand the purpose of the type field in the input section.


(Mark Walkom) #4

How are you sending the logs to LS - TCP, file, syslog?


(Gaurav Dalvi) #5

I'm doing a POC here. So I ask Logstash to do this:

input {
  tcp {
    port => 5000
  }
  udp {
    port => 5000
  }
}

Then I do telnet localhost 5000 from the same host machine and manually insert a syslog message that I have.

In production, I will make sure all network devices publish syslog messages on some port, and Logstash will listen on that port so that it has a continuous stream of syslog messages.


(Magnus Bäck) #6

I don't see why you'd have to use different types for different kinds of syslog messages. When you're searching for messages in Kibana, why should your query be affected by the vendor of the device producing the events you're interested in?


(Gaurav Dalvi) #7

Thanks. My requirement is:

Say I have 3 networking routers, A, B, and C, which produce syslog messages. Their syslog messages arrive at the Logstash instance at any time and in any order. How can I search in Kibana if I want to know how many messages I have got from vendor A during a certain time frame?


(Magnus Bäck) #8

You could indeed use different types for this, I just suspect there are better criteria. Wouldn't e.g. the hostname be a more interesting condition?

Anyway, to set different types depending on what the message looks like, something like this would work:

filter {
  if [message] =~ /regexp that matches vendor A/ {
    mutate {
      replace => { "type" => "syslog_A" }
    }
  }
}

Another way, which I'd probably use, is having multiple grok filters that match each vendor's messages (I'm just assuming they have different formats) to avoid regexp duplication, and to add a vendor-specific tag that you can later translate into a change of the type field.

filter {
  grok {
    match => ...
    add_tag => ["syslog_A"]
  }
  if "_grokparsefailure" not in [tags] {
    grok {
      match => ...
      add_tag => ["syslog_B"]
    }
  }
  ...
  if "syslog_A" in [tags] {
    mutate {
      replace => { "type" => "syslog_A" }
      remove_tag => ["syslog_A"]
    }
  }
}

(Gaurav Dalvi) #9

Excellent. Thanks a lot for valuable suggestion. Appreciate your help.


(Gaurav Dalvi) #10

How can I have multiple match => statements within a grok filter?
e.g.:
First line:
match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp}\s+%{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }

On the second line I have to reuse what I parsed on the first line.
I want to do something like this again:
match => { "syslog_hostname" => some REGEX }

How can I do that in the Logstash config?


(Magnus Bäck) #11

If you want to try multiple grok expressions on a field and break on the first match:

grok {
  match => {
    "fieldname" => ["expression1", "expression2"]
  }
}

If you want to match against different fields it might work to have multiple match options in the same filter, but I'd use two consecutive filters.
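That would look something like this (a sketch; the field names and patterns are just examples):

filter {
  grok {
    match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp}\s+%{GREEDYDATA:syslog_message}" }
  }
  grok {
    match => { "syslog_timestamp" => "%{MONTH:month} +%{MONTHDAY:day} %{TIME:time}" }
  }
}

The second grok only sees the syslog_timestamp field that the first one extracted, so each filter stays simple.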


(Gaurav Dalvi) #12

Thanks for the reply.
The problem is simple:
1: I need to parse the hostname from the syslog string.
I am able to do that with a match expression.
2: With the parsed hostname, I need to apply a regex to find out which vendor it is. Assume that we name the syslog hostname in a certain way so that it contains the vendor name,
e.g. hostname = region.city.vendor_name.company.com

How can I parse out vendor_name in this case?


(Magnus Bäck) #13

Just use a separate grok filter for that, somewhere after the current filter that extracts the hostname field.
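For example (an untested sketch; it assumes syslog_hostname has already been extracted and looks like region.city.vendor_name.company.com):

grok {
  match => {
    "syslog_hostname" => "^%{WORD:region}\.%{WORD:city}\.%{WORD:vendor_name}\."
  }
}

After this, [vendor_name] is available as its own field for conditionals or for setting the type.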


(Gaurav Dalvi) #14

I did this and I am NOT able to see MONTH, DAY, and TIME in my Logstash output.

filter {
  grok {
    match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp}\s+%{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
    match => { "syslog_timestamp" => "%{MONTH:month} +%{MONTHDAY:day} %{TIME:time}" }
    remove_field => ["@version", "host", "message", "port"]
  }

  date {
    match => [ "syslog_timestamp",
               "MMM  d HH:mm:ss",
               "MMM dd HH:mm:ss" ]
    target => "@timestamp"
    timezone => "UTC"
  }
}


(Magnus Bäck) #15

Did you try my suggestion of having two consecutive filters instead of two match options in the same filter?
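That is, something like this (a sketch of your config with the two match options split into two consecutive grok filters; the expressions themselves are unchanged):

filter {
  grok {
    match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp}\s+%{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
    remove_field => ["@version", "host", "message", "port"]
  }
  grok {
    match => { "syslog_timestamp" => "%{MONTH:month} +%{MONTHDAY:day} %{TIME:time}" }
  }
}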


(Gaurav Dalvi) #16

Thanks, it worked.
Could you please take your valuable time and reply to my question here?

I appreciate your help.


(Gaurav Dalvi) #17

Thanks. I tried this, but I think there is a problem here.
Let's say I pass a syslog_B type log first.

Then it goes to the first grok block, and as it does not match, it puts _grokparsefailure into tags. Then it goes to the following if block and I get a match. So for a syslog_B type message, I always get two tags: the failure one and the syslog_B one.

How do I avoid this? I could do it by following your first option, but the second one is more elegant and correct :slight_smile:

Thanks,
Gaurav


(Magnus Bäck) #18

If you don't want the _grokparsefailure tag you can remove it in each grok filter with remove_tag. That option is only triggered when the grok is successful, just like add_tag. Then you won't get a _grokparsefailure tag in the end if one of the grok expressions matched, but if none of them matched the tag will be there.
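For example, in the second grok from the earlier sketch (the actual pattern is elided):

grok {
  match => ...
  add_tag => ["syslog_B"]
  remove_tag => ["_grokparsefailure"]
}

If this grok matches, it removes the _grokparsefailure tag that the first grok added; if it also fails, the tag stays.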


(system) #19