*beat -> logstash -> logstash

I have been playing with the ELK stack for a month or so and seem to have it working quite well at a single site, but I am running into issues as I try to move forward.

I have several sites running logstash that I would like to feed into another logstash/elasticsearch at a central site. This would let each site keep its logs for maybe 30 days with no backups, while the central location keeps them for a longer period before they are archived.

I don't believe that an elasticsearch cluster is what I am looking for, so I am playing with logstash outputs. Currently, at each site, I have logstash configured to output to both the local elasticsearch and lumberjack. That lumberjack output is being received at my central site in what appears to be the proper format (it took me a little while to realize that I needed codec => "json" to get it to do that). The problem is that it is not appearing in elasticsearch at my central site; only the local beats inputs are showing up.

Has anybody set up something like this before, and can you assist me?

Thanks!

Terry

Providing some of your configs would be helpful :slight_smile:

Sorry about that. Here are the relevant files.

Upstream server:

input {
  beats {
    port => 5044
    ssl => true
    ssl_certificate => "/path/to/my/crt"
    ssl_key => "/path/to/my/key"
  }

  lumberjack {
    port => 2400
    codec => "json"
    type => "downstream"
    congestion_threshold => 40
    ssl_certificate => "/path/to/my/crt"
    ssl_key => "/path/to/my/key"
  }
}

filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }

  if [type] == "downstream" {
    json {
      source => "message"
    }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    sniffing => true
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}
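
One detail worth checking in the output above: the index name is built from [@metadata][beat], which the local beats input populates but which does not survive the lumberjack hop, since Logstash's JSON serialization drops @metadata. A minimal sketch of a guard for that case (the downstream- index name is only illustrative):

output {
  if [@metadata][beat] {
    # Events from the local beats input still carry @metadata.
    elasticsearch {
      hosts => ["localhost:9200"]
      sniffing => true
      manage_template => false
      index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
      document_type => "%{[@metadata][type]}"
    }
  } else {
    # Forwarded events arrive without @metadata, so the sprintf
    # above never resolves; fall back to a literal index name.
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "downstream-%{+YYYY.MM.dd}"
    }
  }
}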

Downstream server:

input {
  beats {
    port => 5044
    ssl => true
    ssl_certificate => "/path/to/my/crt"
    ssl_key => "/path/to/my/key"
  }

}

filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}

output {
  lumberjack {
    hosts => ["my.upstream.server"]
    port => 2400
    codec => "json"
    ssl_certificate => "/path/to/my/crt"
  }

  elasticsearch {
    hosts => ["localhost:9200"]
    sniffing => true
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}

Does anybody have any idea where I am going wrong?

What if you add a stdout?
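
Something like this alongside the existing outputs, for example:

output {
  # Dump every event to the Logstash log so you can inspect
  # exactly what reaches this stage of the pipeline.
  stdout { codec => rubydebug }
}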

I have added a stdout previously. I compared entries that were added (topbeat going directly to the upstream server) with entries that were not added (topbeat coming from the downstream server), and they were identical. I see no reason why it wouldn't work.
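
One caveat when comparing rubydebug output: by default it does not print @metadata, so two events can look identical while still differing in [@metadata][beat], which the upstream index name depends on. A sketch that makes those fields visible:

output {
  # metadata => true also prints the @metadata fields that the
  # elasticsearch output's index and document_type settings use.
  stdout { codec => rubydebug { metadata => true } }
}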

Bump