Beat input plugin and logs which are NOT sent via a beat publisher


#1

Hi all,

First post here.
I have a question regarding the beat input plugin used in the logstash pipeline.

Scenario follows:
Most of our (systemd) services are logging to stdout which is then picked up by journald.
We use journalbeat (https://github.com/mheese/journalbeat) to publish the logs to logstash via the beats input plugin, apply some JSON filtering, and then send the output to ES/Kibana.

input {
  beats {
    port => "5043"
  }
}

filter {
  if [type] == "journal" {
    json {
      source => "message"
      skip_on_invalid_json => true
    }
  }
}

That works fine.

The new addition is logs that are not in journald. For instance, a secret management solution sends logs to logstash on a specific tcp port (not using journalbeat at all).

I tried to make it work with the beat input to no avail.
I got it working by adding a tcp input in the logstash config such as this:

input {
    beats {
        port => "5043"
    }
    tcp {
        port => "5045"
        codec => json
    }
}

Any explanation as to why it does not work with the beat input?
The logs were definitely sent and seen in tcpdump, but nothing showed up in the logstash stdout output, even though the JSON was correct (the same payload works with the tcp plugin).

The doc (https://www.elastic.co/guide/en/beats/libbeat/current/newbeat-overview.html) mentions that an event created by a beat publisher is a JSON-like object and that, at a minimum, it must contain a @timestamp field and a type field.

Here is a sample of JSON log sent:

{
  "time": "2017-08-13T11:04:48Z",
  "type": "request",
  "auth": {
    "client_token": "hmac-sha256:ffcdfc9fddab5ca02e654d1098cfa587ca2ba861456fb7e1d062b1ca1c93454f",
    "accessor": "hmac-sha256:adfdc1bac3f61eb3542667df6edfd44525915fe4a208ae3e4c8b01044cc50c0d",
    "display_name": "root",
    "policies": [
      "root"
    ],
    "metadata": null
  },
  "request": {
    "id": "202a2b7e-0bfb-6520-d00a-fb07f78fee56",
    "operation": "read",
    "client_token": "hmac-sha256:ffcdfc9fddab5ca02e654d1098cfa587ca2ba861456fb7e1d062b1ca1c93454f",
    "client_token_accessor": "hmac-sha256:adfdc1bac3f61eb3542667df6edfd44525915fe4a208ae3e4c8b01044cc50c0d",
    "path": "auth/token/lookup-self",
    "data": null,
    "remote_address": "10.106.20.14",
    "wrap_ttl": 0,
    "headers": {}
  },
  "error": ""
}

Sending this to logstash beat port (5043) never worked as is.
We have the type field but we have a "time" field instead of "@timestamp".
Thing is, manipulating the JSON before it hits logstash would be pointless, since filtering/converting fields is exactly what logstash is for.
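
For illustration, the "time" to "@timestamp" conversion could be done inside logstash itself once the event has been decoded. This is only a minimal sketch, assuming the event arrives with a top-level "time" field in ISO8601 form as in the sample above; it uses the date filter and does nothing to make the beats input accept raw JSON:

```
filter {
  # Only touch events that actually carry a "time" field (assumption:
  # these are the non-beats JSON logs shown in the sample above).
  if [time] {
    date {
      # Parse "2017-08-13T11:04:48Z"-style values.
      match => ["time", "ISO8601"]
      # @timestamp is the default target; stated here for clarity.
      target => "@timestamp"
    }
  }
}
```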

So my question here is: besides manipulating "non-beats" logs to enforce the "type" and "@timestamp" fields, is there a way to use only the beats input for logs not published by beats? Or is using the tcp input for other logs (which are not beat events) simply the right way?

Thanks in advance,
G.


(Christian Dahlqvist) #2

The beats input plugin implements the beats protocol, which has additional functionality apart from just deserialising data, e.g. support for acknowledgement of events. Just sending JSON documents to it will therefore not work. That is, however, exactly what the TCP input is for, so you are doing the right thing.
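
As a rough sketch of how the two inputs can sit side by side, the tcp input can tag its events so that later filters can tell them apart from beats events. The tag name "tcp_json" and the "log_source" field are hypothetical, chosen only for this example; the ports are the ones from the question:

```
input {
  # Beats protocol only (journalbeat and other beats).
  beats {
    port => 5043
  }
  # Plain JSON over TCP from non-beats senders.
  tcp {
    port => 5045
    codec => json
    # Hypothetical tag, used below to identify these events.
    tags => ["tcp_json"]
  }
}

filter {
  if "tcp_json" in [tags] {
    mutate {
      # Hypothetical marker field recording where the event came from.
      add_field => { "log_source" => "tcp" }
    }
  }
}
```

If the sender writes one JSON document per line, the json_lines codec may be a better fit for a streaming input such as tcp than json; that depends on how the sender frames its messages.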


#3

@Christian_Dahlqvist Thanks a bunch for clarifying, really appreciate it! :ok_hand:

