Adding additional fields for Logstash Kinesis Input

Hi everyone,

I'd like to ask for some help with adding additional fields to events from the Logstash Kinesis Input.

I have three Logstash servers consuming a Kinesis stream using the Kinesis Input Plugin.

The input configuration looks like this:

    input {
        kinesis {
            application_name => "test-kinesis"
            kinesis_stream_name => "test-kinesis"
            region => "ap-southeast-1"
            profile => "test-kinesis"
            initial_position_in_stream => "TRIM_HORIZON"
            codec => cloudwatch_logs
            id => "kinesis-rq-log-in"
        }
    }

At any given time only one server reads from the stream, and it keeps reading until it fails. However, the indexed document only contains the information from the record itself:

    {
      "_index": "linux-default-2020.06.29",
      "_type": "doc",
      "_id": "pLHfAHMBpFWsf2X_halE",
      "_version": 1,
      "_score": null,
      "_source": {
        "message": "sanitised linux message",
        "@timestamp": "2020-06-29T16:18:34.923Z",
        "tags": [
          "kinesis_syslog"
        ],
        "subscriptionFilters": [
          "cloudwatchsubscriptionfilter"
        ],
        "@version": "1",
        "awsid": "1234567891011",
        "messageType": "DATA_MESSAGE",
        "logStream": "myhostname,i-1234567abcedfg",
        "logGroup": "/var/log/audit/audit.log"
      },
      "fields": {
        "@timestamp": [
          "2020-06-29T16:18:34.923Z"
        ]
      },
      "sort": [
        1593447514923
      ]
    }

Since the document does not carry any information about the Logstash server itself, I don't know which server is currently consuming the stream.

Is there a way to use 'add_field' to add a new field to the document, for example the IP or the hostname of the Logstash collector? Obviously I can add a static field, but I wonder if I could acquire the information dynamically without hard-coding it into the configuration.

Many thanks in advance,

James Ren

This could help:

Thanks for the quick response Jenni! I think this is what I'm looking for.

Before I try it out in our system, may I check: in this case, does the word 'event' represent the document itself?

Would the 'host' field appear under the 'fields' section, like this?

    "fields": {
      "syslog_timestamp": [
        "2020-07-08T03:00:00.000Z"
      ],
      "@timestamp": [
        "2020-07-08T03:00:00.000Z"
      ]
    }

Thanks a ton!

James

Yes, event is the name of the Ruby variable that holds an object with all your fields. Make sure to use the syntax from the second link (event.set(...)), because the old syntax (event[...] = ...) is deprecated. This creates a normal string field in your document's _source.
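
A minimal sketch of that approach, assuming the ruby filter's 'init' and 'code' options and the Ruby standard library's Socket.gethostname. The field name 'host' and the variable @collector_host are only illustrative choices, not taken from the linked pages, and this has not been tested against a Kinesis input:

    filter {
      ruby {
        # init runs once when the pipeline starts, so the hostname
        # lookup is not repeated for every event
        init => "require 'socket'; @collector_host = Socket.gethostname"
        # code runs for every event; event.set writes a top-level
        # field that ends up in _source
        code => "event.set('host', @collector_host)"
      }
    }

If you would rather avoid Ruby altogether, Logstash also substitutes environment variables written as ${VAR} in the configuration, so an 'add_field' that references a variable such as ${HOSTNAME} may be enough, provided that variable is actually set in the environment Logstash runs in.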

Hi Jenni,

Thanks again for your help on this. I have managed to apply the change to the cluster and, interestingly, the 'host' field actually appears inside the '_source' content.

    {
      "_index": "linux-default-2020.07.14",
      "_type": "_doc",
      "_id": "lSKnTHMBvQOEepptksW0",
      "_version": 1,
      "_score": null,
      "_source": {
        "logGroup": "os-logs",
        "program": "amazon-ssm-agent",
        "logsource": "aserver",
        "awsid": "1234567891011",
        "@version": "1",
        "tags": [
          "kinesis_syslog",
          "linux"
        ],
        "id": "35563420008286253344226643814350380985951041854882643968",
        "logStream": "i-0123456789abcdefg",
        "messageType": "DATA_MESSAGE",
        "message": "sanitised message",
        "syslog_timestamp": "2020-07-14T10:28:27.000Z",
        "@timestamp": "2020-07-14T09:28:27.000Z",
        "subscriptionFilters": [
          "subscriptionFilters"
        ],
        "host": "10.10.10.10"
      },
      "fields": {
        "@timestamp": [
          "2020-07-14T09:28:27.000Z"
        ],
        "syslog_timestamp": [
          "2020-07-14T10:28:27.000Z"
        ]
      },
      "sort": [
        1594718907000
      ]
    }

Hi Jenni,

Just for your awareness: when I enabled this filter it worked fine, but I observed a significant delay in event processing, with latency increasing up to fourfold.

Once I commented it out, event processing went back to the previous level.

Since I used the 'init' script, which is supposed to have less impact on performance, I wonder what caused this delay?

Yours,

James
