Restarting logstash cloudwatch plugin

We have an ELK stack app that has been down for over a month due to a credentials issue in the logstash cloudwatch plugin. The plugin is ingesting data again now, but what is strange is that it is ingesting logs from the beginning of time, i.e. logs from over two years ago. Also, no data is being output to Elasticsearch, perhaps because that data has already been transformed and output previously?

My main question: is this typical behavior? I'm not very familiar with logstash and elasticsearch, but I can't imagine that every time you restart logstash it starts ingesting every cloudwatch log from the very beginning. Not sure if it will help, but here is the logstash conf file for the cloudwatch plugin:

input {
    cloudwatch_logs {
        access_key_id => "access_here"
        secret_access_key => "secret_here"
        log_group => [ "xwingui-Prod", "xwingui-Dev", "xwingui-Exp", "xwingui-Staging", "xwingui-Test", "xwingui-Jawn"  ]
        region => "us-west-1"
        sincedb_path => "/var/lib/.sincedb"
    }
}

filter {
    if "Monitoring - " in [message] {
        if "API" in [message] {
            grok {
                match => { "message" => "API Monitoring - %{GREEDYDATA:json}" }
            }
            mutate {
                add_field => { "monitorType" => "API" }
            }
        } else if "RUM" in [message] {
            grok {
                match => { "message" => "RUM Monitoring - %{GREEDYDATA:json}" }
            }
            mutate {
                add_field => { "monitorType" => "RUM" }
            }
        } else if "PikaWorker" in [message] {
            grok {
                match => { "message" => "PikaWorker Monitoring - %{GREEDYDATA:json}" }
            }
            mutate {
                add_field => { "monitorType" => "PikaWorker" }
            }
        } else if "DataAgent" in [message] {
            grok {
                match => { "message" => "DataAgent Monitoring - %{GREEDYDATA:json}" }
            }
            mutate {
                add_field => { "monitorType" => "DataAgent" }
            }
        } else if "Database" in [message] {
            grok {
                match => { "message" => "Database Monitoring - %{GREEDYDATA:json}" }
            }
            mutate {
                add_field => { "monitorType" => "Database" }
            }
        } 

        json {
            source => "json"
            remove_field => "message"
        }
        mutate {
            add_field => { "isMonitor" => True }
        }
    }
}

output {
    elasticsearch {
        hosts => [ "localhost:9200" ]
        user => "user_here"
        password => "pwd_here"
    }
    stdout {
        codec => json
    }
}

No. The input tracks what it has ingested in the sincedb. If "/var/lib/.sincedb" were removed then it would start over at the beginning, as you are seeing.

How do you know no data is going to elasticsearch? Could it be that your index rotation is automatically deleting indexes containing two-year-old data?

I just checked the index lifecycle policies and it looks like they have been saving everything.

I checked the elasticsearch logs and there is nothing there.

/var/lib/.sincedb is still there, is there a way to check its contents?

more /var/lib/.sincedb should work. It is written as a text file. The number next to the group identifier is the timestamp of the last message that was read in milliseconds since the epoch (.strftime("%Q")).
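
A line in that file should look roughly like the log group name followed by that millisecond timestamp, e.g. (hypothetical values, using one of the group names from your config):

xwingui-Prod 1600000000000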

Hmm, interesting, the time is 1610132873632, which is today. So technically it shouldn't be ingesting all those old logs?

For example, I'm seeing logs like this:

"log_stream":"root","ingestion_time":"2021-08-10T21:23:00.999Z"

The AWS API returns an array of events, each of which has log_stream_name, timestamp, message, ingestion_time, and event_id fields. The [@timestamp] field is set from the timestamp field, not the ingestion_time field. Is the @timestamp current or in 2021? If the latter, it sounds like the issue is on the AWS side.
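
If you want to confirm what the AWS side is returning, something like this should work (assuming the aws CLI is installed and configured with the same credentials; the group name is taken from your config) and will show a few raw events so you can compare their timestamps against the current time:

aws logs filter-log-events --log-group-name "xwingui-Prod" --region us-west-1 --limit 5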

This date is not today; it is the epoch in milliseconds for January 8th, 2021.

You can check in bash by removing the last 3 digits and running the following command:

date -d@1610132873

Ah, thanks for that! I had used an epoch converter but got the wrong date. That makes sense.

It appears the timestamp is in 2021 and not current:

"@timestamp":"2021-12-10T21:25:23.645Z"

Shouldn't the .sincedb file be updated as new logs are ingested?

Would it make sense to simply manually edit the file?

That should work, or delete the entry for the group and set start_position => end. (You need to stop logstash before changing the file and restart it afterwards.)
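
If you do edit the file by hand, GNU date can print the current time in the same milliseconds-since-the-epoch format:

date +%s%3N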

What do you mean by delete the entry for the group? The sincedb_path?

Yes, you should be able to delete the line from the file (make sure to have a backup).

Sorry to ask the same question twice, but I want to make doubly sure. If I change this line in the cloudwatch conf file:

        sincedb_path => "/var/lib/.sincedb"

to

start_position => end

Logstash will essentially just process log groups that are brand new (sincedb time will be set to now)?

You would add the start_position => end to the existing configuration. Do not remove the sincedb_path.
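
In other words, the input would end up looking something like this (just a sketch keeping your existing settings; start_position only applies to groups that do not yet have an entry in the sincedb):

input {
    cloudwatch_logs {
        access_key_id => "access_here"
        secret_access_key => "secret_here"
        log_group => [ "xwingui-Prod", "xwingui-Dev", "xwingui-Exp", "xwingui-Staging", "xwingui-Test", "xwingui-Jawn" ]
        region => "us-west-1"
        sincedb_path => "/var/lib/.sincedb"
        start_position => "end"
    }
}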

Hey, that worked like a charm and everything is running smoothly now. Thanks for all your help, Badger!
