Elapsed filter doesn't work sometimes

Hello,

I am seeing that the Elapsed filter doesn't always work in our staging environment. In our development environment, where we have a single Logstash and Elasticsearch instance, the Elapsed filter works fine. But in staging we have multiple Logstash instances, and I am not sure whether that is causing the issue. So far my observations are:

  1. If the START and END of a job fall within 0-1 second of each other, it fails
  2. Sometimes, for unknown reasons, it fails even when the job runs only once (no repeated START/END) and finishes within 3600 seconds (the Elapsed timeout)

Example:

0787|16132500|Z100_IN_LINE_CUBING_IDOC_WKSUB_2|2020-06-23 16:13:26|START|1|cp_SR0_02|Active
0787|16132500|Z100_IN_LINE_CUBING_IDOC_WKSUB_2|2020-06-23 16:14:26|END|1|cp_SR0_02|Canceled

Here:

  - job_id 16132500 is used as the unique field
  - job_name is Z100_IN_LINE_CUBING_IDOC_WKSUB_2
  - the START and END events are tagged as job_start and job_finished by Filebeat

Logstash config:

    logstash-host-1$ grep -v \# logstash.yml
    ...
     node.name: abc_indexer_2670
     path.data: .../abc_enablers_indexer
     pipeline.workers: 1
     pipeline.batch.size: 1
     path.config: .../abc_enablers_indexer.conf
    ...
     xpack.monitoring.elasticsearch.hosts: "https://<elk-host-1>:40000"
    ...
    logstash-host-1$ cat .../abc_enablers_indexer.conf
    input {
            kafka {
                    bootstrap_servers => "<kafka-server-1>:30001,<kafka-server-2>:30001,<kafka-server-3>:30001,<kafka-server-4>:30001"
                    topics => ["abc_enablers"]
                    codec => "json"
            }
    }

    filter {
            date {
                    match => [ "ts", "yyyy-MM-dd HH:mm:ss" ]
                    target => "@timestamp"
                    locale => "en"
            }
    ...
            elapsed {
                    start_tag => "job_start"
                    end_tag => "job_end"
                    unique_id_field => "job_id"
                    periodic_flush => true
            }
    }

    output {
            elasticsearch {
                    hosts => ["https://<elk-host-1>:40000","https://<elk-host-2>:40000","https://<elk-host-3>:40000","https://<elk-host-4>:40000"]
                    index => "abc_enablers"
                    ssl => true
                    ...
            }
    }

Thanks,
Ferdous

The filter is not going to work if the START and END go through different Logstash instances, because the elapsed filter keeps its pending start events in memory within a single Logstash process. How is traffic allocated to an instance in your staging environment?

Thank you for your response. That's what I thought was causing the issue. My guess is that it was set up as a round-robin scheme by our ELK admins.

I'm wondering how we could set it up in such a way that, for this particular app (it has a separate index and port), traffic always goes to the same Logstash instance, e.g. logstash1. And if the logstash1 host is down due to maintenance, it would send all traffic to the logstash2 host, and so on?

Thanks,
Ferdous

Your load balancer admins could configure a VIP using failover rather than round-robin.
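For illustration, if the load balancer happened to be HAProxy, an active/passive (failover) setup is a one-keyword change from round-robin: mark all but one backend server as `backup`. This is only a sketch; the hostnames, port, and backend names below are placeholders, and your admins' actual load balancer product and configuration may differ:

    # Hypothetical HAProxy config for this app's dedicated port.
    # logstash2 is marked "backup": it receives traffic only while
    # logstash1 fails its health checks.
    frontend abc_enablers_in
        bind *:5044
        mode tcp
        default_backend abc_enablers_logstash

    backend abc_enablers_logstash
        mode tcp
        server logstash1 <logstash-host-1>:5044 check
        server logstash2 <logstash-host-2>:5044 check backup

With this shape, all START and END events for a job pass through the same Logstash instance in normal operation, which is what the elapsed filter needs.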

I will check with them. Thank you so much!

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.