Logstash ingest pipeline - Data from one pipeline going to another

Hello,

I hope my message finds the members of the community and their loved ones safe and healthy.

I am running logstash (7.15.2) on a Raspberry Pi 4B running Ubuntu 20.04.3 LTS.

I have seven pipelines, all listening on different ports defined in pipelines.yml.

I have data from one pipeline landing in the index of another pipeline, in addition to its own pipeline/index. In both cases, the source is a unique Filebeat instance running on the same host.

Following are the details:

1. Source and Pipeline details:

Source: Filebeat on the remote host, not installed as a service but started via /etc/rc.local. It reads a particular .log file from /var/log.

Pipeline (destination):

Port: 5055
Configuration:

input {
  beats {
    port => 5055
    type => "logs"
  }
}

filter {
 
REMOVED
}

output {
  elasticsearch {
   hosts => ["IP REDACTED"]
   index => "cowrie-firewall-logstash-%{+yyyy.MM.dd}"
   ssl => true
   user => '**REDACTED**'
   password => '**REDACTED**'
   cacert => '/etc/logstash/elasticsearch-ca.pem'
   ssl_certificate_verification => true
   ilm_enabled => auto
   ilm_rollover_alias => "cowrie-firewall-logstash"
  }
}

2. Source and Pipeline details:

Source: Filebeat installed as a service, reading a .json file from /srv/cowrie/var/log/cowrie/

Pipeline (destination):

Port: 5045
Configuration:

input {
       # filebeats
       beats {
             port => 5054
             type => "cowrie"
             #id => "honeypot_ingest"
       }

       # if you don't want to use filebeat: this is the actual live log file to monitor
       #file {
       #       path => ["/home/cowrie/cowrie-git/log/cowrie.json"]
       #       codec => json
       #       type => "cowrie"
       #}
}

REMOVED

output {
    if [type] == "cowrie" {
        elasticsearch {
             hosts => ["IP REDACTED"]
            #data_stream => true  # Causes errors; added after reading https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html#plugins-outputs-elasticsearch-data_stream while diagnosing cowrie ingestion causing data duplication.
            index => "cowrie-logstash-%{+yyyy.MM.dd}"
            ssl => true
            user => '**REDACTED**'
            password => '**REDACTED**'
            cacert => '/etc/logstash/elasticsearch-ca.pem'
            ssl_certificate_verification => true
            ilm_enabled => auto
            ilm_rollover_alias => "cowrie-logstash"
        }
        #file {
        #    path => "/tmp/cowrie-logstash.log"
        #    codec => json
        #}
        #stdout {
            #codec => rubydebug
        #}
    }
}

Even though logs from pipeline 1 (cowrie-firewall) are sent to port 5055, they are also present in pipeline 2 (cowrie-logs), which is listening on port 5045.

How can I remove the duplication?
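One pattern that can at least stop the cross-writes (and which pipeline 2 already uses) is to guard each pipeline's output on the event type. A minimal sketch for pipeline 1, assuming the firewall events carry the "logs" type from its beats input; note that the beats input's type setting does not override a type already present on the incoming event, so events arriving with a different type would be skipped here rather than indexed:

```
output {
  # Only index events whose type matches this pipeline's own beats input
  if [type] == "logs" {
    elasticsearch {
      # ... same elasticsearch settings as above ...
    }
  }
}
```

This will not explain why events reach the wrong pipeline, but it makes any mismatch visible (guarded-out events simply stop appearing in the wrong index).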

What does pipelines.yml look like?

Hello, thank you for your reply. Here is the pipelines.yml

# This file is where you define your pipelines. You can define multiple.
# For more information on multiple pipelines, see the documentation:
#   https://www.elastic.co/guide/en/logstash/current/multiple-pipelines.html
#09-06-2021: added new pipeline for pihole
#11-10-2021: Added new pipeline for cowrie_firewall logs.


- pipeline.id: honeypot_ingest
  path.config: "/etc/logstash/conf.d/cowrie.conf"

- pipeline.id: beats_ingest
  path.config: "/etc/logstash/conf.d/beats.conf"

- pipeline.id: packetbeat_ingest
  path.config: "/etc/logstash/conf.d/packetbeat.conf"

- pipeline.id: pihole_ingest
  path.config: "/etc/logstash/conf.d/pihole.conf"

- pipeline.id: vpn_ingest
  path.config: "/etc/logstash/conf.d/vpn.conf"

#- pipeline.id: vmware_ingest
#  path.config: "/etc/logstash/conf.d/vmware_vsphere.conf"

- pipeline.id: cowrie_firewall_ingest
  path.config: "/etc/logstash/conf.d/cowrie_firewall.conf"


- pipeline.id: filebeat_oxford
  path.config: "/etc/logstash/conf.d/filebeat_oxford.conf"

That config shouldn't mix, AFAIK.

If you have verified that each Filebeat instance sends to the correct port, then maybe file an issue on GitHub? I can't think of any other reason, and I haven't encountered such a problem with this specific kind of config.
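For reference, the port each Filebeat instance ships to is set in its filebeat.yml under output.logstash; a minimal sketch (hostname and port values here are examples, not taken from the thread):

```
# filebeat.yml for the firewall-log instance
output.logstash:
  # Must match the beats input port of exactly one Logstash pipeline
  hosts: ["logstash-host:5055"]
```

Checking this block in both Filebeat configurations (the APT install and the rc.local one) is the quickest way to rule out a port mix-up on the sending side.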

How are you running logstash? As a service?

Sure let me have a look at filing a bug.

Hello,

Logstash is installed via APT on the Raspberry Pi. It is running as a service.

Here are the two different installs of Filebeat (one through APT and one from the compressed archive).

They run with different configurations.

Yes I just double checked the same and it clearly shows the correct port.

Assuming the port were mixed up, it would be odd that the index created for firewall logs (collected via port 5055) is still receiving new events, right?

Issue created:

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.