Entries Duplication

Hi,

When I have two pipelines active at the same time, data from one is somehow duplicated/redirected into the other pipeline, even though I'm not using pipeline-to-pipeline communication.
The output host is the same for both, but the inputs are different.
I was expecting each pipeline to read data from its own input and send it through its own output configuration.

pipelines.yml

- pipeline.id: main
  path.config: "/usr/share/logstash/pipeline"

logstash.yml

http.host: "0.0.0.0"
log.format: json

log.level: debug

queue.type: persisted
queue.checkpoint.writes: 1

pipeline 1:

input {
    beats {
        port => 5001
    }
}

output {
    elasticsearch {
        hosts => ["elasticsearch:9200"]
        index => "index-pipeline-1"
    }
}

pipeline 2:

input {
    tcp {
        port  => 5000
        codec => json
    }
}

input {
    udp {
        port  => 5000
        codec => json
    }
}

output {
    elasticsearch {
        hosts => ["elasticsearch:9200"]
        index => "index-pipeline-2"
    }
}

Here are some Logstash logs:

logstash_1       | 2020-09-03T19:48:36.247934873Z [DEBUG] 2020-09-03 19:48:36.247 [[main]>worker2] MainClientExec - Connection can be kept alive indefinitely
logstash_1       | 2020-09-03T19:48:36.248892967Z [DEBUG] 2020-09-03 19:48:36.248 [[main]>worker2] PoolingHttpClientConnectionManager - Connection [id: 1][route: {}->http://elasticsearch:9200] can be kept alive indefinitely
logstash_1       | 2020-09-03T19:48:36.248964722Z [DEBUG] 2020-09-03 19:48:36.248 [[main]>worker2] PoolingHttpClientConnectionManager - Connection released: [id: 1][route: {}->http://elasticsearch:9200][total kept alive: 1; route allocated: 1 of 100; total allocated: 1 of 1000]
logstash_1       | 2020-09-03T19:48:36.256499805Z [DEBUG] 2020-09-03 19:48:36.256 [[main]>worker2] RequestAuthCache - Auth cache not set in the context
logstash_1       | 2020-09-03T19:48:36.256540171Z [DEBUG] 2020-09-03 19:48:36.256 [[main]>worker2] PoolingHttpClientConnectionManager - Connection request: [route: {}->http://elasticsearch:9200][total kept alive: 1; route allocated: 1 of 100; total allocated: 1 of 1000]
logstash_1       | 2020-09-03T19:48:36.257890964Z [DEBUG] 2020-09-03 19:48:36.257 [[main]>worker2] wire - http-outgoing-0 << "[read] I/O error: Read timed out"
logstash_1       | 2020-09-03T19:48:36.257958752Z [DEBUG] 2020-09-03 19:48:36.257 [[main]>worker2] PoolingHttpClientConnectionManager - Connection leased: [id: 0][route: {}->http://elasticsearch:9200][total kept alive: 0; route allocated: 1 of 100; total allocated: 1 of 1000]
logstash_1       | 2020-09-03T19:48:36.258081806Z [DEBUG] 2020-09-03 19:48:36.257 [[main]>worker2] DefaultManagedHttpClientConnection - http-outgoing-0: set socket timeout to 60000

Thank you!

If you point path.config to a directory containing multiple configuration files, then they are concatenated into a single configuration: events are read from every input, sent through all the filters, and written to all the outputs. You have to use pipelines.yml to run configuration files in independent pipelines.
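Roughly speaking, with the pipelines.yml above the two files in /usr/share/logstash/pipeline end up behaving like one merged main pipeline along these lines (an illustrative sketch of the concatenation, not a file Logstash actually writes out):

input {
    beats { port => 5001 }
    tcp   { port => 5000 codec => json }
    udp   { port => 5000 codec => json }
}

output {
    elasticsearch {
        hosts => ["elasticsearch:9200"]
        index => "index-pipeline-1"
    }
    elasticsearch {
        hosts => ["elasticsearch:9200"]
        index => "index-pipeline-2"
    }
}

Every event from any of the three inputs reaches both elasticsearch outputs, which is why entries show up duplicated across both indices.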

@Badger I updated the original post with the pipelines.yml and logstash.yml.

Oh... so you are saying that instead of having the pipelines.yml like I posted, I have to use something like

- pipeline.id: my-pipeline_1
  path.config: "/etc/path/to/p1.config"
  pipeline.workers: 3
- pipeline.id: my-other-pipeline
  path.config: "/etc/different/path/p2.cfg"
  queue.type: persisted

pointing each pipeline at its own file?

Exactly! Although I do not think the files have to be in different directories. Provided path.config points to a file rather than a directory, I would expect you to be OK.
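So, assuming your two files are named pipeline-1.conf and pipeline-2.conf (the names here are just for illustration) and stay in the existing /usr/share/logstash/pipeline directory, a pipelines.yml along these lines should keep the pipelines isolated:

- pipeline.id: pipeline-1
  path.config: "/usr/share/logstash/pipeline/pipeline-1.conf"
- pipeline.id: pipeline-2
  path.config: "/usr/share/logstash/pipeline/pipeline-2.conf"

With that in place, events from the beats input should only reach index-pipeline-1, and the tcp/udp inputs should only reach index-pipeline-2.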

