Use only one input for one output. How to segregate/isolate pipeline definitions' inputs and outputs?

I have a pipeline definition that queries Elasticsearch and exports just a specific set of data to files, which I will then process with an external application. The pipeline looks like this:

input {
  elasticsearch {
    hosts => "elasticsearch:9200"
    index => "my_index"
    query => '{ "query": { "bool": { "must": [ { "match_phrase": { "message": { "query": "WebHookHelper deleting AppID instanceId" } }} ] } }, "sort": ["@timestamp"] }'
  }
}

output {
  csv {
    # These are the fields to output in CSV format. Each field needs to be one
    # of the fields shown in the output when you run the Elasticsearch query.
    fields => ["timestamp", "message"]
    # This is where we store the output. We can split the output across several
    # files by using a timestamp to determine the filename.
    path => "/tmp/output/deletedApps-%{+YYYY-MM-dd}.csv"
    #csv_options => {"col_sep" => "\t" "row_sep" => "\n"}
  }
}

I have other pipeline config files that I created because they populate my_index in the first place from some log files. What I end up seeing in the deletedApps files is that events from all the inputs, including the ones defined in those other pipeline config files, are passed to this csv output, which is not what I want. I would like to take only the events from Elasticsearch matching this query and send them only to this output.

Is there a way to segregate inputs and outputs, i.e. to explicitly say that a given output should only receive events from a given input? Using separate pipeline config files does not seem to make a difference; Logstash appears to activate them all together. So if I define another file, 12-appAcquisitions.conf, with another input in it, events from that input also get processed by the output in the deletedApps pipeline.
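For illustration, 12-appAcquisitions.conf might contain nothing more than another input (the log path here is made up):

input {
  file {
    path => "/var/log/myapp/acquisitions.log"
  }
}

Events read from this file then show up in the deletedApps CSV files as well.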

The only solution I see at the moment is to run two instances of Logstash, but that feels like a waste of resources.

Can pipelines be run in isolation on the same Logstash instance?

Either use conditionals to limit which events reach which output(s), or use the multi-pipeline feature in Logstash 6+.
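Unless you use pipelines.yml, Logstash concatenates every config file it loads into a single pipeline, so events from every input flow to every output. With conditionals, you tag the events in the input and route on that tag in the output. A minimal sketch (the match_all query is just a placeholder; reuse your real one):

input {
  elasticsearch {
    hosts => "elasticsearch:9200"
    index => "my_index"
    query => '{ "query": { "match_all": {} } }'
    tags => ["deletedApps"]
  }
}

output {
  if "deletedApps" in [tags] {
    csv {
      fields => ["timestamp", "message"]
      path => "/tmp/output/deletedApps-%{+YYYY-MM-dd}.csv"
    }
  }
}

Events without the tag never reach this csv output.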

I have tried using multiple pipelines in Logstash 6. I placed a file at /usr/share/logstash/config/pipelines.yml (as far as I can see, this is where path.settings points inside the Logstash Docker container, as mentioned on the multiple-pipelines documentation page; confusingly, though, I don't see path.settings mentioned on the latest Logstash config/settings page) with this content:

- pipeline.id: my_app_main
  path.config: "/usr/share/logstash/pipeline/09-myapp.conf"
- pipeline.id: my_app_deleted_apps
  path.config: "/usr/share/logstash/pipeline/11-deleted-apps.conf"
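For reference, the relevant volume mounts in my setup look roughly like this (the host-side paths and the image tag are placeholders, not my exact values):

services:
  logstash:
    image: docker.elastic.co/logstash/logstash:6.2.4
    volumes:
      # custom pipelines.yml replaces the image's default one
      - ./logstash/pipelines.yml:/usr/share/logstash/config/pipelines.yml
      # directory holding 09-myapp.conf and 11-deleted-apps.conf
      - ./logstash/pipeline:/usr/share/logstash/pipeline

Note that, according to the docs, pipelines.yml is ignored (with a warning in the log) if Logstash is started with the -e or -f options, so the container has to start Logstash without them.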

This did not produce the CSV output. I then also tried an if condition, where I added tags => [ "deletedApps" ] to the elasticsearch input and used this output:

output {
  if "deletedApps" in [tags] {
    csv {
      fields => ["timestamp", "message"]
      path => "/tmp/output/deletedApps-%{+YYYY-MM-dd}.csv"
    }
  }
}

This also didn't produce any files in the output folder. I am using the containers, so the only difference is definitely this configuration. The problem cannot be the query in the input, because it works when I use it in Kibana.

As a last resort, I ran a separate instance of Logstash just to ensure separation, without if statements or multiple pipelines since they were no longer needed, and it still did not produce the CSV output.

I am thinking there might be a problem with the elasticsearch input, but what could it be?
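To narrow it down, my next step is to temporarily replace the csv output with a plain stdout output, to check whether the elasticsearch input emits any events at all:

output {
  stdout { codec => rubydebug }
}

If nothing gets printed, the problem is in the input or the query; if events do get printed, the problem is in the csv output or its path.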
