I have a pipeline definition that queries Elasticsearch and exports just a specific set of data to files, which I will then process with an external application. The pipeline looks like this:
input {
  elasticsearch {
    hosts => "elasticsearch:9200"
    index => "my_index"
    query => '{ "query": { "bool": { "must": [ { "match_phrase": { "message": { "query": "WebHookHelper deleting AppID instanceId" } }} ] } }, "sort": ["@timestamp"] }'
  }
}
output {
  csv {
    # These are the fields to write to the CSV output. Each one must be a
    # field present on the events returned by the Elasticsearch query.
fields => ["timestamp", "message"]
    # This is where the output is stored. The timestamp in the filename
    # splits the output across several files, one per day.
    path => "/tmp/output/deletedApps-%{+YYYY-MM-dd}.csv"
    #csv_options => {"col_sep" => "\t" "row_sep" => "\n"}
  }
}
I have other pipelines that populate my_index in the first place from some log files. What I end up seeing in the deletedApps files is that all the inputs, including the ones defined in other pipeline config files, are passed to this csv output, which is not what I intended. I would like only the events returned by this Elasticsearch query to go to this output.
Is there a way to segregate inputs and outputs, i.e. to explicitly say "this output should only receive events from this input"? Using separate pipeline files does not seem to make a difference; Logstash appears to activate them all together. So if I define another file, 12-appAcquisitions.conf, with another input, that input also gets processed by the output in the deletedApps pipeline.
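The only workaround I can think of within a single merged configuration is tagging events at the input and wrapping the output in a conditional, along the lines of this untested sketch (deleted_apps is just a marker tag I made up):

input {
  elasticsearch {
    hosts => "elasticsearch:9200"
    index => "my_index"
    query => '{ "query": { "bool": { "must": [ { "match_phrase": { "message": { "query": "WebHookHelper deleting AppID instanceId" } }} ] } }, "sort": ["@timestamp"] }'
    # "tags" is a common option on every input plugin, so events from this
    # input carry a marker that can be tested later in the output section.
    tags => ["deleted_apps"]
  }
}

output {
  # Only events tagged by the input above reach this csv output; events
  # coming from inputs defined in other .conf files skip it.
  if "deleted_apps" in [tags] {
    csv {
      fields => ["@timestamp", "message"]
      path => "/tmp/output/deletedApps-%{+YYYY-MM-dd}.csv"
    }
  }
}

But that only works if every input in every .conf file is tagged and every output is guarded by a conditional, which seems error-prone.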
Short of tagging everything, the only solution I see at the moment is to run two instances of Logstash, but that feels like a waste of resources.
Can pipelines be run in isolation on the same Logstash instance?
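What I am hoping for is something like a pipelines.yml along the lines of the sketch below, where each pipeline gets its own isolated inputs, filters, and outputs (assuming Logstash 6.0 or later; the file name 11-deletedApps.conf is just my guess for the pipeline shown above):

# config/pipelines.yml -- one entry per pipeline, each pointing at its own
# config file, instead of everything in a shared config directory being
# concatenated into a single pipeline.
- pipeline.id: deleted-apps
  path.config: "/etc/logstash/conf.d/11-deletedApps.conf"
- pipeline.id: app-acquisitions
  path.config: "/etc/logstash/conf.d/12-appAcquisitions.conf"

If I understand the docs correctly, pipelines.yml is only read when Logstash is started without -f or -e on the command line, since those flags override it. Is this the intended way to isolate pipelines, or is there a better approach?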