The Elasticsearch input plugin for Logstash seems to output more documents than exist in the index it reads from. Below is the configuration I'm using. What do I need to do to ensure that each record in the source index is read only once? Why does the document count in the new index keep increasing for as long as Logstash runs? The source index has about 2 million records.
Note: my goal is to have two separate indexes, one of them with additional derived data, so the Reindex API is likely not what I want.
input {
  elasticsearch {
    hosts => "localhost:9200"
    index => "test"
    query => '{ "query": {"match_all": {}} }'
    size => 10000
    scroll => "5m"
  }
}
filter {
  mutate {
    ...
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "test-new"
  }
}
Thanks!
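A possible direction, offered as a sketch rather than a confirmed fix: if the extra documents come from the input query running more than once (for example after a pipeline restart), the output above would index every event again under a fresh auto-generated ID. Carrying the source document's `_id` through to the output makes repeated writes overwrite instead of duplicate. The fragment below assumes the input plugin's `docinfo` option is available in the installed version and that the metadata lands under `[@metadata][_id]`; with ECS compatibility enabled, the field is expected under `[@metadata][input][elasticsearch][_id]` instead. The filter section is unchanged and omitted here.

```
input {
  elasticsearch {
    hosts => "localhost:9200"
    index => "test"
    query => '{ "query": {"match_all": {}} }'
    size => 10000
    scroll => "5m"
    # Assumption: copies _index and _id of each source hit into event metadata
    docinfo => true
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "test-new"
    # Reuse the source _id so a re-read document overwrites its earlier copy
    document_id => "%{[@metadata][_id]}"
  }
}
```

With this in place, the document count in `test-new` should stop at the count of `test` even if the same records are read again, which also gives a quick way to check whether repeated reads are the cause.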