I was wondering whether there is a practical way to query existing documents in an index, find the matching documents, and then write the result as a single document to a new index.
I'm collecting network traffic and have these fields:
I have a lot of traffic, so this quickly leads to lots of small documents and a heavy load on Elasticsearch when performing big searches.
The idea is that after a while it is no longer necessary to keep very detailed data. E.g. after a month there is no need to see the traffic at per-second granularity; one summary per day would be enough.
For that to work, Logstash would need to search Elasticsearch for, let's say, a 24-hour period at a time, find the documents where the source, destination, port and application match, sum up the in and out bytes of all those documents, and write the result to a new index as a single document.
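To make the intended roll-up concrete, here is a minimal sketch of the grouping and summing logic in plain Python. It works on a list of already-fetched documents and does not talk to Elasticsearch or Logstash at all. The field names (timestamp, source, destination, port, application, bytes_in, bytes_out) and the function name roll_up_daily are assumptions made for illustration, not the actual names in the index.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical field names; the real index may use different ones.
KEY_FIELDS = ("source", "destination", "port", "application")


def roll_up_daily(docs):
    """Collapse detailed documents into one summary per day and flow."""
    totals = defaultdict(lambda: {"bytes_in": 0, "bytes_out": 0})

    for doc in docs:
        # Truncate the ISO 8601 timestamp to its calendar day.
        day = datetime.fromisoformat(doc["timestamp"]).date().isoformat()
        # Documents with the same day and the same key fields share a bucket.
        key = (day,) + tuple(doc[field] for field in KEY_FIELDS)
        totals[key]["bytes_in"] += doc["bytes_in"]
        totals[key]["bytes_out"] += doc["bytes_out"]

    summaries = []
    for key, sums in sorted(totals.items(), key=lambda item: item[0]):
        summary = {"day": key[0]}
        summary.update(zip(KEY_FIELDS, key[1:]))
        summary.update(sums)
        summaries.append(summary)
    return summaries
```

Each returned dictionary is what would be written to the new index as a single document. This only shows the logic; whether it runs in Logstash, in a separate script, or inside Elasticsearch itself is a separate question.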
The Elasticsearch filter appears to be able to run queries, but I'm not sure whether it can handle all the necessary logic.