Hey guys,
I'd like to give some feedback on pipeline aggregations and the examples that have been published so far.
When I first read about pipeline aggregations, I thought: "Oh, very cool, now I don't need shell scripts for more complex searches".
In my mind, a pipeline aggregation meant "take the results of the first aggregation and pass them as a parameter to the next aggregation", something like
agg1 | agg2 | agg3
But then I read that your concept is different: pipeline aggregations don't perform any further searches on the shards, but rather parse the results of the initial aggregation and do some further processing on them (at least in the examples, this mostly involved fields with numeric values).
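To make that distinction concrete, here is a minimal sketch of how I understand the current behaviour. The request body uses a `terms` aggregation with a `sum` sub-aggregation and a sibling `max_bucket` pipeline aggregation. The field names `ip` and `bytes` and the aggregation names are assumptions for illustration only. The helper `max_bucket` is hypothetical: it imitates in plain Python the kind of post-processing such a step does on buckets that have already been computed, without searching again.

```python
# Request body sketch. Field names "ip" and "bytes" are assumptions.
request_body = {
    "size": 0,
    "aggs": {
        # Initial aggregation: one bucket per IP, with a numeric metric.
        "per_ip": {
            "terms": {"field": "ip"},
            "aggs": {"total_bytes": {"sum": {"field": "bytes"}}},
        },
        # Pipeline aggregation: reads the buckets above via buckets_path.
        "busiest_ip": {
            "max_bucket": {"buckets_path": "per_ip>total_bytes"}
        },
    },
}


def max_bucket(buckets, metric):
    """Hypothetical helper: pick the highest metric value among buckets
    that were already computed. No further search is involved."""
    if not buckets:
        return {"value": None, "keys": []}
    best = max(b[metric]["value"] for b in buckets)
    keys = [b["key"] for b in buckets if b[metric]["value"] == best]
    return {"value": best, "keys": keys}
```

The point of the sketch is that the second step only sees numbers from the first step's buckets. It cannot use the bucket keys to drive a new query.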
Let me give an example of what I thought you could do with pipeline aggregations:
A sysadmin wants to know which IP addresses have accessed port x, port y and port z.
If you want to achieve this now, you could run an aggregation on the field IP with port:x as the query string,
take the resulting IP addresses and put them into a filter for the next aggregation, where port:y is set as the query string, then do the same thing again with the results of the second aggregation and put them into the last aggregation with port:z as the query string.
This method works, but it requires scripting on the client side.
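A minimal sketch of that client-side chaining could look like the following. It is not tied to any particular client library: `search` is a hypothetical callable that takes a request body and returns the parsed JSON response. The field names `ip` and `port` are assumptions, a `term` filter stands in for the `port:x` query string, and `make_fake_search` is only an in-memory stand-in so the sketch can be run without a cluster.

```python
def build_body(port, ips=None, size=10000):
    """Build one search request: filter by port (and optionally by the IPs
    found in the previous round), then aggregate on the IP field."""
    filters = [{"term": {"port": port}}]
    if ips is not None:
        filters.append({"terms": {"ip": sorted(ips)}})
    return {
        "size": 0,
        "query": {"bool": {"filter": filters}},
        "aggs": {"ips": {"terms": {"field": "ip", "size": size}}},
    }


def ips_hitting_all_ports(search, ports):
    """Chain one search per port, feeding each result into the next filter."""
    ips = None
    for port in ports:
        if ips is not None and not ips:
            break  # nothing left to narrow down
        response = search(build_body(port, ips))
        buckets = response["aggregations"]["ips"]["buckets"]
        ips = {bucket["key"] for bucket in buckets}
    return ips or set()


def make_fake_search(docs):
    """In-memory stand-in for a real search call, for demonstration only."""
    def search(body):
        filters = body["query"]["bool"]["filter"]
        port = filters[0]["term"]["port"]
        allowed = filters[1]["terms"]["ip"] if len(filters) > 1 else None
        counts = {}
        for doc in docs:
            if doc["port"] != port:
                continue
            if allowed is not None and doc["ip"] not in allowed:
                continue
            counts[doc["ip"]] = counts.get(doc["ip"], 0) + 1
        buckets = [{"key": ip, "doc_count": n} for ip, n in counts.items()]
        return {"aggregations": {"ips": {"buckets": buckets}}}
    return search
```

This needs one round trip per port, and a `terms` aggregation only returns up to `size` buckets, so with many distinct IPs the intermediate lists may be truncated. That is exactly the kind of bookkeeping I had hoped a pipeline could take over.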
I know that ES was made for various types of use cases, and I do understand that the current concept of pipeline aggregations is very suitable for many of them.
I can only speak from the perspective of an "ES logging" user, so what's your opinion about this?