Any reasoning based on last-known state of entities is tricky in a distributed system where related records are scattered across nodes. For this sort of analysis you typically need to bring related data physically closer together in a special index.
This can be done in your client app by updating a dedicated entity-state document keyed on the entity ID, or by using the transform API to periodically fuse multiple related records into a single entity-centric document. This fusion is simple for some operations (e.g. counting the number of URLs a user has accessed with a cardinality aggregation) but requires more complex scripting for others, such as recording only the data from the last known state.
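As an illustration, a pivot transform of that second kind might look roughly like this. This is only a sketch: the index and field names (`events`, `user_id`, `url`, `@timestamp`) are placeholders I've invented, not taken from your mappings.

```
PUT _transform/user-state
{
  "source": { "index": "events" },
  "dest":   { "index": "user-state" },
  "pivot": {
    "group_by": {
      "user_id": { "terms": { "field": "user_id" } }
    },
    "aggregations": {
      "url_count": { "cardinality": { "field": "url" } },
      "last_seen": { "max": { "field": "@timestamp" } }
    }
  }
}
```

Each run folds the raw event records down to one entity-centric document per `user_id` in the destination index.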
The code I linked to is used to fuse data in a transform job, not a query.
You should read up on the transform API.
Fundamentally you need to fuse data into a new index which only holds the latest information for each user. You can then query that index to determine the breakdown of vote types.
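For example, assuming the fused index is called `user-state` and each document records the user's last vote in a keyword field named `last_vote` (both names are placeholders), the breakdown is then a plain terms aggregation:

```
GET user-state/_search
{
  "size": 0,
  "aggs": {
    "vote_breakdown": {
      "terms": { "field": "last_vote" }
    }
  }
}
```

Because each user contributes exactly one document to that index, the bucket counts are counts of users per vote type.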
The good news is that your data set sounds small enough that you may be able to get away with doing it in a single search request, without needing the transform API. When the number of entities runs into the millions, that single-request approach requires too much memory.
The bad news is it's essentially a programming activity on your part. The `scripted_metric` aggregation gives you a framework in which custom scripts collect search result data on each data node and fuse the partial results at the coordinating node for a final answer. It involves using temporary maps and the like to organise results into buckets.
@Mark_Harwood thank you for the solution. I am new to scripted metrics and am checking the documentation now. It would be helpful if you could provide a sample script for this user-based latest-state aggregation using scripted metrics to get each vote count.
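Here is a minimal sketch of the idea, not a drop-in solution. It assumes each document has a keyword `user_id`, a date `@timestamp` and a keyword `vote` field (all names are my assumptions); the map script keeps only the newest event per user on each shard, and the reduce script merges the per-shard maps and tallies the winning votes.

```
GET votes/_search
{
  "size": 0,
  "aggs": {
    "latest_vote_counts": {
      "scripted_metric": {
        "init_script": "state.latest = [:]",
        "map_script": """
          // Keep the most recent event per user seen on this shard.
          String user = doc['user_id'].value;
          long ts = doc['@timestamp'].value.toInstant().toEpochMilli();
          def prev = state.latest.get(user);
          if (prev == null || ts > prev.ts) {
            state.latest.put(user, ['ts': ts, 'vote': doc['vote'].value]);
          }
        """,
        "combine_script": "return state.latest",
        "reduce_script": """
          // Merge the per-shard maps, again keeping the newest entry per user.
          Map latest = [:];
          for (shardMap in states) {
            for (entry in shardMap.entrySet()) {
              def prev = latest.get(entry.getKey());
              if (prev == null || entry.getValue().ts > prev.ts) {
                latest.put(entry.getKey(), entry.getValue());
              }
            }
          }
          // Count users by their last-known vote.
          Map counts = [:];
          for (entry in latest.entrySet()) {
            String vote = entry.getValue().vote;
            counts.put(vote, counts.getOrDefault(vote, 0) + 1);
          }
          return counts;
        """
      }
    }
  }
}
```

Be aware this holds one map entry per user on the coordinating node, which is why this approach only works while user counts stay modest, as Mark noted above.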