I have documents with fields account_id(primary key for my relational db) and status. Whenever there is a change in the status of the account, I pushed the event to the Elasticsearch with the account_id and status.
So over a period of time there will be multiple documents for an account with different statuses.
My requirement is to get the recent document for each account_id and do a count in terms of the status.
SQL equivalent "select status, count(*) from account group by status ;"
Example: consider following logs
account_id: 1, status:delete, @timestamp: November 4th 2017, 18:42:09.445
account_id: 1, status:save, @timestamp: November 3rd 2017, 18:42:09.445
account_id: 1, status:draft, @timestamp: November 2nd 2017, 19:42:09.445
account_id: 2, status:draft, @timestamp: November 3rd 2017, 19:42:09.445
account_id: 3, status:save, @timestamp: November 4th 2017, 18:42:09.445
account_id: 3, status:draft, @timestamp: November 3rd 2017, 19:42:09.445
Hmm, after playing around with this, I'm not sure it's possible to do what you're wanting. I can't even really think of a way to get an Elasticsearch query to get the results you are wanting here without possibly using some sort of script. I think it boils down to not being able to do sub aggregations on a top hits aggregation.
Thanks for the response!! I wish that feature was in Kibana, it would have added a lot of value into our project.
For now as a workaround, I updated my logstash.conf so as to push the same log message to two indexes. In first index, I always create a new record and in the second index, I create/update the record.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.