Getting the documents counted under an aggregate


(Jonathan Dick) #1

We are presently in the process of evaluating Elastic + Kibana to support our care program. Under our current model, we have created a reporting framework on top of MySQL which makes it simple to obtain the list of results for a given report indicator. For example, we have a report telling us the number of patients active in care. A user can click on that number to get the list of patients meeting the criteria. This is useful for validating a query as well as informing a user about the individuals counted within a given report indicator. It adds transparency to the process, allowing our users to feel confident that the reports we generate are true.

I have not had much luck finding a similar feature within Kibana + Elastic. Although Kibana seems to make it easy to create dashboards, it's not obvious to me how to verify the numbers being generated by producing the list of documents meeting the given criteria. The best feature I've seen is the top_hits aggregation, but I don't think it is intended to return a comprehensive list.

I am hoping, though, that such a feature exists and that someone in the community can point me in the right direction.

Thanks!


(Joe Fleming) #2

to obtain the list of results for a given report indicator

it's not obvious to me how to verify the numbers being generated by producing the list of documents meeting the given criteria

This is mostly due to the way Elasticsearch handles aggregations, which differs from how a SQL-backed report works. Aggregations allow you to deal with a very large set of information very quickly, and pull out important information about that data, but at the expense of losing references to the underlying data. So it's very fast to get things like counts and averages broken up by certain criteria across very large data sets, but only if you're fetching the aggregate information.

Elasticsearch does allow you to query for specific documents that meet given criteria as well. While you can create new columns on the fly using scripted fields, you can't "roll up" the data that way; you have to drop into aggregations, and then you lose access to the underlying data.

In your case, it sounds like you may not actually need the aggregation functionality at all; you can get by with queries and filters on the documents directly.
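To make the distinction concrete, here is a rough sketch in Python of the two kinds of request body, built as plain dictionaries. The field names (`care_status`, `clinic`, `patient_id`) are made up for illustration and would need to match your own mapping. The same filter drives both requests; only the aggregate version drops the hits.

```python
import json

# Hypothetical criteria for "patients active in care"; the field name is an assumption.
ACTIVE_IN_CARE = {"term": {"care_status": "active"}}


def count_request(criteria):
    """Aggregate-only request: counts per clinic, no documents returned."""
    return {
        "size": 0,  # suppress hits, return only aggregation results
        "query": {"bool": {"filter": [criteria]}},
        "aggs": {"by_clinic": {"terms": {"field": "clinic"}}},
    }


def list_request(criteria, page_size=100):
    """Document-level request: returns the matching documents themselves."""
    return {
        "size": page_size,
        "query": {"bool": {"filter": [criteria]}},
        "_source": ["patient_id", "clinic"],  # hypothetical fields to display
    }


if __name__ == "__main__":
    print(json.dumps(count_request(ACTIVE_IN_CARE), indent=2))
    print(json.dumps(list_request(ACTIVE_IN_CARE), indent=2))
```

Either body could be sent to the `_search` endpoint of an index. Because both share one query, the total hit count of the second should agree with the sum of the counts in the first, as long as every document has a `clinic` value and the terms aggregation returns all buckets.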

In terms of Kibana, Discover is where all the query and filter operations happen. You can generate a list of matching documents, and it'll show you an overall count of those documents, but that's it. Visualize is where aggregations happen, and that's how more complex visualizations are created. You can generate visualizations from Saved Searches you created in Discover, and in your case, that's probably what you want to do. An example would be to filter down the documents you want to see in Discover, save it as a Saved Search, and then create a visualization using that Saved Search. This will allow you to see the count of records, either aggregated over time, using a date histogram and a bar chart, or as the current aggregated value, using a metric vis and setting the time picker to something like the last day.

Hopefully that makes sense.


(Jonathan Dick) #3

@Joe_Fleming thank you for the thoughtful reply. I appreciate the Elastic approach to aggregate data. I am surprised, though, that there isn't more interest in being able to verify the data at the document level. The request JSON in Kibana essentially has all the information necessary to repurpose that request into a document-level request. By combining the query with the "scan" feature of Elasticsearch, it seems it would be relatively straightforward to produce the document-level data.
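As a rough illustration of what I mean, here is a minimal Python sketch that takes an aggregation request body and one bucket key, and builds a document-level request for that bucket. The function name and the example fields are hypothetical, and it only handles a plain top-level terms aggregation. As far as I understand, later Elasticsearch versions use the scroll API with a `_doc` sort in place of the "scan" search type, so the sketch sorts by `_doc`.

```python
import copy


def bucket_to_document_request(agg_request, agg_name, bucket_key, page_size=500):
    """Build a search body for the documents counted in one terms bucket.

    Hypothetical helper: supports only a top-level terms aggregation.
    """
    field = agg_request["aggs"][agg_name]["terms"]["field"]
    # Reuse the original query so the same criteria apply to the document list.
    original_query = copy.deepcopy(agg_request.get("query", {"match_all": {}}))
    return {
        "size": page_size,
        "sort": ["_doc"],  # index order, suited to paging through all matches
        "query": {
            "bool": {
                "must": [original_query],
                # Narrow to the bucket that was clicked on.
                "filter": [{"term": {field: bucket_key}}],
            }
        },
    }


# Example aggregation request with made-up field names.
request = {
    "size": 0,
    "query": {"term": {"care_status": "active"}},
    "aggs": {"by_clinic": {"terms": {"field": "clinic"}}},
}
doc_request = bucket_to_document_request(request, "by_clinic", "Clinic A")
```

The resulting body could be sent as a scrolled search to page through every document behind the bucket, which is the list I would want a click on the aggregate to produce.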

I think (though it seems I may be in the minority) it would be very useful to add a feature to Kibana which would allow you to click on an aggregate and produce the document list.

I would be interested to hear others' thoughts on this.

