Use case - Get last document over multi document types at time X for all business IDs


I am seriously investigating ES for a use case we have, but I can't tell whether I am failing to make it do what I want or whether it is simply not designed for this. Any advice appreciated.

Say I am processing business objects (millions of them), each of which is in a particular state in one of several different systems.
Every time an object changes state, I receive a JSON document from that system describing the new state.
Each system has a different JSON mapping, but all documents contain the object ID. An object can only be in one state and one system at a time, and it follows a flow: system 1 -> n.

Is ES able to answer these two questions:

  1. For all object IDs, can I get the last state (or "last document ingested") reported by any of the systems as of time T? I.e. can I get the list of the latest document per business object ID at a given time T?
  2. Is it possible to aggregate on a state contained in those documents and build a dashboard in Kibana based on this?

I tried to achieve this on a local instance, but I can't get my head around the search features for this use case.

Any idea :slight_smile: ?

No one can help :slight_smile: ?

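This should be possible with aggregations. Use a Terms aggregation on the object ID, which will give you one bucket per ID. Then inside each bucket, use a TopHits aggregation sorted on timestamp (descending, size 1) to retrieve the most recent document for that ID. To answer the question "as of time T", restrict the search with a range filter on the timestamp so only documents at or before T are considered.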
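As a rough illustration, the request body for that approach could be built like this. This is only a sketch: the field names `object_id` and `@timestamp` and the helper `latest_per_object_query` are assumptions, not something from your mappings, and the ID field would need to be a `keyword` (or otherwise aggregatable) field.

```python
import json


def latest_per_object_query(as_of, id_field="object_id",
                            ts_field="@timestamp", max_ids=1000):
    """Build a search body: latest document per object ID as of `as_of`.

    `id_field` and `ts_field` are hypothetical field names; adjust them
    to whatever your documents actually contain.
    """
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "query": {
            "bool": {
                # keep only documents ingested at or before time T
                "filter": [{"range": {ts_field: {"lte": as_of}}}]
            }
        },
        "aggs": {
            "by_object": {
                # one bucket per object ID (top `max_ids` buckets only)
                "terms": {"field": id_field, "size": max_ids},
                "aggs": {
                    "latest": {
                        # newest document inside each bucket
                        "top_hits": {
                            "size": 1,
                            "sort": [{ts_field: {"order": "desc"}}],
                        }
                    }
                },
            }
        },
    }


body = latest_per_object_query("2018-06-01T00:00:00Z")
body_json = json.dumps(body, indent=2)  # ready to send to the _search endpoint
```

The resulting JSON can be posted to `_search` on an index pattern covering all the systems' indices. Note that `max_ids` caps how many IDs come back, which is the limitation discussed next.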

The main issue you might run into is that the Terms agg is designed for "top-n" situations, so if you want to retrieve all IDs and have a sizable amount, you may run into performance issues. The newer Composite aggregation might work better for you, since it is designed to "paginate" across the aggregation buckets in a more memory-friendly manner.
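A paginated version with the Composite aggregation might look roughly like the sketch below. The field names and both helper functions are made up for illustration, and `search` stands for any callable you supply that sends the body to `_search` and returns the parsed JSON response (for example a thin wrapper around your client of choice).

```python
def composite_page_query(as_of, after_key=None, page_size=1000,
                         id_field="object_id", ts_field="@timestamp"):
    """Build one page of a composite aggregation over object IDs.

    Field names are hypothetical; `after_key` is the value returned
    by the previous page (None for the first page).
    """
    composite = {
        "size": page_size,
        "sources": [{"object_id": {"terms": {"field": id_field}}}],
    }
    if after_key is not None:
        composite["after"] = after_key  # resume after the last bucket seen
    return {
        "size": 0,
        "query": {"bool": {"filter": [{"range": {ts_field: {"lte": as_of}}}]}},
        "aggs": {
            "by_object": {
                "composite": composite,
                "aggs": {
                    "latest": {
                        "top_hits": {
                            "size": 1,
                            "sort": [{ts_field: {"order": "desc"}}],
                        }
                    }
                },
            }
        },
    }


def iter_latest_docs(search, as_of, page_size=1000):
    """Yield (object_id, latest_source) pairs, one page at a time.

    `search` is a caller-supplied function: request body in, response dict out.
    """
    after_key = None
    while True:
        response = search(composite_page_query(as_of, after_key, page_size))
        agg = response["aggregations"]["by_object"]
        for bucket in agg["buckets"]:
            hits = bucket["latest"]["hits"]["hits"]
            if hits:
                yield bucket["key"]["object_id"], hits[0]["_source"]
        after_key = agg.get("after_key")
        # stop when the page is empty or no continuation key is returned
        if not agg["buckets"] or after_key is None:
            break
```

Because each page only holds `page_size` buckets, memory use stays bounded even with millions of IDs, at the cost of more round trips.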

No idea about Kibana I'm afraid. It's probably doable, but I don't have a ton of experience building dashboards.
