Dated graph modeling: how to get the most recent edges

Hi,

I'm trying to store a graph in Elasticsearch. The use case is pretty simple: nodes are static, but edges can be created or deleted, and their fields can be modified over time. I want to be able to build the graph as it was at a specific timestamp.

For example, I have:

  • three nodes "A", "B" and "C", each with a single attribute, name,
  • the following edge actions, each edge carrying a weight attribute:
    • timestamp=1
      • creation of A ---2.1---> B
      • creation of B ---4.8---> A
      • creation of B ---0.2---> C
    • timestamp=2
      • deletion of A ----------> B
      • update of B ---0.3---> C

The following node and edgeEvt documents are stored in ES:

{ "type" : "node", "name" : "A" }
{ "type" : "node", "name" : "B" }
{ "type" : "node", "name" : "C" }

{ "timestamp" : 1, "type" : "edgeEvt", "src" : "A", "dest" : "B", "event" : "created", "weight" : 2.1 }
{ "timestamp" : 1, "type" : "edgeEvt", "src" : "B", "dest" : "A", "event" : "created", "weight" : 4.8 }
{ "timestamp" : 1, "type" : "edgeEvt", "src" : "B", "dest" : "C", "event" : "created", "weight" : 0.2 }
{ "timestamp" : 2, "type" : "edgeEvt", "src" : "A", "dest" : "B", "event" : "deleted" }
{ "timestamp" : 2, "type" : "edgeEvt", "src" : "B", "dest" : "C", "event" : "updated", "weight" : 0.3 }

From this data, I would like to build the graph at t=2. To do that, I need to query Elasticsearch as follows: give me all edgeEvt documents with timestamp<=2, and for each unique "src"/"dest" pair, give me only the most recent one.

Result of my query should be:

[
  { "timestamp" : 2, "type" : "edgeEvt", "src" : "A", "dest" : "B", "event" : "deleted" },
  { "timestamp" : 1, "type" : "edgeEvt", "src" : "B", "dest" : "A", "event" : "created", "weight" : 4.8 },
  { "timestamp" : 2, "type" : "edgeEvt", "src" : "B", "dest" : "C", "event" : "updated", "weight" : 0.3 }
]
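
To make the expected semantics concrete, here is a minimal plain-Python sketch of the selection I'm after, with no Elasticsearch involved. The function names latest_edge_events and graph_at are made up for illustration, and graph_at assumes that an edge whose latest event is "deleted" is simply absent from the graph.

```python
# The five edgeEvt documents from the example above.
events = [
    {"timestamp": 1, "type": "edgeEvt", "src": "A", "dest": "B", "event": "created", "weight": 2.1},
    {"timestamp": 1, "type": "edgeEvt", "src": "B", "dest": "A", "event": "created", "weight": 4.8},
    {"timestamp": 1, "type": "edgeEvt", "src": "B", "dest": "C", "event": "created", "weight": 0.2},
    {"timestamp": 2, "type": "edgeEvt", "src": "A", "dest": "B", "event": "deleted"},
    {"timestamp": 2, "type": "edgeEvt", "src": "B", "dest": "C", "event": "updated", "weight": 0.3},
]


def latest_edge_events(events, at):
    """Most recent event per (src, dest) pair among events with timestamp <= at."""
    latest = {}
    for evt in events:
        if evt["timestamp"] > at:
            continue  # event is in the future relative to `at`
        key = (evt["src"], evt["dest"])
        # On equal timestamps, the event that appears later in the list wins.
        if key not in latest or evt["timestamp"] >= latest[key]["timestamp"]:
            latest[key] = evt
    # Sort by (src, dest) so the output order is deterministic.
    return [latest[key] for key in sorted(latest)]


def graph_at(events, at):
    """Edges alive at `at` as {(src, dest): weight}; deleted edges are dropped (assumption)."""
    return {
        (evt["src"], evt["dest"]): evt["weight"]
        for evt in latest_edge_events(events, at)
        if evt["event"] != "deleted"
    }
```

Calling latest_edge_events(events, 2) on the sample data yields the three documents listed above, in the same order.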

Any idea?

Thx

Off the top of my head: filter on your timestamp criterion, then run a terms agg on "src", with a terms sub-agg on "dest", with a top_hits sub-agg sorted descending on timestamp. (You can specify size: 1 on the top_hits agg if you really only want the latest event; either way, the latest is the first hit. With a larger size you can see the history of each src/dest pair, each in its own top_hits bucket.)
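
A rough sketch of that request body, written as a Python dict so it can be passed to whatever client you use. This is untested against your index and rests on a few assumptions: "src" and "dest" are mapped as keyword fields (a terms agg won't work on analyzed text without fielddata), the aggregation names by_src, by_dest and latest are arbitrary, and max_buckets is a made-up parameter. It is there because a terms agg returns only 10 buckets by default, so the size has to be raised to cover all your nodes.

```python
import json


def latest_edges_query(max_timestamp, max_buckets=1000):
    """Build a search body returning the latest edgeEvt per src/dest pair."""
    return {
        "size": 0,  # only the aggregation results matter, not the raw hits
        "query": {
            "bool": {
                # Keep only events at or before the requested timestamp.
                "filter": [{"range": {"timestamp": {"lte": max_timestamp}}}]
            }
        },
        "aggs": {
            "by_src": {
                "terms": {"field": "src", "size": max_buckets},
                "aggs": {
                    "by_dest": {
                        "terms": {"field": "dest", "size": max_buckets},
                        "aggs": {
                            "latest": {
                                # Newest event first; size 1 keeps only that one.
                                "top_hits": {
                                    "size": 1,
                                    "sort": [{"timestamp": {"order": "desc"}}],
                                }
                            }
                        },
                    }
                },
            }
        },
    }


# Print the JSON body for a graph at t=2.
print(json.dumps(latest_edges_query(2), indent=2))
```

Each by_src bucket then holds by_dest buckets, and the single hit under latest in each of those is the most recent event for that pair. Pairs whose latest event is "deleted" still come back, so you'd drop them on the client side when building the graph.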

It's working :smile:

Amazing! Thanks!!