Label on Edges?

No worries. Our approach is generally a different one - a "bottom-up" rather than "top-down" means of associating data. Think wisdom-of-crowds emergent structures vs. curated content.

Modelling your data in a traditional graph database is often an act of censorship. Concrete edges are only created for the relationships in the source data that are assumed ahead of time to be useful. Many of the relationships that exist in the source document are not modelled (for instance, in theory every single word in an email could be connected to its author). A search index, by contrast, automatically maintains the associations between all of the values in all of the fields contained in the same document. It also knows every other document that contains any of these terms and maintains the frequency of every value - every IP address, word, number etc. Using this fully connected set of values we can cherry-pick connections at query time using statistical approaches, building a graph on-the-fly of only the meaningfully associated values. The documents act as the glue that strengthens connections between terms.

These are a form of "emergent graph". One example comes from the Enron emails: a search for project "Jedi" produced a connection to a 7-digit number that appeared in the text of several emails. It was the bank account number used for this off-balance-sheet company and, of course, much more significant than the everyday English terms used in these emails, which we did not return. This is an example of "bottom-up" connections.
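To make the idea concrete, here is a rough Python sketch of query-time selection. It is not the scoring the Graph API actually uses. It is a toy "lift" heuristic that compares how often a term appears alongside the seed term with how often it appears in the index overall. The function names, the sample documents and the number `1234567` are all made up for illustration.

```python
from collections import defaultdict

def build_index(docs):
    """Map each term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, terms in enumerate(docs):
        for term in set(terms):
            index[term].add(doc_id)
    return index

def significant_neighbours(index, total_docs, seed, min_docs=2):
    """Rank terms by how much more common they are in documents matching `seed`
    than in the index as a whole (a toy lift score, assumed for illustration)."""
    foreground = index.get(seed, set())
    scored = []
    for term, postings in index.items():
        if term == seed:
            continue
        overlap = len(postings & foreground)
        if overlap < min_docs:
            continue  # too little evidence to trust the association
        fg_rate = overlap / len(foreground)   # share of seed documents with the term
        bg_rate = len(postings) / total_docs  # share of all documents with the term
        scored.append((term, fg_rate / bg_rate))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Made-up documents, each already split into terms.
docs = [
    ["jedi", "the", "account", "1234567"],
    ["jedi", "the", "meeting", "1234567"],
    ["jedi", "the", "1234567", "lunch"],
    ["the", "lunch", "meeting"],
    ["the", "account", "report"],
    ["the", "meeting", "report"],
]
ranked = significant_neighbours(build_index(docs), len(docs), "jedi")
```

In this toy data the number scores higher than "the", even though both appear in every "jedi" document, because "the" is just as common everywhere else. A real implementation would use a more robust significance measure and sample the matching documents rather than scan every term.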

Of course, you can also use the Graph API to follow simple "edge" connections recorded as "A knows B" style documents, but this is a subset of the features we employ for summarising richer documents. Right now we don't have special UI logic for the special case where one edge = one document, but we may look at adding this in future.
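For the simpler one-edge-per-document case, a minimal sketch of what "following" the connections amounts to might look like the following. The field names `person` and `knows` and the helper `neighbours_by_hop` are hypothetical and not part of any API. The code only shows that each document contributes exactly one link, so hopping is a plain lookup with no statistics involved.

```python
from collections import defaultdict

# Hypothetical edge documents: one document per "A knows B" relationship.
edge_docs = [
    {"person": "alice", "knows": "bob"},
    {"person": "bob", "knows": "carol"},
    {"person": "carol", "knows": "dave"},
]

def neighbours_by_hop(docs, start, hops):
    """Follow 'knows' links outward from `start`; return the new people found at each hop."""
    outgoing = defaultdict(set)
    for doc in docs:
        outgoing[doc["person"]].add(doc["knows"])
    seen = {start}
    frontier = {start}
    layers = []
    for _ in range(hops):
        # Everyone reachable from the current frontier that we have not visited yet.
        frontier = {n for p in frontier for n in outgoing.get(p, ())} - seen
        if not frontier:
            break
        seen |= frontier
        layers.append(sorted(frontier))
    return layers

two_hops = neighbours_by_hop(edge_docs, "alice", 2)
```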