How to Use Graph

(Omar ) #1

Hello,

We are using the Graph extension in Kibana to visualize relationships between people. We have the following table in Elasticsearch, with Person1 as the index:

Person1 Person2 Score
A B 2
A C 3
C A 1

If we visualize the data in Kibana Graph, we get a node A with two relationships, to B and to C, and another node C with a relationship to A. (We basically end up with two graphs, because C is treated as a new node.)

How can we have only one graph with all the relationships?

Thank you


(Mark Harwood) #2

The issue is that the term Person1:C is seen as a different term from Person2:C.
Graph treats two terms as the same node only when both the field name and the value are the same.

A solution is to create a single multi-value field for each doc e.g.

"person" : ["A", "C"]

If you use the copy_to feature in your index mapping, this field can also be created as part of indexing, without changing your source JSON.
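To make the point about terms concrete, here is a minimal Python sketch (not Elasticsearch code, just an illustration) that lists the distinct field:value terms for the three rows in the question, first with separate Person1/Person2 fields and then with a single combined field. The function names are made up for this example.

```python
# Rows from the question: (Person1, Person2, Score)
rows = [("A", "B", 2), ("A", "C", 3), ("C", "A", 1)]


def terms_separate(rows):
    """Distinct (field, value) terms when each column is its own field."""
    terms = set()
    for person1, person2, _score in rows:
        terms.add(("Person1", person1))
        terms.add(("Person2", person2))
    return terms


def terms_combined(rows):
    """Distinct (field, value) terms when both columns feed one 'person' field."""
    terms = set()
    for person1, person2, _score in rows:
        for value in (person1, person2):
            terms.add(("person", value))
    return terms


separate = terms_separate(rows)
combined = terms_combined(rows)

# With separate fields, A and C each show up under two different field names.
print(sorted(separate))
# With one combined field, there is exactly one term per person.
print(sorted(combined))
```

With separate fields there are five distinct terms (A and C each appear twice, once per field), while the combined field yields three, one per person, which is what lets Graph draw a single connected graph.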


(Omar ) #3

Thank you Mark for the answer.

So I will have:
"person" : ["A", "B"]
"person" : ["A", "C"]
"person" : ["C", "A"]

Right?

How can I do that as part of indexing? We are importing the data into Elasticsearch from Hive.

Thanks


(Mark Harwood) #4

It can be part of the mapping (think of it as a schema) for the index.
In this example [1] the source documents only contain first_name and last_name values, but they are automatically copied into an index field called full_name. You can do the same trick, copying your Person1 and Person2 fields into a person field.
Your Hive code doesn't have to change, only the mapping used for the index you load the data into.

[1] https://www.elastic.co/guide/en/elasticsearch/reference/2.4/copy-to.html
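As a rough sketch of what such a mapping could look like for this data, the Python snippet below builds the JSON body you would send when creating the index. The document type name "relation" is hypothetical, and the field types assume the Elasticsearch 2.4 conventions of the linked page (not_analyzed string fields); newer versions use the keyword type and have no type level in the mapping, so adapt accordingly.

```python
import json


def build_mapping(doc_type="relation"):
    """Build an index-creation body where Person1 and Person2 are copied into 'person'.

    Assumptions: ES 2.x-style mapping; 'relation' is a hypothetical type name;
    Score is assumed to be an integer.
    """
    # not_analyzed keeps each name as a single exact term for Graph to use.
    person_field = {"type": "string", "index": "not_analyzed"}
    source_field = dict(person_field, copy_to="person")
    return {
        "mappings": {
            doc_type: {
                "properties": {
                    "Person1": dict(source_field),
                    "Person2": dict(source_field),
                    "Score": {"type": "integer"},
                    # Target of copy_to; this is the field to pick in Graph.
                    "person": dict(person_field),
                }
            }
        }
    }


mapping = build_mapping()
print(json.dumps(mapping, indent=2))
```

The printed JSON would be used as the body when creating the index, before the Hive load runs. The copied values only exist in the index, not in _source, which matches the first_name/last_name example in the documentation.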


(Omar ) #5

Perfect !! Working great now

Thanks for the tip :smile:


(Omar ) #6

Another question.

In the graph, I want the score to be shown on the edge between the nodes. How can I do that?

For example, I want to have three nodes A, B and C, with the score as the value of the links between them.

Thank you


(Mark Harwood) #7

In Elastic Graph, a single edge represents a collection of documents that contain a pair of terms. An edge could represent a billion documents or, as in your case, just one.

When you want to view the properties of an edge, you can use the search API to retrieve the individual documents, or use the "aggregations" feature to summarise properties of many docs, e.g. show a line graph of the volume of transactions between 2 accounts over time. The next version of the Graph UI [1] provides "drill-down" features which make these calls for you and provide visualizations.
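For the table in this thread, a search body for one edge might look like the sketch below. It assumes the combined, not-analyzed person field discussed earlier and a numeric Score field; the helper names and the aggregation name total_score are made up for illustration. The small local function only mimics the query on the three example rows so the result can be checked without a cluster.

```python
import json

# Rows from the question: (Person1, Person2, Score)
rows = [("A", "B", 2), ("A", "C", 3), ("C", "A", 1)]


def edge_search_body(a, b, field="person", score_field="Score"):
    """Search body: docs containing both terms, plus a sum of their scores."""
    return {
        "query": {
            "bool": {
                # Both term filters must match, i.e. the doc mentions a AND b.
                "filter": [{"term": {field: a}}, {"term": {field: b}}]
            }
        },
        "aggs": {"total_score": {"sum": {"field": score_field}}},
    }


def local_edge_score(rows, a, b):
    """Local stand-in for the query above: sum Score over rows mentioning both a and b."""
    return sum(score for p1, p2, score in rows if {a, b} <= {p1, p2})


body = edge_search_body("A", "C")
print(json.dumps(body, indent=2))
print(local_edge_score(rows, "A", "C"))  # rows (A, C, 3) and (C, A, 1) -> 4
```

Note that the A-C edge covers two documents here, one in each direction, so a sum is only one possible summary; a max, an average, or the raw documents may suit your scores better.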

I'm not sure what data/approach you use to compute your scores in advance, but it is worth noting that Elastic Graph can derive scores on-the-fly from raw data, e.g. looking at StackOverflow posts it can summarise who is strongly connected with whom based on the volume of exchanges on StackOverflow over time. We use this to prioritise which connections to explore first.

[1] https://twitter.com/elasticmark/status/771808866896080897

