X-Pack Graph Significance Level


Simple question here. As I understand it, X-Pack Graph is a form of network analysis. Is there any documentation that details the underlying calculations used to generate the data that Kibana then visualizes?

In particular, I am interested in the use_significance parameter. Can it be configured? Is it an integer? I understand that when I set it to 'false' more connections appear, as expected, but what methods are behind this calculation? Any insight would be helpful. Thank you in advance.


Yes, "uncommonly common" is the phrase we often use to describe the sorts of things we find significant: those that are common in your result set but comparatively rare in the background data. TF-IDF ranking is a form of this at the document level (words common in the document but uncommon generally).
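To make "uncommonly common" concrete, here is a minimal sketch of one way to score it. This is an illustration only, not a description of what Graph computes internally: the function name, the sample terms and the counts are all made up, and the formula (absolute lift multiplied by relative lift) is simply the general shape of the JLH heuristic documented for the significant_terms aggregation.

```python
def significance_score(fg_count, fg_total, bg_count, bg_total):
    """Score how 'uncommonly common' a term is (illustrative heuristic).

    fg_count / fg_total: docs containing the term in the result set
    bg_count / bg_total: docs containing the term in the background data
    """
    if fg_total == 0 or bg_total == 0 or bg_count == 0:
        return 0.0
    fg_rate = fg_count / fg_total
    bg_rate = bg_count / bg_total
    if fg_rate <= bg_rate:
        # No more common in the results than in the background.
        return 0.0
    # Absolute lift times relative lift.
    return (fg_rate - bg_rate) * (fg_rate / bg_rate)


# Hypothetical counts: a result set of 100 docs from a 1,000,000-doc index.
terms = {
    "the": (95, 100, 900_000, 1_000_000),  # common everywhere
    "h5n1": (40, 100, 500, 1_000_000),     # common here, rare in general
    "zyxwv": (1, 100, 1, 1_000_000),       # rare everywhere, appears once
}

# Most significant term first.
ranked = sorted(terms, key=lambda t: significance_score(*terms[t]), reverse=True)
print(ranked)
```

With these invented numbers "h5n1" ranks first and "the" ranks last, even though "the" appears in far more of the matching documents. The one-off term "zyxwv" still scores fairly well, which hints at why sample size and minimum document counts matter.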

I have a visualisation here that shows the significant terms "popping", and how sampling is an important factor too: https://www.youtube.com/watch?v=azP15yvbOBA



Thank you for the insight. So it seems that the relevance ranking algorithm is based on term frequency/inverse document frequency and the vector space model. So, if I understand this correctly, when I have "Significant Links" checked, the vertices, or connections, that are plotted in the X-Pack Graph are a visual representation of a ‘relevance score’ that is above a certain threshold deemed significant?

I guess what I am looking for is precise verbiage to describe what is happening on the backend of the graph when significance is checked or unchecked and, secondly but less importantly, whether the backend is configurable in this regard.

Thank you,

Generally it's not so much about setting a threshold as about prioritising the connections shown.
A search engine uses relevance ranking to serve the most interesting documents first.
The Graph UI uses a similar approach to surface the most interesting connections first.
The UI has the option to return a user-configurable number of field values. If you select a node and hit the + button you'll get the first N significantly connected other nodes/values. Hitting the "+" button again while keeping the original selection will add the next most relevant N. Like paging through docs in search results.
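The "first N, then the next N" behaviour can be sketched as below. This is a simplified model rather than the UI's actual code: all three function names and the sample data are hypothetical, and treating the unchecked "Significant links" case as a ranking by plain document count is an assumption on my part. The request body mirrors my understanding of the Graph explore API, where use_significance is a boolean (not an integer) under `controls`, and `size` and `exclude` are vertex settings; please check those field names against the documentation for your version.

```python
def rank_connections(candidates, use_significance=True):
    """Order candidate terms, most interesting first.

    candidates: list of (term, significance_score, doc_count) tuples.
    Assumption: with significance off, terms are ranked by popularity.
    """
    index = 1 if use_significance else 2
    return sorted(candidates, key=lambda c: c[index], reverse=True)


def next_page(ranked, shown, page_size):
    """Return the next `page_size` terms not yet on screen, and record them."""
    fresh = [term for term, _, _ in ranked if term not in shown]
    page = fresh[:page_size]
    shown.update(page)
    return page


def explore_body(seed_query, field, page_size, shown, use_significance=True):
    """Build a request body in the style of the Graph explore API (sketch)."""
    return {
        "query": seed_query,
        "controls": {"use_significance": use_significance},  # boolean
        "vertices": [
            {"field": field, "size": page_size, "exclude": sorted(shown)}
        ],
    }


# Hypothetical candidates: (term, significance score, doc count).
candidates = [("a", 5.0, 10), ("b", 3.0, 500), ("c", 1.0, 50), ("d", 0.5, 900)]

shown = set()
ranked = rank_connections(candidates)
first = next_page(ranked, shown, page_size=2)   # first press of "+"
second = next_page(ranked, shown, page_size=2)  # second press of "+"
print(first, second)

by_popularity = [t for t, _, _ in rank_connections(candidates, use_significance=False)]
print(by_popularity)
```

Each press of "+" excludes what is already on screen and takes the next best N, which is why it behaves like paging through search results rather than applying a cut-off score.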
