Hi, new user here, so excuse the basic question. How do I set up data to appear in graph format? Is there an example dataset you provide for better understanding?
If you're planning on using the Kibana Graph UI (and Kibana in general) then the typical advice is to index information so that the tokens used to represent the things you want to report on are:
Unambiguous
Readable
So, as an example:
Email addresses, hashtags or domain names work great without any changes
Bank account numbers could ideally do with the customer name appended
Customer names are not reliably unique and could ideally do with customer IDs attached
This is part of the general preparation of content for analysis: computers want unique IDs but people want to read labels. If your indexed strings serve both purposes, you avoid expensive joins at search time that can otherwise limit scalability.
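To make the labelling idea concrete, here is a minimal sketch in Python of preparing a document before indexing. The field names (customer_name, customer_id, account_number, email) and the "label (id)" token format are only assumptions for illustration, not something the Graph UI requires; any format works as long as the resulting string is both unique and readable.

```python
def label_token(label: str, qualifier: str) -> str:
    """Join a value with a qualifier so the token is readable and unambiguous."""
    return f"{label.strip()} ({qualifier.strip()})"


def prepare_doc(raw: dict) -> dict:
    """Build the document to index from a raw record (hypothetical field names)."""
    return {
        # Email addresses are already unique and readable, so keep them as-is.
        "email": raw["email"],
        # Names are not reliably unique, so attach the customer ID.
        "customer": label_token(raw["customer_name"], raw["customer_id"]),
        # Account numbers are unique but unreadable, so append the customer name.
        "account": label_token(raw["account_number"], raw["customer_name"]),
    }


doc = prepare_doc({
    "email": "jane@example.com",
    "customer_name": "Jane Smith",
    "customer_id": "C-1042",
    "account_number": "12345678",
})
print(doc)
```

You would normally index these strings as untokenized (keyword) fields so that each one stays a single term.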
So, the data you reference has a lot of IDs but lacks labels. Since it comes from the same people who provided the OffshoreLeaks and PanamaPapers datasets, I imagine it is in a similar format and needs a similar labelling treatment. The PanamaPapers blog post [1] I wrote contains scripts to load this sort of data and index it appropriately.
This blog post also describes the settings you need to turn on for this "forensics" type of work, as the default settings are tuned more for "wisdom of crowds" scenarios, where edges only appear if enough docs/people assert a relationship strong enough to draw out.
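If you drive Graph through the explore API rather than the UI, the equivalent "forensics" settings might look roughly like the sketch below. The parameter names (use_significance, min_doc_count, shard_min_doc_count) are given as I understand the Graph explore request format, and the field names and query are made up, so please check them against the docs for your version.

```python
import json


def forensic_explore_body(query_text: str, field: str, connected_field: str) -> dict:
    """Build a Graph explore request body that keeps every link, even single-doc ones."""
    # A threshold of 1 means one document is enough evidence to draw a vertex.
    keep_all = {"min_doc_count": 1, "shard_min_doc_count": 1}
    return {
        "query": {"query_string": {"query": query_text}},
        # Turn off the significance filtering used for "wisdom of crowds" work.
        "controls": {"use_significance": False},
        "vertices": [{"field": field, **keep_all}],
        "connections": {"vertices": [{"field": connected_field, **keep_all}]},
    }


# Hypothetical field names and query, for illustration only.
body = forensic_explore_body("Jane Smith", "customer", "account")
print(json.dumps(body, indent=2))
```

The resulting JSON would be sent as the body of a POST to the index's _graph/explore endpoint.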