Hi all, I am new to Elasticsearch and am currently researching and implementing a concept for a small project. I would like to index the DBpedia RDF datasets with Elasticsearch. The RDF datasets will be stored in Apache Fuseki, and I would like to stream them into Elasticsearch for indexing. I found the following options:
https://github.com/elastic/stream2es
The project suggests using Logstash instead, although it already seems to support streaming Wikipedia datasets into Elasticsearch.
Logstash
Regarding Logstash, I am a bit lost, since from my understanding Logstash is meant for streaming logs into Elasticsearch.
Which option should I concentrate my efforts on? Are there any alternatives? It seems that there is no ready-made solution for indexing RDF datasets.
I dunno about Elasticsearch for RDF in general, because it can't do arbitrary joins and RDF is all joins. You can use Elasticsearch for the full-text querying, though.
I indexed the DBpedia link structure in Elasticsearch and explored it using the Graph UI, which can be used to give priority to significant links in the data (significant != popular). There's a video demo here [1], and if it looks like it is of interest I can share with you how this demo was put together.
That is great Mark! This is exactly what I need.
Yes please, Mark, I would like to know how the demo was put together.
The Graph UI is incredible; I tested it using the Shakespeare dataset and the experience was just awesome. For sure it will be awesome using the Graph UI on the DBpedia datasets.
Check this gist [1] for a Python script to load DBpedia data [2] into Elasticsearch 5.3+.
Each Elasticsearch doc is a single Wikipedia article with an array of the other articles it links to.
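To make that document shape concrete, here is a minimal sketch (not the script from the gist) of how link triples could be grouped into one doc per article. It assumes the links arrive as N-Triples lines whose subject and object are both resource IRIs; the index name `dbpedia`, the `subject` field, and the helper names are hypothetical, and only `linked_subjects` comes from this thread. It groups everything in memory, so it is only suitable for small samples.

```python
import re
from collections import defaultdict

# Matches one N-Triples line whose subject, predicate and object are all IRIs.
TRIPLE = re.compile(r'^<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s*\.\s*$')


def subject_name(iri):
    """Return the last path segment of a resource IRI, e.g. 'Hamlet'."""
    return iri.rsplit("/", 1)[-1]


def group_links(lines):
    """Collect, per article, the list of articles it links to."""
    links = defaultdict(list)
    for line in lines:
        match = TRIPLE.match(line.strip())
        if match is None:
            continue  # skip comments, blank lines and literal-valued triples
        source, _predicate, target = match.groups()
        links[subject_name(source)].append(subject_name(target))
    return links


def bulk_actions(links, index="dbpedia"):
    """Yield one action dict per article (index and field names are assumptions)."""
    for subject, targets in links.items():
        yield {
            "_index": index,
            "_id": subject,
            # On 5.x/6.x clusters an explicit "_type" key is also needed.
            "_source": {"subject": subject, "linked_subjects": targets},
        }


SAMPLE = [
    "# a comment line, ignored",
    "<http://dbpedia.org/resource/Hamlet> "
    "<http://dbpedia.org/ontology/wikiPageWikiLink> "
    "<http://dbpedia.org/resource/William_Shakespeare> .",
    "<http://dbpedia.org/resource/Hamlet> "
    "<http://dbpedia.org/ontology/wikiPageWikiLink> "
    "<http://dbpedia.org/resource/Denmark> .",
    "<http://dbpedia.org/resource/Macbeth> "
    "<http://dbpedia.org/ontology/wikiPageWikiLink> "
    "<http://dbpedia.org/resource/William_Shakespeare> .",
]

ACTIONS = list(bulk_actions(group_links(SAMPLE)))
```

With the official Python client installed, dicts of this shape can be passed to `elasticsearch.helpers.bulk(client, ACTIONS)`; that call is left out here so the sketch runs without a cluster.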
Using the Graph API/UI in X-Pack [3] you can explore strongly associated subjects (those subjects that are found to be commonly paired together in articles' linked_subjects field).
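For anyone who prefers the API to the UI, the following is a rough sketch of what a Graph explore request body for this data might look like. It only builds the JSON; the helper name, the seed query and the size values are assumptions, and the endpoint path differs between Elasticsearch versions, so check the Graph API docs for your release before sending it.

```python
import json


def explore_request(seed_subject, field="linked_subjects", size=5, min_doc_count=3):
    """Build a Graph explore body that starts from articles linking to seed_subject."""
    return {
        # Seed query: articles whose links include the given subject.
        "query": {"match": {field: seed_subject}},
        # use_significance favours significant terms over merely popular ones.
        "controls": {"use_significance": True, "sample_size": 2000},
        # First hop: the subjects most associated with the seed articles.
        "vertices": [
            {"field": field, "size": size, "min_doc_count": min_doc_count}
        ],
        # Second hop: subjects commonly paired with those first-hop subjects.
        "connections": {"vertices": [{"field": field, "size": size}]},
    }


BODY = explore_request("William_Shakespeare")
print(json.dumps(BODY, indent=2))
```

The response contains vertices and weighted connections between them, which is the same information the Graph UI draws as a network.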