[ANN] Elasticsearch RDF Jena Plugin

Hi,

do you want to turn your Elasticsearch into a SPARQL endpoint?

This Elasticsearch plugin stores and retrieves RDF triples by using the
Apache Jena API:

https://github.com/jprante/elasticsearch-plugin-rdf-jena

Apache Jena http://jena.apache.org is a free and open source Java framework
for building semantic web and Linked Data applications. The framework is
composed of different APIs interacting together to process RDF data.

Each triple is stored as a single document, and the Jena API uses term
filter queries to match triples. Because of the restrictions of such an
architecture, do not expect good SPARQL performance.
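
To make the storage model concrete, here is a minimal sketch of the idea, not the plugin's actual code. The field names "s", "p" and "o" are assumptions chosen for illustration (the real mapping may differ), and the query uses the Elasticsearch 1.x "filtered" query form that was current when this was posted.

```python
# Sketch only: the field names "s", "p", "o" are assumptions, not the plugin's mapping.

def triple_to_doc(subject, predicate, obj):
    """Represent one RDF triple as one flat document."""
    return {"s": subject, "p": predicate, "o": obj}


def pattern_to_query(subject=None, predicate=None, obj=None):
    """Build a term-filter query for a triple pattern; None marks an unbound variable."""
    bound = {"s": subject, "p": predicate, "o": obj}
    terms = [{"term": {field: value}} for field, value in bound.items() if value is not None]
    if not terms:
        # Nothing is bound, so every triple matches.
        return {"query": {"match_all": {}}}
    # Elasticsearch 1.x style: a filtered query whose bool filter ANDs the term filters.
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"bool": {"must": terms}},
            }
        }
    }


doc = triple_to_doc(
    "http://example.org/alice",
    "http://xmlns.com/foaf/0.1/knows",
    "http://example.org/bob",
)
query = pattern_to_query(predicate="http://xmlns.com/foaf/0.1/knows")
```

In a layout like this, every triple pattern of a SPARQL query turns into its own lookup, and any join between patterns has to happen outside Elasticsearch, which is presumably one reason performance suffers.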

The implementation is based heavily on Andrea Gazzarini's SolRDF:
https://github.com/agazzarini/SolRDF

Feedback, contributions etc. are most welcome!

Best,

Jörg

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAKdsXoHuAEvGTFU3R85LPMV_z7ZoKtLHp1G18ka5U2LuY51nOw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Great idea!

Need a clarification.

All of the Java source files I opened say Apache license, but the README
says AGPL.

Which license is it?

--

Hi, great news!! :)
Nice work, as well as the plugin from Gazzarini on Solr! :)

I can't wait to seriously play a little with both... :)

Have you already done some tests on the triples? I mean: is it possible to
use the ES replication to provide triples to other nodes, query on multiple
nodes via SPARQL, etc.?

PS: do you have any plans for what we discussed here:
https://groups.google.com/forum/#!searchin/elasticsearch/json-ld/elasticsearch/hPu1e7TrL40/XHKd2WIeZrYJ
I'd like to find a way to separate mappings from data easily...

If I can help in any way, please let me know!

--

Apache - sorry for the README mistake, will be fixed.

Jörg

On Sun, Dec 28, 2014 at 9:19 PM, Jack Park jackpark@topicquests.org wrote:

Great idea!

Need a clarification.

All of the Java source files I opened say Apache license, but the README
says AGPL.

Which license is it?

--

With this RDF Jena plugin, you can set up indices like any other index;
only the mapping is fairly strict:

https://github.com/jprante/elasticsearch-plugin-rdf-jena/blob/master/src/main/resources/org/xbib/elasticsearch/module/rdf/jena/mapping.json

That means that, with replica settings, triples (like any docs) are
replicated to other nodes, and yes, they can be queried with SPARQL. I
haven't tested it much, but it is now easy to build loaders that can lift
some billions of triples into ES and run SPARQL over them.
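
As a rough sketch of what such a loader might prepare (not taken from the plugin): the index name "triples", the type name "triple" and the field names "s", "p", "o" below are hypothetical. The only Elasticsearch features relied on are the `number_of_replicas` index setting and the newline-delimited bulk format; the `_type` entry reflects the 1.x API of the time.

```python
import json


def index_settings(replicas=1):
    """Index creation body; number_of_replicas sets how many copies each shard gets."""
    return {"settings": {"index": {"number_of_replicas": replicas}}}


def bulk_body(triples, index="triples", doc_type="triple"):
    """Render triples as a bulk request body: one action line plus one source line each."""
    lines = []
    for s, p, o in triples:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps({"s": s, "p": p, "o": o}))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


body = bulk_body([
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/name", "Alice"),
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/knows", "http://example.org/bob"),
])
```

A loader would send `index_settings(...)` once when creating the index and then post bodies like `body` to the `_bulk` endpoint in batches.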

Frankly, I am not a big fan of SPARQL implementations because SPARQL
is so cumbersome, slow, and difficult to use in RDF processing. I still
have to understand how Jena will exercise ES, though. The plugin has
several restrictions; translating ORDER BY to ES sort clauses, for
example, is an open challenge.
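
To illustrate where the ORDER BY difficulty comes from, here is a small sketch under strong assumptions: triples are stored one per document in hypothetical fields "s", "p", "o", and the query has a single triple pattern. In that narrow case a sort variable maps directly onto a field; the function and its inputs are made up for illustration and are not part of the plugin.

```python
def order_by_to_sort(pattern, order_by):
    """Map ORDER BY conditions onto an ES sort clause for one triple pattern.

    pattern:  (s, p, o) tuple where variables are strings starting with "?".
    order_by: list of (variable, "ASC" | "DESC") pairs.
    """
    # Remember which document field each variable of the pattern is bound to.
    position = {
        term: field
        for term, field in zip(pattern, ("s", "p", "o"))
        if term.startswith("?")
    }
    sort = []
    for variable, direction in order_by:
        if variable not in position:
            # The sort key lives in another pattern, so no field of this document holds it.
            raise ValueError("cannot sort on %s within a single triple document" % variable)
        sort.append({position[variable]: {"order": direction.lower()}})
    return sort


sort_clause = order_by_to_sort(
    ("?person", "http://xmlns.com/foaf/0.1/name", "?name"),
    [("?name", "DESC")],
)
```

As soon as the sort variable is bound by a different triple pattern than the one being fetched, the sort key is not in the matched document at all, so sorting would have to happen after the join.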

With JSON-LD and ES being a document store, I am still after an alternative
method for translating SPARQL queries into performant ES queries on ordinary
JSON docs, operating on documents which look more like a DESCRIBE result,
with contexts saved, similar to templates, in an internal index. This is my
favorite idea, and I have not dropped it. I can now compare it better to a
pure Jena API plugin approach.
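
A hedged sketch of what "documents which look more like a DESCRIBE result" could mean in practice: all triples about one subject are folded into a single JSON document, and a separately kept context maps short keys to predicate IRIs. This is a simplification for illustration, not the JSON-LD compaction algorithm, and the function name and context are hypothetical.

```python
from collections import defaultdict


def describe_docs(triples, context):
    """Group triples by subject into one JSON-LD-like document per resource.

    context maps short keys to predicate IRIs, e.g. {"name": "http://.../name"};
    it is kept apart from the data, the way a template would be.
    """
    short = {iri: term for term, iri in context.items()}
    docs = defaultdict(dict)
    for s, p, o in triples:
        key = short.get(p, p)  # fall back to the full IRI if the context has no term
        docs[s].setdefault(key, []).append(o)
    return [dict({"@id": s}, **props) for s, props in docs.items()]


context = {
    "name": "http://xmlns.com/foaf/0.1/name",
    "knows": "http://xmlns.com/foaf/0.1/knows",
}
docs = describe_docs(
    [
        ("http://example.org/alice", "http://xmlns.com/foaf/0.1/name", "Alice"),
        ("http://example.org/alice", "http://xmlns.com/foaf/0.1/knows", "http://example.org/bob"),
    ],
    context,
)
```

On documents shaped like this, a pattern such as `?x foaf:name "Alice"` could become a single term filter on the `name` field instead of a lookup over one-triple documents.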

Jörg

On Sun, Dec 28, 2014 at 9:42 PM, Alfredo Serafini seralf@gmail.com wrote:

Hi, great news!! :)
Nice work, as well as the plugin from Gazzarini on Solr! :)

I can't wait to seriously play a little with both... :)

Have you already done some tests on the triples? I mean: is it possible to
use the ES replication to provide triples to other nodes, query on multiple
nodes via SPARQL, etc.?

PS: do you have any plans for what we discussed here:
https://groups.google.com/forum/#!searchin/elasticsearch/json-ld/elasticsearch/hPu1e7TrL40/XHKd2WIeZrYJ
I'd like to find a way to separate mappings from data easily...

If I can help in any way, please let me know!
