ElasticSearch synchronization with OrientDB

Hello,

I would like some guidelines about how to approach ElasticSearch
synchronization with OrientDB:

After some searching, I found these approaches:

  1. Dedicated River plugin - Like this one:
    https://github.com/sksamuel/elasticsearch-river-neo4j

In this Google Group discussion
(https://groups.google.com/d/msg/orient-database/YAesdS2qAYc/yCp7v9pF6tcJ)
someone said that it could be done using the hooks API of OrientDB, which
is more efficient than just polling. Any thoughts on that?

  2. JDBC River plugin ( https://github.com/jprante/elasticsearch-river-jdbc)
  • I could just use this one since OrientDB provides a JDBC driver.
    Could there be compatibility problems?

One person mentioned single point of failure (SPOF) problems with river
plugins, and I read that river plugins could be deprecated (?) in the future
(http://stackoverflow.com/questions/22237111/preferred-method-of-indexing-bulk-data-into-elasticsearch).
I don't know whether SPOF is a real concern, as I see river plugins used for
many types of resources that seem very decoupled.

  3. Someone on Twitter suggested embedding Elasticsearch in OrientDB, as
    was done with Lucene: https://github.com/orientechnologies/orientdb-lucene.
    Could that cause scalability problems for either OrientDB or Elasticsearch?
    I guess this could not take full advantage of Elasticsearch and would
    use it for querying only.

Please advise whether I need to implement something myself or could use
existing tools, and what the tradeoffs of these or other approaches are.
I could get support from Orient Technologies if I need it.

Thank you,

Michail

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/66a17298-5c32-46a4-80a0-0a1ccc5fb465%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

You can also DIY and leverage the official clients.

Be aware that, in the long run, rivers are being deprecated.
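
As a rough illustration of the DIY route, here is a minimal Python sketch that turns a batch of changed records into a request body for the Elasticsearch `_bulk` endpoint, using only the standard library rather than a client package. The index name `orientdb_records`, the URL `http://localhost:9200/_bulk`, the `(op, record_id, document)` tuple shape, and both function names are assumptions made for this example, not anything from OrientDB or from this thread. Older Elasticsearch versions may also expect a `_type` in each action line.

```python
import json
import urllib.request


def build_bulk_body(changes, index="orientdb_records"):
    """Turn (op, record_id, document) tuples into an Elasticsearch _bulk body."""
    lines = []
    for op, record_id, doc in changes:
        if op == "delete":
            # A delete action has no source line after it.
            lines.append(json.dumps({"delete": {"_index": index, "_id": record_id}}))
        else:
            # "index" creates or overwrites the document with this _id.
            lines.append(json.dumps({"index": {"_index": index, "_id": record_id}}))
            lines.append(json.dumps(doc))
    # The bulk API expects newline-delimited JSON that ends with a newline.
    return "\n".join(lines) + "\n"


def send_bulk(body, url="http://localhost:9200/_bulk"):
    """POST the body to Elasticsearch (the URL is an assumption; not called here)."""
    request = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


# Hypothetical batch of changes collected from the database side.
changes = [
    ("upsert", "#12:0", {"name": "Alice"}),
    ("delete", "#12:1", None),
]
print(build_bulk_body(changes), end="")
```

How the batch of changes gets collected (a database hook, polling, or a queue) is left open here; the sketch only covers the Elasticsearch side.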

(In reply to the original message from Michalis Michaelidis, 25 February 2015 at 06:25.)

Thank you,

I have that in mind. What do you mean by official clients? I don't think
it is a good idea to write to both OrientDB and Elasticsearch when I am
inserting something, for example.

(In reply to Mark Walkom's message of Tuesday, 24 February 2015, 3:16:20 PM UTC-5.)

The official clients are listed on the Elastic website.

(In reply to the message from Michalis Michaelidis, 26 February 2015 at 12:09.)

@Michalis_Michaelidis Have you found a solution? I need help!

No, I didn't find anything. Actually, I moved away from OrientDB because of the many bugs in its code base and because its SQL graph queries are cumbersome compared to Gremlin. I migrated to TitanDB, which syncs automatically with Elasticsearch, and after it was acquired by DataStax I migrated to DataStax DSE Graph, which syncs with Apache Solr. If you need something custom, you could approach it with a queue such as Apache Kafka or RabbitMQ that each store (e.g. your database, Elasticsearch, and caches) synchronizes against. Hope this helps!
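
To make the queue idea a bit more concrete, here is a minimal Python sketch in which every store replays the same ordered stream of change events and keeps track of how far it has read. The in-memory `ChangeLog` only stands in for a Kafka topic or RabbitMQ queue, and `ChangeLog`, `Store`, and the event shape are all hypothetical names for this example rather than any real client API.

```python
class ChangeLog:
    """Append-only in-memory log standing in for a Kafka topic or RabbitMQ queue."""

    def __init__(self):
        self._events = []

    def publish(self, op, key, value=None):
        # Writers only append; they never touch the stores directly.
        self._events.append((op, key, value))

    def read_from(self, offset):
        # Return every event at or after the given position.
        return self._events[offset:]


class Store:
    """A consumer (database, search index, cache) that tracks its own offset."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.offset = 0

    def sync(self, log):
        # Apply only the events this store has not seen yet.
        for op, key, value in log.read_from(self.offset):
            if op == "delete":
                self.data.pop(key, None)
            else:
                self.data[key] = value
            self.offset += 1


log = ChangeLog()
database, search_index = Store("database"), Store("search_index")

log.publish("upsert", "#12:0", {"name": "Alice"})
log.publish("upsert", "#12:1", {"name": "Bob"})
log.publish("delete", "#12:1")

database.sync(log)
search_index.sync(log)
print(database.data == search_index.data)  # prints True
```

Because each store keeps its own offset, a store that falls behind or restarts can catch up later without the writer having to call every store on each insert.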

Of course, since you originally asked the question, we have released our own graph capability: https://m.youtube.com/watch?v=1QwmJ_FCMqU