ElasticSearch synchronization with OrientDB


(Michalis Michaelidis) #1

Hello,

I would like some guidelines about how to approach ElasticSearch
synchronization with OrientDB:

After some searching, I found these approaches:

  1. Dedicated river plugin, like this one for Neo4j:
    https://github.com/sksamuel/elasticsearch-river-neo4j

In this Google Groups discussion
(https://groups.google.com/d/msg/orient-database/YAesdS2qAYc/yCp7v9pF6tcJ)
someone said that it could be done using the hooks API of OrientDB, which is
more efficient than just polling. Any thoughts on that?

  2. JDBC river plugin (https://github.com/jprante/elasticsearch-river-jdbc)
  • I could just use this one, since OrientDB provides a JDBC driver.
    Could there be compatibility problems?

Someone suggested that river plugins have single point of failure (SPOF)
problems, and I read that river plugins could be deprecated (?) in the future
(http://stackoverflow.com/questions/22237111/preferred-method-of-indexing-bulk-data-into-elasticsearch).
I don't know whether SPOF is a real risk, since I see river plugins used for
many types of resources that seem very decoupled.

  3. Someone on Twitter suggested embedding ElasticSearch in OrientDB, as
    was done with Lucene (https://github.com/orientechnologies/orientdb-lucene).
    Could that cause scalability problems for either OrientDB or ElasticSearch?
    I guess this cannot take full advantage of ElasticSearch and would use
    it for querying only.

Please advise whether I need to implement something myself or could use
existing tools, and what the tradeoffs of these or other approaches are.
I could get support from Orient Technologies if I need it.

Thank you,

Michail

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/66a17298-5c32-46a4-80a0-0a1ccc5fb465%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Mark Walkom) #2

You can also DIY and leverage the official clients.

Be aware that rivers are being deprecated in the long run.
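
As a rough illustration of what the DIY route could look like (a sketch under assumptions, not something spelled out in this thread): your own code reads changed records from OrientDB and turns them into a request body for the ElasticSearch bulk API, which takes newline-delimited JSON with an action line followed by a source line for each document. The index name `orientdb_docs`, the record shape (`rid` and `doc` keys), and the helper `build_bulk_body` are all hypothetical. Older ElasticSearch versions also expect a `_type` in the action line.

```python
import json


def build_bulk_body(records, index_name="orientdb_docs"):
    """Turn records into an ElasticSearch bulk (newline-delimited JSON) body.

    `records` is an iterable of dicts with a hypothetical "rid" key (the
    OrientDB record id) and a "doc" key holding the fields to index.
    """
    lines = []
    for record in records:
        # Action line: index the document under its record id.
        action = {"index": {"_index": index_name, "_id": record["rid"]}}
        lines.append(json.dumps(action))
        # Source line: the document itself.
        lines.append(json.dumps(record["doc"]))
    if not lines:
        return ""
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


records = [
    {"rid": "#12:0", "doc": {"name": "Alice"}},
    {"rid": "#12:1", "doc": {"name": "Bob"}},
]
print(build_bulk_body(records), end="")
```

The resulting string would then be sent to the `_bulk` endpoint, either through one of the official clients or with any HTTP library.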

On 25 February 2015 at 06:25, Michalis Michaelidis mmichaelid@gmail.com
wrote:

I would like some guidelines about how to approach ElasticSearch
synchronization with OrientDB [...]


(Michalis Michaelidis) #3

Thank you,

I have that in mind. What do you mean by official clients? I don't think
it is a good idea to hit both OrientDB and ElasticSearch every time I
insert something, for example.

On Tuesday, 24 February 2015 at 3:16:20 PM UTC-5, Mark Walkom wrote:

You can also DIY and leverage the official clients.

Be aware that rivers are being deprecated in the long run.



(Mark Walkom) #4

Official clients are listed here - http://www.elasticsearch.org/guide/

On 26 February 2015 at 12:09, Michalis Michaelidis mmichaelid@gmail.com
wrote:

Thank you,

I have that in mind. What do you mean with official clients? I don't
think it is a good idea to hit both orientdb and elasticsearch when I am
inserting something for example..



(Shashank Gupta) #5

@Michalis_Michaelidis Have you found a solution? I need help!


(Michalis Michaelidis) #6

No, I didn't find anything. I actually moved away from OrientDB because of the many bugs in its code base and because its SQL graph queries are cumbersome compared to Gremlin. I migrated to TitanDB, which is automatically synced with ElasticSearch, and since it was acquired by DataStax I migrated to DataStax DSE Graph, which is synced with Apache Solr. If you need something custom, you could approach it with a queue such as Apache Kafka or RabbitMQ that each store (e.g. your database, ElasticSearch, and caches) synchronizes against. Hope this helps!
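
To make the queue idea a bit more concrete, here is a minimal in-memory sketch. It assumes a single ordered change log that every store replays from its own offset, roughly the way consumers read a Kafka topic. `ChangeEvent`, `ChangeLog`, and `Store` are hypothetical names for illustration only; a real setup would use the Kafka or RabbitMQ client libraries and the actual database and ElasticSearch clients.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    op: str                       # "upsert" or "delete"
    key: str
    value: Optional[dict] = None


class ChangeLog:
    """In-memory stand-in for a Kafka topic or RabbitMQ queue."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def read_from(self, offset):
        # Return every event at or after the given offset.
        return self._events[offset:]


class Store:
    """A consumer (database, search index, or cache) that replays the log."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.offset = 0  # position of the next unread event

    def sync(self, log):
        for event in log.read_from(self.offset):
            if event.op == "upsert":
                self.data[event.key] = event.value
            elif event.op == "delete":
                self.data.pop(event.key, None)
            self.offset += 1


log = ChangeLog()
database = Store("database")
search_index = Store("search_index")

# Writers only append to the log; they never touch the stores directly.
log.append(ChangeEvent("upsert", "person/1", {"name": "Alice"}))
log.append(ChangeEvent("upsert", "person/2", {"name": "Bob"}))
log.append(ChangeEvent("delete", "person/1"))

for store in (database, search_index):
    store.sync(log)

print(database.data == search_index.data)
```

Because each store keeps its own offset, a store that was down or added later can catch up by replaying the log from where it left off, and the application writes to one place instead of hitting the database and the search index separately.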


(Mark Harwood) #7

Of course, since you originally asked the question, we have released our own graph capability: https://m.youtube.com/watch?v=1QwmJ_FCMqU

