REST API or Java client?

hi all,

i'm new to Elasticsearch and would like to ask some basic questions.

we are developing a system based on the play framework (non blocking io,
event loop, scala)

we are currently working with Elasticsearch through the REST API, which is
working ok in dev. we are concerned about performance once we move to a
production environment. here are some questions:

  1. can i point the REST API endpoint to a load balancer configured in
    front of the ES cluster? is that a common best practice?

  2. is there any performance boost if we switch from REST API calls to the
    native Java client? if so - is it lagging behind on features?

  3. Java client - is this a smart client? meaning - can the client direct
    the queries to the relevant shard / shards for faster result retrieval?

  4. any other advice / suggestions regarding native client vs REST API
    for using ES?

thanks!
CB

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/3ca59232-8462-4e66-8400-8a5aca18fe0c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

  1. No. ES is already managing connections, see TransportClient

  2. REST API sits on top of native Java client. So, because of HTTP, you
    have overhead with REST. Async call API with HTTP is a mess.

  3. All actions are routed automatically to the relevant shards only, no
    matter what client.

  4. There are scala clients out there like elastic4s that wrap the native
    Java API, so I wonder why you do not use them?

Jörg
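
To picture point 3: the Elasticsearch documentation gives the routing rule as shard = hash(routing) % number_of_primary_shards, where the routing value defaults to the document id, and this is computed inside the cluster rather than by the caller. A minimal Java sketch follows. The class name ShardRouting and the use of String.hashCode() as the hash are stand-ins for illustration only (Elasticsearch uses its own hash function), so the shard numbers here will not match a real cluster.

```java
// Illustrative only: ShardRouting is a hypothetical helper, not an Elasticsearch class.
final class ShardRouting {
    private ShardRouting() {}

    /**
     * Maps a routing value (by default the document id) to a primary shard number.
     * String.hashCode() stands in for Elasticsearch's internal hash function.
     */
    static int shardFor(String routing, int numberOfPrimaryShards) {
        if (numberOfPrimaryShards <= 0) {
            throw new IllegalArgumentException("numberOfPrimaryShards must be positive");
        }
        // floorMod keeps the result in [0, numberOfPrimaryShards) even for negative hashes.
        return Math.floorMod(routing.hashCode(), numberOfPrimaryShards);
    }

    public static void main(String[] args) {
        int shards = 5;
        for (String id : new String[] {"doc-1", "doc-2", "doc-3"}) {
            System.out.println(id + " -> shard " + shardFor(id, shards));
        }
    }
}
```

The same routing value always lands on the same shard, which is why the number of primary shards cannot simply be changed after documents have been indexed.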

  3. All actions are routed automatically to the relevant shards only, no matter what client.

Just a comment about this. If you are using a TransportClient, it won't try to reach the right shard directly. It simply sends the request to one of the nodes it knows about (the list of nodes you provided with the addTransportAddress() method). The NodeClient, on the other hand, will reach the right shard directly.

See http://www.elasticsearch.org/guide/en/elasticsearch/client/java-api/current/client.html#transport-client

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr
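
David's distinction can be pictured with a small sketch: a transport-style client only rotates through the node addresses it was given and leaves shard selection to whichever node receives the request. The Java below is a simplified, hypothetical model. RoundRobinNodes is not an Elasticsearch class, the host names are made up, and the real TransportClient also checks connections and can optionally sniff for further nodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of a client that rotates through a fixed list of node addresses.
final class RoundRobinNodes {
    private final List<String> addresses;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinNodes(List<String> addresses) {
        if (addresses.isEmpty()) {
            throw new IllegalArgumentException("at least one node address is required");
        }
        this.addresses = new ArrayList<>(addresses); // defensive copy
    }

    /** Returns the next address in rotation; it knows nothing about shards. */
    String next() {
        // floorMod keeps the index valid even after the counter overflows.
        int i = Math.floorMod(counter.getAndIncrement(), addresses.size());
        return addresses.get(i);
    }

    public static void main(String[] args) {
        RoundRobinNodes nodes =
            new RoundRobinNodes(Arrays.asList("es1:9300", "es2:9300", "es3:9300"));
        for (int n = 0; n < 4; n++) {
            System.out.println("request " + n + " -> " + nodes.next());
        }
    }
}
```

In this model the receiving node may still have to forward the request to the node that holds the shard, which is the extra hop a NodeClient avoids.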

thanks for the answers, here are my thoughts:

  1. If using a pure REST client: a load balancer will make sure that
    requests to the endpoint address go to any of the "live" nodes (round robin), so that
    if one of those nodes "dies" or if I scale out the cluster (add more nodes)
    it is transparent to the client. Does that make sense?

  2. Jörg - can you please provide more details / a link explaining why
    and how the "REST API sits on top of native Java client"?

  3. The Java client is fine, but the documentation of the actual query API is
    pretty basic and will always send you to the REST documentation. I found it
    hard to "translate" the REST API docs to native Java client APIs.

elastic4s seems very promising, although I'm not sure it supports Scala 2.11. I
might give it a spin - thanks for the tip ;)

BTW - do you know if the Java client is using a binary protocol? That
might become a big advantage over REST for large query results.

Answers inline.

On Fri, Jul 25, 2014 at 3:06 AM, CB chen.bekor@gmail.com wrote:

thanks for the answers, here are my thoughts:

  1. If using pure REST client - Using a Load Balancer will make sure that
    the endpoint address goes to any of the "live" nodes (round robin) so that
    if one of those nodes "dies" or if I scale out the cluster (add more nodes)
    it is transparent to the client. Does that make sense?

I do not use the REST client, but I would assume the use of a load balancer
depends on the client library. With the Java TransportClient, you
simply provide a list of valid nodes and the option to discover other nodes
based on those nodes (client.transport.sniff). If the REST client library
has the same functionality, then there is no need for a load balancer, although
pointing at a load balancer could still simplify things.

  2. Jörg - can you please provide more details / link explaining about why
    and how the "REST API sits on top a Java Client"

The Java API is the "true" API for Elasticsearch. The REST API is simply a
wrapper around the Java API. The Java API is therefore always feature
complete, while potentially the REST API might not expose everything. Take
a look at the various Rest*Action classes such as RestSearchAction. You
will see that basically the REST call gets transformed into a call using
the Java API.
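
The pattern described here, a REST handler that only parses the HTTP request and then hands off to the Java API, can be sketched as follows. Everything in the block is a hypothetical stand-in written for illustration: SearchCall and RestToJavaSketch are not Elasticsearch classes, and the real RestSearchAction handles far more parameters. Only the q, from and size URI-search parameter names and their defaults (0 and 10) come from the REST API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Java-API search request object.
final class SearchCall {
    final String index;
    final String queryString;
    final int from;
    final int size;

    SearchCall(String index, String queryString, int from, int size) {
        this.index = index;
        this.queryString = queryString;
        this.from = from;
        this.size = size;
    }
}

// Hypothetical REST layer: it only translates HTTP input into a SearchCall.
final class RestToJavaSketch {
    private RestToJavaSketch() {}

    /** Parses "key=value&key=value" into a map (no URL decoding, for brevity). */
    static Map<String, String> parseParams(String rawQuery) {
        Map<String, String> params = new HashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                params.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return params;
    }

    /** Turns e.g. "/places/_search" plus "q=paris&from=5&size=10" into a SearchCall. */
    static SearchCall toSearchCall(String path, String rawQuery) {
        String[] parts = path.split("/"); // "/places/_search" -> ["", "places", "_search"]
        if (parts.length != 3 || !"_search".equals(parts[2])) {
            throw new IllegalArgumentException("expected /{index}/_search, got " + path);
        }
        Map<String, String> p = parseParams(rawQuery);
        int from = Integer.parseInt(p.getOrDefault("from", "0"));
        int size = Integer.parseInt(p.getOrDefault("size", "10"));
        return new SearchCall(parts[1], p.get("q"), from, size);
    }

    public static void main(String[] args) {
        SearchCall call = toSearchCall("/places/_search", "q=paris&from=5&size=10");
        System.out.println(call.index + " " + call.queryString + " " + call.from + " " + call.size);
    }
}
```

Reading a Rest*Action class in this light is one way to find the Java API call that matches a given REST endpoint.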

  3. The java client is fine but the documentation of the actual query API
    is pretty basic and will always send you to the REST documentation. I found
    it hard to "translate" the REST API docs to native java client APIs

The Java documentation is indeed lacking. I believe David has a better
write-up somewhere, but I always refer to the actual code for detailed usage
of the API. You can look at either the aforementioned Rest*Action classes or
simply the many unit tests for concrete end-to-end examples.

elastic4s seems very promising, although not sure it supports scala 2.11.
I might give it a spin - thanks for the tip ;)

BTW - Do you know if the java client is using a binary protocol? that
might become a big advantage over REST for large query results.

The Java client is indeed binary and will have many advantages over REST.
Serialization issues between versions can still occur, although they have
almost gone away since the 1.x release. You might still have issues with
newer clients accessing older servers.

Cheers,

Ivan

thanks Ivan!

It does support 2.11 of course.

And about the Java client documentation - one more reason to use the Scala
DSL in elastic4s, as you'll get code completion.

For example, you can do this:

    search in "places"->"cities" query "paris" start 5 limit 10

and at each step of the way the DSL will let you know what's applicable for
the syntax.
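
For readers comparing this with the REST documentation, the DSL line above should correspond roughly to a search on index places, type cities, with a query_string query and from/size paging. The Java sketch below only builds that path and JSON body as strings. The class name and the exact mapping of query to query_string are assumptions for illustration, not taken from the elastic4s sources.

```java
// Hypothetical helper: builds the REST request the DSL line is assumed to map to.
final class SearchRequestSketch {
    private SearchRequestSketch() {}

    /** Builds the search path for an index and type, e.g. /places/cities/_search. */
    static String path(String index, String type) {
        return "/" + index + "/" + type + "/_search";
    }

    /** Builds a JSON body with paging and a query_string query. */
    static String body(String queryText, int from, int size) {
        // Escape backslashes and double quotes; control characters are not handled here.
        String escaped = queryText.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"from\":" + from + ",\"size\":" + size
            + ",\"query\":{\"query_string\":{\"query\":\"" + escaped + "\"}}}";
    }

    public static void main(String[] args) {
        System.out.println("GET " + path("places", "cities"));
        System.out.println(body("paris", 5, 10));
    }
}
```

Here start maps to from and limit maps to size, which are the names used in the REST search documentation.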
