The "double hop" is just another word for the extra network transport step
between the cluster and a TransportClient: the request and the response both
have to travel over the wire to the requesting client. A NodeClient does not
need this extra network transport, which may be a benefit. Both NodeClient
and TransportClient route operations automatically; there is no difference
in behavior in that respect.
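To make the hop counting concrete, here is a toy model in Java. It is only a sketch under my own assumptions: the class HopModel, its method names, and the node names are made up for illustration and are not part of the ES API. It counts one-way request transfers until the request reaches the node that holds the target shard.

```java
// Toy model, not Elasticsearch code: all names here are hypothetical.
final class HopModel {

    // A remote TransportClient first sends the request to some node of the
    // cluster (one transfer). If that entry node does not hold the target
    // shard, it forwards the request to the node that does (second transfer).
    static int hopsViaTransportClient(String entryNode, String shardNode) {
        return entryNode.equals(shardNode) ? 1 : 2;
    }

    // A NodeClient is itself a member of the cluster and knows where the
    // shard lives, so in this model it sends the request straight to that node.
    static int hopsViaNodeClient(String shardNode) {
        return 1;
    }

    public static void main(String[] args) {
        String shardNode = "node-2";
        System.out.println("TransportClient via node-1: "
                + hopsViaTransportClient("node-1", shardNode));
        System.out.println("TransportClient via node-2: "
                + hopsViaTransportClient("node-2", shardNode));
        System.out.println("NodeClient: " + hopsViaNodeClient(shardNode));
    }
}
```

The model ignores the response path and replication; it only illustrates where the extra transfer comes from.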
For me, because I only use ES from remote JVMs, TransportClient is
preferable, while others prefer NodeClient and work locally. It is a matter
of taste in the end.
What I don't understand is the motivation for shard-based access. It is
quite unusual because the combination of distributed shards into an index
and performing search on index level is one of the main features and one of
the strengths of ES.
Many features of ES rely on strict index-level access, like searching. If
shard-exclusive access were allowed, you would also have to reconsider how
the search operation works on single shards. These shard operations do exist
in the ES codebase, but they are low-level and hidden from the public API
because they are sensitive to touch (and the Lucene API fluctuates from
version to version in that area).
On the shard level, clients are able to access the Lucene index engine
directly, but are not able to access the ES index API level with all the
mapping information for fields in documents, aliases, etc.
A possible way to expose the inner shard-based functions is to pass
parameters to custom implementations of index actions: for example,
bypassing the internal shard number calculation by controlling this number
with a parameter passed from outside via the public API. This may conflict
with the current concept of doc routing, so implementing a new, simplified
index action could be required. I think it is possible to implement an ES
plugin for this kind of experiment.
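As a rough sketch of that idea in Java: the shard number is normally derived from a hash of the routing value modulo the number of primary shards, and a custom action could accept an explicit shard number instead. Everything below is an assumption for illustration: the class ShardSelector, the methods computedShard and resolveShard, and the simple DJB2-style hash are hypothetical and are not the hash function or the API that ES actually uses.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only, not the Elasticsearch implementation.
final class ShardSelector {

    // Hypothetical stand-in for the internal routing hash (DJB2-style).
    static int hash(String routing) {
        int h = 5381;
        for (byte b : routing.getBytes(StandardCharsets.UTF_8)) {
            h = h * 33 + b; // int overflow is fine here, it just wraps
        }
        return h;
    }

    // Default path: shard number derived from the routing value.
    // floorMod keeps the result in [0, numberOfShards) for negative hashes.
    static int computedShard(String routing, int numberOfShards) {
        return Math.floorMod(hash(routing), numberOfShards);
    }

    // Hypothetical override: if the caller passes a shard number, use it
    // and skip the calculation; otherwise fall back to the default path.
    static int resolveShard(String routing, Integer forcedShard, int numberOfShards) {
        if (forcedShard == null) {
            return computedShard(routing, numberOfShards);
        }
        if (forcedShard < 0 || forcedShard >= numberOfShards) {
            throw new IllegalArgumentException("no such shard: " + forcedShard);
        }
        return forcedShard;
    }

    public static void main(String[] args) {
        int shards = 5;
        System.out.println("computed: " + resolveShard("doc-1", null, shards));
        System.out.println("forced:   " + resolveShard("doc-1", 3, shards));
    }
}
```

Note the conflict mentioned above: a document indexed with a forced shard number can no longer be found by the default routing calculation unless the same number is supplied again on every get, update, or delete.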
Jörg
On Sat, Sep 14, 2013 at 11:51 PM, peter@vagaband.co wrote:
Wow... if that is true, then I guess the documentation and what ES folks
teach at their training is wrong, or at least misleading. Their insistence
on preferring node client over transport client was solely based on the
fact that it's a smarter client.
From the documentation -
"The benefit of using the Client is the fact that operations are
automatically routed to the node(s) the operations need to be executed on,
without performing a “double hop”. For example, the index operation will
automatically be executed on the shard that it will end up existing at."
Is that wrong?
It is indeed a sad day if we don't allow for smarter clients that are
capable of connecting to the right node to perform an operation in the
most efficient way. I understand that this is tricky, and that the number of
nodes, the indices, and the shards can change (as nodes come online or go
offline, as indices come and go, as replicas are added and removed, etc.).
But there's no reason why it cannot be done (even if it's not available or
exposed now). As information changes, it can be communicated to clients, and
clients can adjust or reconnect as necessary.
A lot of distributed systems vendors have come to the realization that many
of the problems inherent in distributed systems can be solved by
building smarter clients. This is evident from talks given at
conferences on distributed systems. Hopefully, the ES team looks at it the
same way. Obviously, building something like this can be done in a clean way
and without hindering other features. The main reason why I like
Elasticsearch over SOLR is that the ES folks (Shay in particular) seem to be
doing everything right in terms of putting the right level of intelligence
in the right place. A lot of thought has been put into building ES, which is
very nuanced but also complete in terms of the big picture.
And I hope a smarter client is one of the things on their roadmap.
Thank you for shedding light on this, Jörg.
--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.