TransportClient versus NodeClient

I have a web app with an embedded ES server (node) running. This same web
application is deployed on various application servers, and these ES nodes
discover each other to form a cluster.

In my web application, I start an ES client that connects to the ES node
running on localhost. I do not want to ask the embedded server for a client
handle, so I have two options.

a) Start up a non-data ES node client which will auto-discover the other
nodes on the network, or
b) Start up a Transport client and tell it to connect to the node running on
localhost (with sniffing).

From what I have read, I believe the weaknesses of each of these approaches
are:
a) A non-data ES node seems like a more performant client, but perhaps more
chatty and resource intensive.
b) A transport client seems resource efficient, but not the most performant
solution. It also requires more intensive configuration.

I have done both of these and they both work, but I think I want something
that is in the middle.

Are my impressions of the Node Client correct? Is it much chattier than the
Transport client? If so, will my scalability be hampered because of this
additional overhead? I really like the fact that I don't have to tell it the
address of other nodes in my cluster.

Is there a way to get the Transport client to auto-discover at least the
same node running in its JVM? Is it truly less resource intensive than the
Node Client?

On Thu, Jul 21, 2011 at 7:32 PM, James Cook jcook@tracermedia.com wrote:

a) A non-data ES node seems like a more performant client, but perhaps more
chatty and resource intensive.

Make sure to set it as a client (client(true)), not just a non-data node.

The difference between the two is that a node client actually connects to
the cluster, and receives cluster state changes so it can execute requests
in a more optimal manner. I would say use a node client.
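
To make the "more optimal manner" point concrete, here is a minimal toy sketch in Java. It does not use any Elasticsearch classes; the node names, the shard-to-node mapping, and both routing methods are hypothetical and only model the idea. The assumption is that a client holding the cluster state can address the node that owns a shard directly, while a client that only knows one entry node may need that node to forward the request.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: none of these types are Elasticsearch classes.
class ClientHopsSketch {
    // Hypothetical three-node cluster; shard i lives on NODES.get(i).
    static final List<String> NODES = Arrays.asList("node-a", "node-b", "node-c");

    // Look up which node owns a shard (the "cluster state" in this toy model).
    static String ownerOf(int shard) {
        return NODES.get(Math.floorMod(shard, NODES.size()));
    }

    // Node-client style: the client holds the cluster state, so it can
    // address the node that owns the shard directly.
    static List<String> routeWithClusterState(int shard) {
        return new ArrayList<>(Arrays.asList("client", ownerOf(shard)));
    }

    // Transport-client style: the client only knows an entry node, which
    // forwards the request when it does not own the shard itself.
    static List<String> routeViaEntryNode(int shard, String entryNode) {
        List<String> path = new ArrayList<>(Arrays.asList("client", entryNode));
        String owner = ownerOf(shard);
        if (!owner.equals(entryNode)) {
            path.add(owner);
        }
        return path;
    }

    public static void main(String[] args) {
        // Print both routes for every shard, always entering through node-a.
        for (int shard = 0; shard < NODES.size(); shard++) {
            System.out.println("shard " + shard
                + " with cluster state: " + routeWithClusterState(shard)
                + ", via node-a: " + routeViaEntryNode(shard, "node-a"));
        }
    }
}
```

In this model the route built from cluster state never needs forwarding, while the route through a fixed entry node gains an extra step whenever that node does not own the shard. How much this matters in a real cluster depends on the workload, so treat it as an illustration rather than a measurement.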