Poor performance when clients and ES cluster not in the same LAN


(mihaela) #1

Hi,

We have an ElasticSearch cluster whose clients aren't on the same
LAN. Because of this, there is a considerable delay before the
client receives the response.
The index uses the default file-system-based storage. Is there a
mechanism for caching the results of queries (not filters)? I
think this might improve performance.

Another option would be to have the clients and the cluster on the
same LAN. For this, is it possible to take the index from our cluster
(the index directories it created) and move it to another ES cluster
that will automatically detect the index? I think this would be a
nice feature, given that a Lucene index directory is platform
independent. If one needs the index on another cluster, then instead
of scanning the old index and reindexing it on the new cluster, one
would only have to copy some index directories to the new cluster.
Even this approach isn't perfect, though, because we have to gather
the complete index structure from all the machines in the cluster.

Any suggestions will be appreciated.


(Shay Banon) #2

Is the performance acceptable within the LAN? If so, then it's not really an
elasticsearch problem, just that the connection you have is slow, no?
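
If the link rather than the cluster is the bottleneck, one way to avoid some round trips is a small cache of query results on the client side. The following is only a minimal sketch of that idea, not an Elasticsearch feature: `QueryCache` and the `run_query` callable are hypothetical names, and `run_query` stands in for whatever code actually sends the search to the cluster. Cached results can be stale for up to `ttl_seconds`, and the sketch is neither size-bounded nor thread-safe.

```python
import json
import time


class QueryCache:
    """Tiny client-side TTL cache keyed on index name plus query body."""

    def __init__(self, run_query, ttl_seconds=30.0, clock=time.monotonic):
        # run_query(index, query) is a hypothetical hook that performs the
        # real (slow, remote) search and returns its result.
        self._run_query = run_query
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expiry_time, result)

    def search(self, index, query):
        # Serialize with sorted keys so that equal queries share one entry.
        key = index + "\n" + json.dumps(query, sort_keys=True)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no network round trip
        result = self._run_query(index, query)
        self._entries[key] = (now + self._ttl, result)
        return result
```

Whether this helps depends on how often the same queries repeat and how much staleness the application can tolerate.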

On Mon, Oct 24, 2011 at 5:12 PM, Mihaela olteanu.miha@gmail.com wrote:

Hi,

We have an ElasticSearch cluster whose clients aren't on the same
LAN. Because of this, there is a considerable delay before the
client receives the response.


(mihaela) #3

Yes, the performance in the LAN is acceptable. I think the connection is
indeed the problem, because it is too slow. That's why, in this case, I need
a way to export an index from one cluster to another: I cannot index
directly on the cluster where the client is (I cannot give more details),
and I don't want to scan the index and reindex it on the new cluster.

I also have another question: which is the best way for a remote client to
connect to an elasticsearch cluster, the TransportClient or the REST API?
If one uses the TransportClient, will the connections be kept open for the
whole session? Is this a good idea? Is there a connection pool? How many
connections will there be: one for each node in the cluster, if I add them
all to the list of transport addresses?

2011/10/24 Shay Banon kimchy@gmail.com

Is the performance acceptable within the LAN? If so, then it's not really an
elasticsearch problem, just that the connection you have is slow, no?


(Shay Banon) #4

On Tue, Oct 25, 2011 at 3:47 AM, Mihaela Olteanu olteanu.miha@gmail.com wrote:

Yes, the performance in the LAN is acceptable. I think the connection is
indeed the problem, because it is too slow. That's why, in this case, I need
a way to export an index from one cluster to another: I cannot index
directly on the cluster where the client is (I cannot give more details),
and I don't want to scan the index and reindex it on the new cluster.

You can use the scan API to index data from one index to another.
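
A rough sketch of such a scan-and-reindex loop, in Python. It assumes the REST endpoints of Elasticsearch releases from around the time of this thread: a search with `search_type=scan` and a `scroll` timeout, repeated calls to `_search/scroll`, and `_bulk` on the target cluster. The three callables `start_scan`, `next_page` and `send_bulk` are hypothetical hooks, not part of any Elasticsearch client; you would back them with whatever HTTP library you use. The hit fields `_type`, `_id` and `_source` are assumed to be present, as in search responses of that era. Check the documentation for your version before relying on any of this.

```python
import json


def hits_to_bulk_body(hits, target_index):
    """Build a newline-delimited _bulk request body from search hits."""
    lines = []
    for hit in hits:
        # Action line: index the document under the same type and id.
        action = {"index": {"_index": target_index,
                            "_type": hit["_type"],
                            "_id": hit["_id"]}}
        lines.append(json.dumps(action))
        # Source line: the original document, unchanged.
        lines.append(json.dumps(hit["_source"]))
    # The bulk format expects a newline after the last line as well.
    return "\n".join(lines) + "\n" if lines else ""


def copy_index(start_scan, next_page, send_bulk, target_index):
    """Drain a scan/scroll search and send every page to the target.

    start_scan()        -> scroll id of a freshly opened scan search
    next_page(scroll_id) -> (new scroll id, list of hits); empty list = done
    send_bulk(body)     -> posts the body to the target cluster's _bulk
    """
    scroll_id = start_scan()
    copied = 0
    while True:
        scroll_id, hits = next_page(scroll_id)
        if not hits:
            break
        send_bulk(hits_to_bulk_body(hits, target_index))
        copied += len(hits)
    return copied
```

Keeping the HTTP calls behind the three hooks makes it possible to try the loop against canned pages before pointing it at a real cluster.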

I also have another question: which is the best way for a remote client to
connect to an elasticsearch cluster, the TransportClient or the REST API?
If one uses the TransportClient, will the connections be kept open for the
whole session? Is this a good idea? Is there a connection pool? How many
connections will there be: one for each node in the cluster, if I add them
all to the list of transport addresses?

The TransportClient manages its own connections, and yes, keeps them open.


(Marian) #5

Hello,

I have a question regarding TransportClient. How many connections does the TransportClient manage? Can I configure the number of connections?

I am wondering whether it is reasonable to create my own TransportClientConnectionPool for performance reasons, or whether it is sufficient to leave connection handling to TransportClient.

Thanks for the answer.


(Shay Banon) #6

There is no need for a connection pool with TransportClient; it can safely (and optimally) be used by multiple threads concurrently.

On Tuesday, February 28, 2012 at 11:53 AM, Marian wrote:

Hello,

I have a question regarding TransportClient. How many connections does the
TransportClient manage? Can I configure the number of connections?

I am wondering whether it is reasonable to create my own
TransportClientConnectionPool for performance reasons, or whether it is
sufficient to leave connection handling to TransportClient.

Thanks for the answer.

--
View this message in context: http://elasticsearch-users.115913.n3.nabble.com/Poor-performance-when-clients-and-ES-cluster-not-in-the-same-LAN-tp3448416p3783537.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com (http://Nabble.com).

