Currently I'm running 2 Elasticsearch servers in one cluster, and I've also started a web server that runs a client node in the same cluster.
Then I started a map-reduce operation that inserts a couple of thousand records into the cluster, and the process often hits a connect timeout error
and fails. To recover, we have to restart the Elasticsearch servers.
I wonder if someone can share some experience with Elasticsearch in batch update processes. Are there any limits? How can we avoid these problems?
By the way, our process just grabs a Node Client when it starts and closes it at the end of the process, and our web server keeps its client open the whole time. Is that the correct usage?
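For what it's worth, the usual first mitigation for transient connect timeouts during batch indexing is to retry the failing call with backoff instead of restarting the servers. A minimal generic sketch in Java; the `Callable` stands in for whatever bulk-insert call the map task makes (the Elasticsearch types are deliberately left out so the sketch stays self-contained):

```java
import java.util.concurrent.Callable;

// Generic retry-with-backoff helper; in a real map task the Callable
// would wrap the indexing call that is timing out.
public class Retry {
    public static <T> T withBackoff(Callable<T> op, int maxAttempts, long baseDelayMs)
            throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {               // e.g. a connect timeout
                last = e;
                Thread.sleep(baseDelayMs << attempt); // 10ms, 20ms, 40ms, ...
            }
        }
        throw last; // all attempts failed
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Fails twice, then succeeds -- simulates a flaky connection.
        String result = withBackoff(() -> {
            if (++calls[0] < 3) throw new RuntimeException("connect timeout");
            return "indexed";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
        // -> indexed after 3 attempts
    }
}
```

Retrying papers over the symptom, though; the cluster-side causes discussed below still need fixing.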
Yes, I use client-only nodes, but the map-reduce process creates many clients. The process succeeded after I restarted each server in the Elasticsearch cluster and reran it, but there seems to be a reliability problem in the cluster if I want to run the process every day: after a couple of runs, the client nodes have trouble joining the cluster because they can't connect to the servers.
I'm running two small instances on Amazon EC2 as the cluster, with about 10 indices and about 15,000 records across 55 shards. Is there any config I need to set?
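[Editor's note: on EC2, two settings commonly cause exactly this "clients can't rejoin" symptom. A hedged sketch of the relevant elasticsearch.yml fragment, using the 0.19-era setting names; the host IPs are placeholders, and note that `minimum_master_nodes: 2` trades availability for split-brain safety on a two-node cluster. Separately, 55 shards for ~15,000 records on two small instances is far more shards than that data volume needs, so reducing shards per index is worth considering.]

```yaml
# elasticsearch.yml on both server nodes (0.19-era setting names --
# verify against the docs for your version; IPs below are placeholders)

# Avoid split-brain between the two master-eligible nodes:
discovery.zen.minimum_master_nodes: 2

# EC2 blocks multicast, so list the servers explicitly:
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["10.0.0.1", "10.0.0.2"]
```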
Regards,
Bruce Zhou
On 18/06/2012, at 12:56 PM, David Pilato wrote:
Is your batch node a "client only" node? It must not handle data.
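[Editor's note: a "client only" node joins the cluster but never holds shards or becomes master. A sketch of the equivalent elasticsearch.yml fragment for the batch node, using the 0.19-era setting names (in the Java API of that era, `NodeBuilder.nodeBuilder().client(true)` set the same thing programmatically):]

```yaml
# On the batch/client node only:
node.data: false    # never hold shards
node.master: false  # never be elected master
```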
I have 2 servers and 1 batch, but the batch is a Hadoop map-reduce process that starts many clients. We have 6 map tasks on 2 instances, so it starts 6 clients.
On 18/06/2012, at 1:57 PM, David Pilato wrote:
Something I don't understand: how many nodes and how many clients do you start?
You have 2 nodes and 1 batch?
In your batch, you should have only one node (only one client).
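[Editor's note: one way to follow this advice is to share a single lazily created client per JVM, so map tasks or threads reuse one instance instead of each opening its own connection to the cluster. (Classic Hadoop may run each task in its own JVM, in which case this still reduces the count to one client per task rather than one per record batch.) A generic sketch; `Supplier` stands in for whatever creates the real Node client:]

```java
import java.util.function.Supplier;

// One shared client per JVM: callers reuse the same instance instead of
// each creating their own. Double-checked locking makes creation happen
// exactly once even under concurrent access.
public class SharedClient<T> {
    private final Supplier<T> factory;
    private volatile T instance;

    public SharedClient(Supplier<T> factory) { this.factory = factory; }

    public T get() {
        T local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) {
                    instance = local = factory.get(); // created once
                }
            }
        }
        return local;
    }

    public static void main(String[] args) {
        final int[] created = {0};
        SharedClient<String> holder =
                new SharedClient<>(() -> "client#" + (++created[0]));
        String a = holder.get();
        String b = holder.get();
        System.out.println(a + " " + b + " created=" + created[0]);
        // -> client#1 client#1 created=1
    }
}
```

Remember to close the shared client once, at process shutdown, rather than per task.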