I'm not sure if I'm doing things wrong, but I seem to be
encountering some problems adding registered addresses to a
TransportClient. The ScheduledConnectNodeSampler class inside of
TransportClientNodesService.java would block indefinitely trying to
refresh data from a given node. If that node happened to be dead at
the time, the entire client would come to a halt. I've modified it
to take a timeout (in milliseconds) when retrieving nodesInfo, and
that got things running more smoothly, but I don't know whether this
indicates that I'm doing something wrong. Thanks for your help.
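A minimal sketch of the kind of change described above: bounding a blocking call with a timeout so that one unresponsive node cannot stall the caller. It uses only the JDK and is not the Elasticsearch implementation. The names BoundedWait and callWithTimeout are hypothetical, and the task passed in merely stands in for the nodesInfo request.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Elasticsearch.
class BoundedWait {
    /**
     * Runs the task on a worker thread and waits at most timeoutMillis for it.
     * Returns the fallback value if the task does not finish in time.
     */
    static <T> T callWithTimeout(Callable<T> task, long timeoutMillis, T fallback) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<T> future = executor.submit(task);
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return fallback; // the call took too long; give up instead of blocking
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return fallback;
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executor.shutdownNow(); // interrupts the worker if it is still blocked
        }
    }
}
```

With a helper like this, a slow node costs the caller at most the timeout, after which it can move on to the next node.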
I wonder if this would explain the strange hanging behavior I've been seeing
intermittently when trying to index documents. I can make it go away by
blowing away the data repo and starting over, and it may not reoccur for a
long time. The next time it happens, I'll start investigating the node
endpoints to see if it's happening while a node is dead.
-- Eric
On Wed, Jun 8, 2011 at 7:02 PM, Derek Wollenstein derek@klout.com wrote:
[... original message, quoted above ...]
It's not really indefinite. I am not sure why it hangs, but eventually it will fail on the closed socket. Or, if it tries to connect to a broken socket, under some OSes it will take time to identify that the socket is not there to connect to (by default, it has a 30s timeout). Do you really see it as indefinite?
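A connect timeout of this kind can be illustrated with a plain java.net.Socket. The sketch below uses only the JDK and is not taken from Elasticsearch's transport code. ConnectProbe and canConnect are hypothetical names.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Elasticsearch.
class ConnectProbe {
    /**
     * Returns true if a TCP connection to host:port is established within
     * timeoutMillis. Note that Socket.connect treats a timeout of 0 as
     * "wait forever", so callers should pass a positive value.
     */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections and SocketTimeoutException alike.
            return false;
        }
    }
}
```

Without an explicit timeout, how long a connect attempt to an unreachable host takes to fail is left to the operating system, which matches the behaviour described above.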
On Thursday, June 9, 2011 at 4:36 AM, Eric Mill wrote:
[... quoted messages trimmed; both appear in full above ...]
When I say "indefinite", what I mean is that I wrote some code
that does the following (without the rest of the class):

    Settings settings = ImmutableSettings.settingsBuilder()
            .put("cluster.name", clusterName)
            .put(NetworkService.TcpSettings.TCP_CONNECT_TIMEOUT, 1)
            .build();
    TransportClient indexer = new TransportClient(settings);
    String[] hosts = new String[] {"host1", "host2" /* ,... */, "hostN"};
    // Register each host's transport address (port 9300) with the client.
    for (int i = 0; i < hosts.length; i++) {
        LOG.info(" host : " + hosts[i]);
        indexer = indexer.addTransportAddress(
                new InetSocketTransportAddress(hosts[i], 9300));
    }

If I run this I'll see the output

    host: host1
    host: host2
    [... pause for several minutes ...]

And then I'll abort this with Ctrl+C. I've had this run fine
depending on connectivity, but if there's any pause at all it seems to
hang forever. I didn't get anything to improve without modifying the
ScheduledConnectNodeSampler to have an internal timeout.
When you set it to 1, it means 1 millisecond. Can you try setting it to "1s"? Also, when you see this hang, can you take a thread dump so we can see where it's stuck (gist it, please)?
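To illustrate why a bare 1 and "1s" differ, here is a small sketch of a time-setting parser that treats a number without a unit as milliseconds. It is a stand-in written for this example, not Elasticsearch's own parsing code. TimeSetting and parseMillis are hypothetical names, and only the ms, s, m and h suffixes are handled.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical parser for illustration only; not Elasticsearch's implementation.
class TimeSetting {
    /** Converts values such as "1", "500ms", "1s", "2m" or "1h" to milliseconds. */
    static long parseMillis(String value) {
        String v = value.trim().toLowerCase();
        // "ms" must be checked before "s" and "m", since it ends with "s".
        if (v.endsWith("ms")) {
            return Long.parseLong(v.substring(0, v.length() - 2).trim());
        }
        if (v.endsWith("s")) {
            return TimeUnit.SECONDS.toMillis(Long.parseLong(v.substring(0, v.length() - 1).trim()));
        }
        if (v.endsWith("m")) {
            return TimeUnit.MINUTES.toMillis(Long.parseLong(v.substring(0, v.length() - 1).trim()));
        }
        if (v.endsWith("h")) {
            return TimeUnit.HOURS.toMillis(Long.parseLong(v.substring(0, v.length() - 1).trim()));
        }
        // No unit given: the number is taken as milliseconds.
        return Long.parseLong(v);
    }
}
```

Under this convention a connect timeout of 1 leaves a single millisecond to establish a connection, which is rarely enough over a real network.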
On Friday, June 10, 2011 at 12:08 AM, Derek Wollenstein wrote:
[... quoted messages trimmed; all appear in full above ...]
Shay-
I just got this to reproduce itself using default settings. This
is basically what happens when a previously connected node disappears
when using a local gateway and unicast discovery. I attempted to
check for the same lock being waited on by running jstack,
sleeping, and running jstack again.
I've put the apparently hung thread in
Actually, I'll go ahead and update this to be the complete thread
dump.
Threads "elasticsearch[cached]-pool-1-thread-80",
"elasticsearch[cached]-pool-1-thread-79",
"elasticsearch[cached]-pool-1-thread-78", and at least one more are
exhibiting this problem.
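The jstack, sleep, jstack comparison described above can be approximated from inside the JVM. This is only a rough sketch using Thread.getAllStackTraces(). Unlike jstack it does not show which lock a thread is waiting on; it only flags threads that are parked at exactly the same stack in both snapshots. StuckThreadFinder and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical diagnostic helper; an in-process approximation of comparing two jstack dumps.
class StuckThreadFinder {
    /** Records, per thread id, the name, state and full stack of every non-runnable thread. */
    static Map<Long, String> snapshot() {
        Map<Long, String> result = new HashMap<>();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            StackTraceElement[] stack = e.getValue();
            if (t.getState() == Thread.State.RUNNABLE || stack.length == 0) {
                continue; // only waiting or blocked threads are of interest
            }
            result.put(t.getId(), t.getName() + " " + t.getState() + " " + Arrays.toString(stack));
        }
        return result;
    }

    /** Returns the descriptions of threads that look identical in both snapshots. */
    static List<String> unchanged(Map<Long, String> first, Map<Long, String> second) {
        List<String> stuck = new ArrayList<>();
        for (Map.Entry<Long, String> e : first.entrySet()) {
            if (e.getValue().equals(second.get(e.getKey()))) {
                stuck.add(e.getValue());
            }
        }
        return stuck;
    }
}
```

Idle pool threads will also show up as unchanged, so the result is a list of candidates to inspect rather than proof of a hang.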
So, it stays hung on that lock (which basically waits for the response to come back)? Basically, when a node is identified as disconnected (on the client side), it also "releases" all the ongoing messages that have been sent to it. How easy is it to recreate? If it is simple, would you mind terribly jumping on IRC so I can create a debug version of ES to test what's happening?
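The "release" behaviour described above can be sketched as a registry of in-flight requests that fails every request addressed to a node once that node is marked disconnected. This is an illustration of the idea only, not Elasticsearch's transport code. PendingRequests, send and nodeDisconnected are hypothetical names, and responses are modelled as plain strings.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical registry for illustration; not part of Elasticsearch.
class PendingRequests {
    private static final class Pending {
        final String nodeId;
        final CompletableFuture<String> response = new CompletableFuture<>();

        Pending(String nodeId) {
            this.nodeId = nodeId;
        }
    }

    private final Map<Long, Pending> pending = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong();

    /** Registers an in-flight request to the given node and returns the future a caller waits on. */
    CompletableFuture<String> send(String nodeId) {
        Pending p = new Pending(nodeId);
        pending.put(nextId.incrementAndGet(), p);
        return p.response;
    }

    /** Fails every in-flight request for the node so no caller keeps waiting; returns the count released. */
    int nodeDisconnected(String nodeId) {
        int released = 0;
        for (Iterator<Pending> it = pending.values().iterator(); it.hasNext();) {
            Pending p = it.next();
            if (p.nodeId.equals(nodeId)) {
                it.remove();
                p.response.completeExceptionally(
                        new IllegalStateException("node disconnected: " + nodeId));
                released++;
            }
        }
        return released;
    }
}
```

If a disconnect is never detected, nothing calls the release step, and a caller waiting without a timeout would block exactly as in the thread dump discussed here.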
On Wednesday, June 15, 2011 at 1:23 AM, Derek Wollenstein wrote:
[... quoted messages trimmed; all appear in full above ...]