Connected/Disconnected from node / NodeDisconnectedException

Hi,

I am getting the following messages in the log files of my
application:

2012-04-16 10:39:40,867 DEBUG [org.elasticsearch.transport.netty]
[Hamilton, Bart] Connected to node [[#transport#-1][inet[es.vm.local/
10.1.60.163:9300]]]
2012-04-16 10:44:40,524 DEBUG [org.elasticsearch.transport.netty]
[Hamilton, Bart] Disconnected from [[#transport#-1][inet[es.vm.local/
10.1.60.163:9300]]]
2012-04-16 10:44:41,770 DEBUG [org.elasticsearch.transport.netty]
[Hamilton, Bart] Connected to node [[#transport#-1][inet[es.vm.local/
10.1.60.163:9300]]]
2012-04-16 10:49:41,421 DEBUG [org.elasticsearch.transport.netty]
[Hamilton, Bart] Disconnected from [[#transport#-1][inet[es.vm.local/
10.1.60.163:9300]]]
2012-04-16 10:49:42,691 DEBUG [org.elasticsearch.transport.netty]
[Hamilton, Bart] Connected to node [[#transport#-1][inet[es.vm.local/
10.1.60.163:9300]]]
2012-04-16 10:54:44,668 DEBUG [org.elasticsearch.transport.netty]
[Hamilton, Bart] Disconnected from [[#transport#-1][inet[es.vm.local/
10.1.60.163:9300]]]
2012-04-16 10:54:44,833 INFO [org.elasticsearch.client.transport]
[Hamilton, Bart] failed to get node info for [#transport#-1]
[inet[es.vm.local/10.1.60.163:9300]], disconnecting...
org.elasticsearch.transport.NodeDisconnectedException: []
[inet[es.vm.local/10.1.60.163:9300]][cluster/nodes/info] disconnected
2012-04-16 10:54:50,179 DEBUG [org.elasticsearch.transport.netty]
[Hamilton, Bart] Connected to node [[#transport#-1][inet[es.vm.local/
10.1.60.163:9300]]]
2012-04-16 10:59:50,479 DEBUG [org.elasticsearch.transport.netty]
[Hamilton, Bart] Disconnected from [[#transport#-1][inet[es.vm.local/
10.1.60.163:9300]]]
2012-04-16 10:59:51,776 DEBUG [org.elasticsearch.transport.netty]
[Hamilton, Bart] Connected to node [[#transport#-1][inet[es.vm.local/
10.1.60.163:9300]]]

These messages appear continuously while I am running load tests. Right
around the time when I get the NodeDisconnectedException, my load test
"stutters" and slows down significantly. There are no messages of any
kind in the Elasticsearch logs.

My environment:

ES: version: 0.19.0, 9 nodes, 19 shards, 0 replicas
We have a load balancer that spreads the traffic from 2 java web apps
between 9 ES nodes.
There are 2 instances of java web app (also load balanced)
Each java web app is configured to hit LB, then LB will direct the
request to a particular ES node
The error printed above is from Java web application log
There are no errors or messages of any kind in ES log
Jmeter test hits LB ip address for java web app, which gets
distributed to 1 out of 2 web apps, then the web app hits LB for
Elastic Search, which distributes it between 9 ES nodes.

Can anyone help me understand why I am getting these messages and whether
they are related to the significant drop in performance I am experiencing?

Thanks,
Hovanes

PS.

I am not sure if this will help or is related, but I also see the
following DEBUG messages when I start my Java webapps.

2012-04-16 10:24:37,231 DEBUG
[netty.channel.socket.nio.NioProviderMetadata] Using the autodetected
NIO constraint level: 0
2012-04-16 10:24:38,100 DEBUG [org.elasticsearch.transport.netty]
[Hamilton, Bart] Connected to node [[#transport#-1][inet[es.vm.local/
10.1.60.163:9300]]]
2012-04-16 10:24:38,160 DEBUG [org.elasticsearch.transport.netty]
[Hamilton, Bart] Connected to node [[webprd23.vm.local]
[2pYxPQHBTtGgC4WGGF7n0A][inet[/10.1.250.25:9300]]]
2012-04-16 10:24:38,165 DEBUG [org.elasticsearch.transport.netty]
[Hamilton, Bart] Connected to node [[webprd27.vm.local][9nJr6Kg-
R3KtiOcsex4ohA][inet[/10.1.250.47:9300]]]
2012-04-16 10:24:38,170 DEBUG [org.elasticsearch.transport.netty]
[Hamilton, Bart] Connected to node [[webprd22.vm.local]
[R4Lqab05QdG_VFXUjMIN0w][inet[/10.1.250.10:9300]]]
2012-04-16 10:24:38,176 DEBUG [org.elasticsearch.transport.netty]
[Hamilton, Bart] Connected to node [[webprd26.vm.local]
[qsebreo1TFiXk57MT2iXnw][inet[/10.1.250.46:9300]]]
2012-04-16 10:24:38,182 DEBUG [org.elasticsearch.transport.netty]
[Hamilton, Bart] Connected to node [[webprd28.vm.local]
[gZSTYMc5TJ6ckvMAATZt-w][inet[/10.1.250.48:9300]]]
2012-04-16 10:24:38,183 DEBUG [org.elasticsearch.transport.netty]
[Hamilton, Bart] Connected to node [[webprd21.vm.local]
[teR0loE0Q_GyHVGu0KAvKg][inet[/10.1.250.19:9300]]]
2012-04-16 10:24:38,185 DEBUG [org.elasticsearch.transport.netty]
[Hamilton, Bart] Connected to node [[webprd29.vm.local][hKMc1ziwT-
GRfDvUsD6i_A][inet[/10.1.250.49:9300]]]
2012-04-16 10:24:38,187 DEBUG [org.elasticsearch.transport.netty]
[Hamilton, Bart] Connected to node [[webprd24.vm.local]
[BafyzYORQJWEYveFq6iyLw][inet[/10.1.250.26:9300]]]
2012-04-16 10:24:38,189 DEBUG [org.elasticsearch.transport.netty]
[Hamilton, Bart] Connected to node [[webprd25.vm.local][-
GbQMtzQRFuIDQmdCtT1-w][inet[/10.1.250.45:9300]]]

The 1st "Connected to node" message is the load balancer
([[#transport#-1][inet[es.vm.local/10.1.60.163:9300]]]); the 9 that
follow it are the actual ES nodes (for example
[[webprd27.vm.local][9nJr6Kg-R3KtiOcsex4ohA]).

Are you using sniff mode for the transport client? When you do that, it
will try to directly connect to the nodes and not through the IP address
you provided with the transport client.

Yes we are setting client.transport.sniff to true.

Settings s = ImmutableSettings.settingsBuilder()
    .put( "cluster.name", <cluster_name> )
    .put( "client.transport.sniff", true )
    .build();
// addTransportAddress is chained directly onto the constructor call
TransportClient client = new TransportClient( s )
    .addTransportAddress(
        new InetSocketTransportAddress( <load_balancer>, 9300 ) );

We thought we had to do this because we only add the load balancer's
transport address (we don't add the actual nodes). Should we not do that
in this scenario?

On Apr 17, 6:49 am, Shay Banon kim...@gmail.com wrote:

Are you using sniff mode for the transport client? When you do that, it
will try to directly connect to the nodes and not through the IP address
you provided with the transport client.

If you have a load balancer and the client can't connect directly to the
actual cluster nodes, then sniff won't really help. If the client can
only connect to the cluster nodes through the load balancer, then it should
use only that address, without sniff.

On Tue, Apr 17, 2012 at 8:52 PM, Hovanes hovo73@gmail.com wrote:

Yes we are setting client.transport.sniff to true.

Settings s = ImmutableSettings.settingsBuilder()
    .put( "cluster.name", <cluster_name> )
    .put( "client.transport.sniff", true )
    .build();
// addTransportAddress is chained directly onto the constructor call
TransportClient client = new TransportClient( s )
    .addTransportAddress(
        new InetSocketTransportAddress( <load_balancer>, 9300 ) );

We thought we had to do this because we only add the load balancer's
transport address (we don't add the actual nodes). Should we not do that
in this scenario?

The client can connect to the actual nodes: in the scenario described
above, the actual nodes get discovered when we start our app. However,
for some reason under load we keep getting node Disconnected/Connected
messages, and the throughput goes down sharply. Can you help me
understand 2 things:

  1. What is the recommended pattern for creating and using a Client?
    Right now we create a Client when the program starts, and we always
    use that client. We never close it. Is this acceptable, or should we
    create a client for every request?

  2. When would one use a Transport Client vs. a Node Client? Does one
    offer any advantages or disadvantages over the other? Right now we
    are using Transport; will switching to Node offer better performance?

Thanks,
Hovanes

On Apr 19, 6:33 am, Shay Banon kim...@gmail.com wrote:

If you have a load balancer and the client can't connect directly to the
actual cluster nodes, then sniff won't really help. If the client can
only connect to the cluster nodes through the load balancer, then it should
use only that address, without sniff.

If you can turn on logging on the transport client, you should be able to
see why it's disconnecting.
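
The DEBUG lines already posted carry enough timing information to check whether the disconnects follow a pattern. Below is a minimal sketch in plain Java (8 or later) that pairs each "Connected to node" line with the next "Disconnected from" line and reports how long the connection lasted. The class name ConnectionLifetimes and the method lifetimes are made up for illustration and are not part of Elasticsearch. The sketch assumes each log event has been joined back onto a single line (the mail client wrapped them above) and that only events for one address are passed in. The sample lines in main are condensed from the original post.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for reading the posted logs; not an Elasticsearch class.
final class ConnectionLifetimes {
    // Timestamp layout used in the posted log lines, e.g. "2012-04-16 10:39:40,867".
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");
    private static final int TS_LENGTH = 23;

    private ConnectionLifetimes() {}

    /** Returns the time between each connect event and the disconnect that follows it. */
    static List<Duration> lifetimes(List<String> lines) {
        List<Duration> result = new ArrayList<>();
        LocalDateTime connectedAt = null;
        for (String line : lines) {
            boolean connected = line.contains("Connected to node");
            boolean disconnected = line.contains("Disconnected from");
            if ((!connected && !disconnected) || line.length() < TS_LENGTH) {
                continue; // not a connection event, e.g. an exception line
            }
            LocalDateTime when = LocalDateTime.parse(line.substring(0, TS_LENGTH), TS);
            if (connected) {
                connectedAt = when;
            } else if (connectedAt != null) {
                result.add(Duration.between(connectedAt, when));
                connectedAt = null;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Condensed copies of the load balancer events from the original post.
        List<String> lines = Arrays.asList(
                "2012-04-16 10:39:40,867 DEBUG Connected to node [[#transport#-1]]",
                "2012-04-16 10:44:40,524 DEBUG Disconnected from [[#transport#-1]]",
                "2012-04-16 10:44:41,770 DEBUG Connected to node [[#transport#-1]]",
                "2012-04-16 10:49:41,421 DEBUG Disconnected from [[#transport#-1]]",
                "2012-04-16 10:49:42,691 DEBUG Connected to node [[#transport#-1]]",
                "2012-04-16 10:54:44,668 DEBUG Disconnected from [[#transport#-1]]",
                "2012-04-16 10:54:50,179 DEBUG Connected to node [[#transport#-1]]",
                "2012-04-16 10:59:50,479 DEBUG Disconnected from [[#transport#-1]]");
        for (Duration d : lifetimes(lines)) {
            System.out.println(d.toMillis() / 1000.0 + " s");
        }
    }
}
```

For the four connect/disconnect pairs in the original post, the lifetimes come out at about 299.7 s, 299.7 s, 302.0 s and 300.3 s, so roughly five minutes each. An interval that regular may hint at a timer somewhere on the path (an idle timeout on the load balancer would be one candidate), but that is only a guess until more detailed logs confirm it.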

On Fri, Apr 20, 2012 at 2:00 AM, Hovanes hovo73@gmail.com wrote:

The client can connect to the actual nodes: in the scenario described
above, the actual nodes get discovered when we start our app. However,
for some reason under load we keep getting node Disconnected/Connected
messages, and the throughput goes down sharply. Can you help me
understand 2 things:

  1. What is the recommended pattern for creating and using a Client?
    Right now we create a Client when the program starts, and we always
    use that client. We never close it. Is this acceptable, or should we
    create a client for every request?

That's the way you should use it: create the client once and reuse it.
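
To make the "create once, reuse, close on shutdown" pattern concrete, here is a minimal sketch in plain Java (8 or later). SharedClient is a hypothetical holder class, not an Elasticsearch API. The type parameter T stands in for whatever client type the application uses, and the factory and closer arguments are assumptions about how the application builds and releases it.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical holder: creates the client lazily, hands out the same instance
// to every caller, and releases it once when the application shuts down.
final class SharedClient<T> {
    private final Supplier<T> factory; // builds the client (called at most once per lifetime)
    private final Consumer<T> closer;  // releases the client's resources
    private T instance;

    SharedClient(Supplier<T> factory, Consumer<T> closer) {
        this.factory = factory;
        this.closer = closer;
    }

    /** Returns the shared client, creating it on first use. */
    synchronized T get() {
        if (instance == null) {
            instance = factory.get();
        }
        return instance;
    }

    /** Closes the shared client if it was created; safe to call more than once. */
    synchronized void close() {
        if (instance != null) {
            closer.accept(instance);
            instance = null;
        }
    }
}
```

An application would construct one SharedClient at startup, pass a factory that builds the client and a closer that calls the client's close method, use get() from request handlers, and call close() during shutdown.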

  2. When would one use a Transport Client vs. a Node Client? Does one
    offer any advantages or disadvantages over the other? Right now we
    are using Transport; will switching to Node offer better performance?

The node client joins the cluster and is aware of the distribution of data
across it; the transport client delegates that to another node in the
cluster itself.

Thanks,
Hovanes

On Apr 19, 6:33 am, Shay Banon kim...@gmail.com wrote:

If you ahve a load balancer, and the client can't connect directly to the
actual cluster nodes, then sniff won't really help... . If the client can
only connect through the loadbalancer to the cluster nodes, then it
should
only use that address without sniff.

On Tue, Apr 17, 2012 at 8:52 PM, Hovanes hov...@gmail.com wrote:

Yes we are setting client.transport.sniff to true.

Settings s = ImmutableSettings.settingsBuilder()
.put( "cluster.name", <cluster_name> )
.put( "client.transport.sniff", true )
.build();
TransportClient client = new TransportClient( s );
.addTransportAddress(
new InetSocketTransportAddress( <load_balancer>, 9300 )
);

We thought we have to do it since we only add load balancer transport
address (we don't add the actual nodes). Should we not do that in this
scenario?

On Apr 17, 6:49 am, Shay Banon kim...@gmail.com wrote:

Are you using sniff mode for the transport client? When you do that,
it
will try to directly connect to the nodes and not through the IP
address
you provided with the transport client.

On Mon, Apr 16, 2012 at 9:27 PM, Hovanes hov...@gmail.com wrote:

Hi,

I am getting the following messages in the log files of my
application:

2012-04-16 10:39:40,867 DEBUG [org.elasticsearch.transport.netty] [Hamilton, Bart] Connected to node [[#transport#-1][inet[es.vm.local/10.1.60.163:9300]]]
2012-04-16 10:44:40,524 DEBUG [org.elasticsearch.transport.netty] [Hamilton, Bart] Disconnected from [[#transport#-1][inet[es.vm.local/10.1.60.163:9300]]]
2012-04-16 10:44:41,770 DEBUG [org.elasticsearch.transport.netty] [Hamilton, Bart] Connected to node [[#transport#-1][inet[es.vm.local/10.1.60.163:9300]]]
2012-04-16 10:49:41,421 DEBUG [org.elasticsearch.transport.netty] [Hamilton, Bart] Disconnected from [[#transport#-1][inet[es.vm.local/10.1.60.163:9300]]]
2012-04-16 10:49:42,691 DEBUG [org.elasticsearch.transport.netty] [Hamilton, Bart] Connected to node [[#transport#-1][inet[es.vm.local/10.1.60.163:9300]]]
2012-04-16 10:54:44,668 DEBUG [org.elasticsearch.transport.netty] [Hamilton, Bart] Disconnected from [[#transport#-1][inet[es.vm.local/10.1.60.163:9300]]]
2012-04-16 10:54:44,833 INFO [org.elasticsearch.client.transport] [Hamilton, Bart] failed to get node info for [#transport#-1][inet[es.vm.local/10.1.60.163:9300]], disconnecting...
org.elasticsearch.transport.NodeDisconnectedException: [inet[es.vm.local/10.1.60.163:9300]][cluster/nodes/info] disconnected
2012-04-16 10:54:50,179 DEBUG [org.elasticsearch.transport.netty] [Hamilton, Bart] Connected to node [[#transport#-1][inet[es.vm.local/10.1.60.163:9300]]]
2012-04-16 10:59:50,479 DEBUG [org.elasticsearch.transport.netty] [Hamilton, Bart] Disconnected from [[#transport#-1][inet[es.vm.local/10.1.60.163:9300]]]
2012-04-16 10:59:51,776 DEBUG [org.elasticsearch.transport.netty] [Hamilton, Bart] Connected to node [[#transport#-1][inet[es.vm.local/10.1.60.163:9300]]]

These messages appear continuously while I am running load tests. Right
around the time when I get the NodeDisconnectedException, my load test
"stutters" and slows down significantly. There are no messages of any
kind in the ElasticSearch logs.
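
One regularity in the excerpt above is that each connection to the load balancer address lasts almost exactly five minutes before the "Disconnected from" line appears. The sketch below shows one way to pull those durations out of such a log; the class and method names are made up for illustration, and it assumes every line starts with a 23-character timestamp in the format shown above.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ConnectionIntervals {
    // Timestamp layout used in the log excerpt, e.g. "2012-04-16 10:39:40,867".
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    /** Parses the leading 23-character timestamp of a log line. */
    static LocalDateTime timestampOf(String line) {
        return LocalDateTime.parse(line.substring(0, 23), TS);
    }

    /**
     * Returns how long each connection lasted: the gap between a
     * "Connected to node" line and the next "Disconnected from" line.
     */
    static List<Duration> connectionLifetimes(List<String> lines) {
        List<Duration> lifetimes = new ArrayList<>();
        LocalDateTime connectedAt = null;
        for (String line : lines) {
            if (line.contains("Connected to node")) {
                connectedAt = timestampOf(line);
            } else if (line.contains("Disconnected from") && connectedAt != null) {
                lifetimes.add(Duration.between(connectedAt, timestampOf(line)));
                connectedAt = null;
            }
        }
        return lifetimes;
    }

    public static void main(String[] args) {
        String node = "[[#transport#-1][inet[es.vm.local/10.1.60.163:9300]]]";
        String prefix = " DEBUG [org.elasticsearch.transport.netty] [Hamilton, Bart] ";
        // First two connect/disconnect pairs from the excerpt above.
        List<String> lines = Arrays.asList(
                "2012-04-16 10:39:40,867" + prefix + "Connected to node " + node,
                "2012-04-16 10:44:40,524" + prefix + "Disconnected from " + node,
                "2012-04-16 10:44:41,770" + prefix + "Connected to node " + node,
                "2012-04-16 10:49:41,421" + prefix + "Disconnected from " + node);
        for (Duration d : connectionLifetimes(lines)) {
            System.out.println(d.toMillis() + " ms");
        }
    }
}
```

Running the same method over the full excerpt gives a quick check of whether the disconnects follow a fixed interval or are scattered.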

My environment:

ES version: 0.19.0, 9 nodes, 19 shards, 0 replicas
We have a load balancer that spreads the traffic from 2 Java web apps
between 9 ES nodes.
There are 2 instances of the Java web app (also load balanced).
Each Java web app is configured to hit the LB, then the LB will direct
the request to a particular ES node.
The error printed above is from the Java web application log.
There are no errors or messages of any kind in the ES log.
The JMeter test hits the LB IP address for the Java web app, which gets
distributed to 1 out of 2 web apps; then the web app hits the LB for
ElasticSearch, which distributes it between 9 ES nodes.

Can anyone help me understand why I am getting these messages and whether
they are related to the significant drop in performance I am experiencing?

Thanks,
Hovanes

PS.

I am not sure if this will help or is related, but I also see the
following DEBUG messages when I start my Java webapps.

2012-04-16 10:24:37,231 DEBUG [netty.channel.socket.nio.NioProviderMetadata] Using the autodetected NIO constraint level: 0
2012-04-16 10:24:38,100 DEBUG [org.elasticsearch.transport.netty] [Hamilton, Bart] Connected to node [[#transport#-1][inet[es.vm.local/10.1.60.163:9300]]]
2012-04-16 10:24:38,160 DEBUG [org.elasticsearch.transport.netty] [Hamilton, Bart] Connected to node [[webprd23.vm.local][2pYxPQHBTtGgC4WGGF7n0A][inet[/10.1.250.25:9300]]]
2012-04-16 10:24:38,165 DEBUG [org.elasticsearch.transport.netty] [Hamilton, Bart] Connected to node [[webprd27.vm.local][9nJr6Kg-R3KtiOcsex4ohA][inet[/10.1.250.47:9300]]]
2012-04-16 10:24:38,170 DEBUG [org.elasticsearch.transport.netty] [Hamilton, Bart] Connected to node [[webprd22.vm.local][R4Lqab05QdG_VFXUjMIN0w][inet[/10.1.250.10:9300]]]
2012-04-16 10:24:38,176 DEBUG [org.elasticsearch.transport.netty] [Hamilton, Bart] Connected to node [[webprd26.vm.local][qsebreo1TFiXk57MT2iXnw][inet[/10.1.250.46:9300]]]
2012-04-16 10:24:38,182 DEBUG [org.elasticsearch.transport.netty] [Hamilton, Bart] Connected to node [[webprd28.vm.local][gZSTYMc5TJ6ckvMAATZt-w][inet[/10.1.250.48:9300]]]
2012-04-16 10:24:38,183 DEBUG [org.elasticsearch.transport.netty] [Hamilton, Bart] Connected to node [[webprd21.vm.local][teR0loE0Q_GyHVGu0KAvKg][inet[/10.1.250.19:9300]]]
2012-04-16 10:24:38,185 DEBUG [org.elasticsearch.transport.netty] [Hamilton, Bart] Connected to node [[webprd29.vm.local][hKMc1ziwT-GRfDvUsD6i_A][inet[/10.1.250.49:9300]]]
2012-04-16 10:24:38,187 DEBUG [org.elasticsearch.transport.netty] [Hamilton, Bart] Connected to node [[webprd24.vm.local][BafyzYORQJWEYveFq6iyLw][inet[/10.1.250.26:9300]]]
2012-04-16 10:24:38,189 DEBUG [org.elasticsearch.transport.netty] [Hamilton, Bart] Connected to node [[webprd25.vm.local][-GbQMtzQRFuIDQmdCtT1-w][inet[/10.1.250.45:9300]]]

The 1st "Connected to node" message is the load balancer
([[#transport#-1][inet[es.vm.local/10.1.60.163:9300]]]); the 9 that
follow it are the actual ES nodes ( [[webprd21.vm.local][9nJr6Kg-R3KtiOcsex4ohA])