Message not fully read (request) for [] and action [] log messages since upgrade to 19.0


(Owen Butler-2) #1

On a few instances, I'm seeing repeated log messages in the form:

[2012-03-06 16:57:40,079][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [38951] and
action [], resetting
[2012-03-06 16:57:42,431][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [147443] and
action [], resetting
[2012-03-06 16:57:42,748][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [146541] and
action [], resetting
[2012-03-06 16:57:44,980][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [36339] and
action [], resetting
[2012-03-06 16:57:45,080][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [38953] and
action [], resetting
[2012-03-06 16:57:47,434][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [147445] and
action [], resetting
[2012-03-06 16:57:47,772][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [146543] and
action [], resetting
[2012-03-06 16:57:50,013][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [36341] and
action [], resetting

Each node in the cluster logs similar (but not exactly the same
number) messages.

I don't see this in instances that have been "nuked" while upgrading,
IE, testing instances where the data directory has been cleared while
upgrading. Only on instances that have seen an upgrade from 18.7.

O


(Owen Butler-2) #2

I don't see this in instances that have been "nuked" while upgrading,
IE, testing instances where the data directory has been cleared while
upgrading. Only on instances that have seen an upgrade from 18.7.

Actually, cancel that. I now see it in an instance where the data
directory had been emptied during the downtime. Must be some other factor.


(Shay Banon) #3

Do you have clients / nodes that run 0.18 somewhere?

On Tuesday, March 6, 2012 at 8:01 AM, Owen Butler wrote:

On a few instances, I'm seeing repeated log messages in the form:

[2012-03-06 16:57:40,079][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [38951] and
action [], resetting
[2012-03-06 16:57:42,431][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [147443] and
action [], resetting
[2012-03-06 16:57:42,748][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [146541] and
action [], resetting
[2012-03-06 16:57:44,980][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [36339] and
action [], resetting
[2012-03-06 16:57:45,080][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [38953] and
action [], resetting
[2012-03-06 16:57:47,434][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [147445] and
action [], resetting
[2012-03-06 16:57:47,772][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [146543] and
action [], resetting
[2012-03-06 16:57:50,013][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [36341] and
action [], resetting

Each node in the cluster logs similar (but not exactly the same
number) messages.

I don't see this in instances that have been "nuked" while upgrading,
IE, testing instances where the data directory has been cleared while
upgrading. Only on instances that have seen an upgrade from 18.7.

O


(Owen Butler-2) #4

Almost certainly other nodes on the network running 18, or even 17.9.
Equally, they would be configured with unique cluster names so wouldn't be
trying to join the cluster, but perhaps pinging about to autodiscover?

Now that you mention it, the rate of messages on the first instance I saw
this on has slowed over time (perhaps as I go through and upgrade the other
small clusters on the same network).

Additionally, the cluster where I did not see this, definitely doesn't
have any other older nodes on the network (on it's own network).

On Tuesday, March 6, 2012 5:32:01 PM UTC+11, kimchy wrote:

Do you have clients / nodes that run 0.18 somewhere?

On Tuesday, March 6, 2012 at 8:01 AM, Owen Butler wrote:

On a few instances, I'm seeing repeated log messages in the form:

[2012-03-06 16:57:40,079][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [38951] and
action [], resetting
[2012-03-06 16:57:42,431][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [147443] and
action [], resetting
[2012-03-06 16:57:42,748][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [146541] and
action [], resetting
[2012-03-06 16:57:44,980][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [36339] and
action [], resetting
[2012-03-06 16:57:45,080][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [38953] and
action [], resetting
[2012-03-06 16:57:47,434][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [147445] and
action [], resetting
[2012-03-06 16:57:47,772][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [146543] and
action [], resetting
[2012-03-06 16:57:50,013][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [36341] and
action [], resetting

Each node in the cluster logs similar (but not exactly the same
number) messages.

I don't see this in instances that have been "nuked" while upgrading,
IE, testing instances where the data directory has been cleared while
upgrading. Only on instances that have seen an upgrade from 18.7.

O


(Shay Banon) #5

Yes, the pinging method (if you use multicast) might cause this.

On Tuesday, March 6, 2012 at 8:55 AM, Owen Butler wrote:

Almost certainly other nodes on the network running 18, or even 17.9. Equally, they would be configured with unique cluster names so wouldn't be trying to join the cluster, but perhaps pinging about to autodiscover?

Now that you mention it, the rate of messages on the first instance I saw this on has slowed over time (perhaps as I go through and upgrade the other small clusters on the same network).

Additionally, the cluster where I did not see this, definitely doesn't have any other older nodes on the network (on it's own network).

On Tuesday, March 6, 2012 5:32:01 PM UTC+11, kimchy wrote:

Do you have clients / nodes that run 0.18 somewhere?

On Tuesday, March 6, 2012 at 8:01 AM, Owen Butler wrote:

On a few instances, I'm seeing repeated log messages in the form:

[2012-03-06 16:57:40,079][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [38951] and
action [], resetting
[2012-03-06 16:57:42,431][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [147443] and
action [], resetting
[2012-03-06 16:57:42,748][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [146541] and
action [], resetting
[2012-03-06 16:57:44,980][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [36339] and
action [], resetting
[2012-03-06 16:57:45,080][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [38953] and
action [], resetting
[2012-03-06 16:57:47,434][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [147445] and
action [], resetting
[2012-03-06 16:57:47,772][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [146543] and
action [], resetting
[2012-03-06 16:57:50,013][WARN ][transport.netty ]
[index2.qa.acx] Message not fully read (request) for [36341] and
action [], resetting

Each node in the cluster logs similar (but not exactly the same
number) messages.

I don't see this in instances that have been "nuked" while upgrading,
IE, testing instances where the data directory has been cleared while
upgrading. Only on instances that have seen an upgrade from 18.7.

O


(system) #6