thanks for the response. i should have provided more details. by nothing
in the logs, i mean nothing around the time of the hang. i have a monit task
that runs the curl-based health check every minute, so i know when it hangs.
there are other indicators in the logs: i see long gc times and exceptions.
occasionally it does not just hang but crashes fully due to OOM. here
are some of the messages:
[2013-11-01 00:00:12,910][WARN ][monitor.jvm ] [Marrina] [gc][ParNew][54839][23098] duration [3.7s], collections [2]/[20.7s], total [3.7s]/[40.7m], memory [7.9gb]->[7.8gb]/[7.9gb], all_pools {[Code Cache] [12.4mb]->[12.4mb]/[48mb]}{[Par Eden Space] [3.3mb]->[13.1mb]/[66.5mb]}{[Par Survivor Space] [7mb]->[0b]/[8.3mb]}{[CMS Old Gen] [7.8gb]->[7.8gb]/[7.9gb]}{[CMS Perm Gen] [44.7mb]->[44.7mb]/[168mb]}
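to make the gc line above easier to read, here is a rough python sketch that pulls the per-pool numbers out of it and works out how full each pool is after the collection. the names (to_bytes, pool_usage) and the regex are made up for this illustration and are not part of elasticsearch. i am also assuming the sizes are powers of 1024, which does not change the ratios.

```python
import re

# Assumed unit multipliers; only the ratios matter below, so the exact base cancels out.
UNITS = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}

# Matches one pool entry such as {[CMS Old Gen] [7.8gb]->[7.8gb]/[7.9gb]}
POOL = re.compile(r"\{\[([^\]]+)\]\s*\[([^\]]+)\]->\[([^\]]+)\]/\[([^\]]+)\]\}")


def to_bytes(size: str) -> float:
    """Convert a size like '7.8gb' or '0b' to bytes."""
    match = re.fullmatch(r"([\d.]+)(b|kb|mb|gb)", size)
    if match is None:
        raise ValueError(f"unrecognised size: {size!r}")
    return float(match.group(1)) * UNITS[match.group(2)]


def pool_usage(line: str) -> dict:
    """Map each pool name to (used after GC) / (pool maximum)."""
    usage = {}
    for name, _before, after, maximum in POOL.findall(line):
        usage[name] = to_bytes(after) / to_bytes(maximum)
    return usage


# The log line from this message, joined back into a single string.
LINE = (
    "[2013-11-01 00:00:12,910][WARN ][monitor.jvm ] [Marrina] "
    "[gc][ParNew][54839][23098] duration [3.7s], collections [2]/[20.7s], "
    "total [3.7s]/[40.7m], memory [7.9gb]->[7.8gb]/[7.9gb], all_pools "
    "{[Code Cache] [12.4mb]->[12.4mb]/[48mb]}"
    "{[Par Eden Space] [3.3mb]->[13.1mb]/[66.5mb]}"
    "{[Par Survivor Space] [7mb]->[0b]/[8.3mb]}"
    "{[CMS Old Gen] [7.8gb]->[7.8gb]/[7.9gb]}"
    "{[CMS Perm Gen] [44.7mb]->[44.7mb]/[168mb]}"
)

if __name__ == "__main__":
    for pool, ratio in pool_usage(LINE).items():
        print(f"{pool}: {ratio:.1%} full after GC")
```

for this line the old gen comes out at about 99% full both before and after the collection, so the gc is freeing almost nothing there.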
*
*
[2013-11-01 03:10:33,244][WARN ][search.action ] [Marrina] Failed to send release search context
org.elasticsearch.transport.NodeDisconnectedException: [Star-Lord][inet[/10.6.14.94:9300]][search/freeContext] disconnected
[2013-11-01 03:10:35,293][WARN ][search.action ] [Marrina] Failed to send release search context
org.elasticsearch.transport.SendRequestTransportException: [Star-Lord][inet[/10.6.14.94:9300]][search/freeContext]
is it because i have more data and operations than the nodes i have
can support? it's a 3 node cluster (only 2 data nodes though) running on aws
ec2 m1.xlarge, with an 8GB heap dedicated to each node.
thanks
On Thu, Oct 31, 2013 at 11:27 PM, Ivan Brusic ivan@brusic.com wrote:
Is there truly nothing in the logs? How about frequent garbage
collections? Do you have any monitoring?
Without knowing anything more, I would strongly suggest upgrading to the
latest release. The memory improvements in 0.90.x are truly remarkable
thanks to Lucene 4 and other changes. I am not one to upgrade frequently (I
am still on version 0.90.2), but the Lucene 4 based version is the way to
go.
Cheers,
Ivan
On Thu, Oct 31, 2013 at 11:42 AM, T Vinod Gupta tvinod@readypulse.com wrote:
hi,
i have a running cluster of 3 nodes (ES 0.20.0), and once every day or two it
hangs for me. by hang, i mean a curl command for a health check, get, or
search hangs forever. the log files don't have any clue on this.
what can i do to debug this further? will upgrading to lucene 4 based
versions help?
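to show what i mean by the health check hanging, here is a minimal python sketch of the same kind of probe with a timeout, so the check itself gives up instead of blocking forever. it assumes the node answers http on localhost:9200 (the default port) and uses the _cluster/health endpoint. the function name check_health and the 10 second timeout are arbitrary choices for illustration, not what my monit task actually runs.

```python
import http.client
import json
import urllib.request


def check_health(url="http://localhost:9200/_cluster/health", timeout=10.0):
    """Return (state, detail) for one health probe.

    `timeout` applies to each blocking socket operation, so the call
    gives up after roughly that many seconds instead of hanging.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = json.load(resp)
    except (OSError, http.client.HTTPException, ValueError) as exc:
        # OSError covers URLError, refused connections and socket timeouts;
        # ValueError covers a response body that is not valid JSON.
        return "down", repr(exc)
    if not isinstance(body, dict):
        return "unknown", body
    # The cluster health response carries a "status" field.
    return body.get("status", "unknown"), body


if __name__ == "__main__":
    state, detail = check_health()
    print(state)
```

a state of "down" here would correspond to the curl command never coming back, and a monitoring task could log the timestamp to line it up against the gc warnings.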
thanks
--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.