Just to report out tests back to the list, in case somebody else hits this.
If you have time, I would love to nail it down. Ping me on IRC, maybe I
can provide custom builds to test it out.
On Friday, May 20, 2011 at 3:25 AM, Sebastian Gavarini wrote:
Shay,
If you want to debug this on your side I can provide the JVM I am using,
but I think I am going to switch to SUN's, it's the second weird case I've
found (the last one was an infinite loop while preparing an Exception, that
you fixed) and I'd rather be on a safer bet.
Thanks,
Sebastian.
On Thu, May 19, 2011 at 8:45 PM, Sebastian Gavarini sgavarini@gmail.comwrote:
Just for the record, I tried the same test with SUN's JVM 1.6 and I
couldn't reproduce it.
On Thu, May 19, 2011 at 8:07 PM, Sebastian Gavarini sgavarini@gmail.comwrote:
Shay,
I pruned a couple of things, now I can reproduce the problem with an almost
default ES, I just add the p.json mapping you can find in the gist.
The procedure would be:
- install default ES
- copy p.json to ES_HOME/config/mappings/_default
- run ES
- run the "setup" curl commands also in the gist (don't know if they do
anything meaningful for the test apart of creating the index "index1")
- create the SearchLoopExample java file
- run SearchLoopExample
Hopefully you should see the nasty Exception. Keep in mind I am using
JRockit 64 bits for Linux, it could be that too.
Loop causing the exception · GitHub
Sebastian.
On Thu, May 19, 2011 at 5:40 PM, Shay Banon shay.banon@elasticsearch.comwrote:
Yea, then strange that we don't see a failure in the logs...
On Thursday, May 19, 2011 at 11:38 PM, Sebastian Gavarini wrote:
Shay,
I haven't changed anything in the logging, the setting is still as default:
rootLogger: INFO, console, file
logger:
log action execution errors for easier debugging
action: DEBUG
reduce the logging for aws, too much is logged under the default INFO
com.amazonaws: WARN
...
is this what you mean by setting action to DEBUG? If this is the case then
the log isn't showing anyway.
I'll try to simplify the test to recreate the problem now.
Thanks,
Sebastian.
On Wed, May 18, 2011 at 6:11 PM, Shay Banon shay.banon@elasticsearch.comwrote:
Thanks for the effort. Did you change anything in the logging? Such a
failure (on the shard level) should be logged in DEBUG level under action
logger name.
On Thursday, May 19, 2011 at 12:08 AM, Sebastian Gavarini wrote:
I have tried it with 0.16.1 and I see again an Exception. The command line
I am using is "./elasticsearch -f" in "bin"
No specific log anywhere, meaning my Java example, ES command line shell,
or ES log file.
I will try to simplify the code so you can reproduce it.
------------- Response: -------------
Exception in thread "Main Thread" java.util.concurrent.ExecutionException:
org.elasticsearch.action.search.SearchPhaseExecutionException: Failed to
execute phase [query_fetch], total failure; shardFailures
{[GhCJkjAoSe2Xc_SBliG7sw][index-main-dev_ar-servicios-v0][0]:
RemoteTransportException[[Assassin][inet[/10.0.0.1:5000]][search/phase/query+fetch]];
nested: QueryPhaseExecutionException[[index-main-dev_ar-servicios-v0][0]:
query[filtered(random:[0.0 TO
1.0])->FilterCacheFilterWrapper(productos_tipo1:fruta)],from[0],size[100]:
Query Failed [Failed to execute main query]]; nested:
}{[GhCJkjAoSe2Xc_SBliG7sw][index-main-dev_ar-empleos-v0][0]:
RemoteTransportException[[Assassin][inet[/10.0.0.1:5000]][search/phase/query+fetch]];
nested: QueryPhaseExecutionException[[index-main-dev_ar-empleos-v0][0]:
query[filtered(random:[0.0 TO
1.0])->FilterCacheFilterWrapper(productos_tipo1:fruta)],from[0],size[100]:
Query Failed [Failed to execute main query]]; nested:
}{[GhCJkjAoSe2Xc_SBliG7sw][index-main-dev_ar-productos-v0][0]:
RemoteTransportException[[Assassin][inet[/10.0.0.1:5000]][search/phase/query+fetch]];
nested: QueryPhaseExecutionException[[index-main-dev_ar-productos-v0][0]:
query[filtered(random:[0.0 TO
1.0])->FilterCacheFilterWrapper(productos_tipo1:fruta)],from[0],size[100]:
Query Failed [Failed to execute main query]]; nested:
}{[GhCJkjAoSe2Xc_SBliG7sw][index-main-dev_ar-vehiculos-v0][0]:
RemoteTransportException[[Assassin][inet[/10.0.0.1:5000]][search/phase/query+fetch]];
nested: QueryPhaseExecutionException[[index-main-dev_ar-vehiculos-v0][0]:
query[filtered(random:[0.0 TO
1.0])->FilterCacheFilterWrapper(productos_tipo1:fruta)],from[0],size[100]:
Query Failed [Failed to execute main query]]; nested:
}{[GhCJkjAoSe2Xc_SBliG7sw][index-main-dev_ar-inmuebles-v0][0]:
RemoteTransportException[[Assassin][inet[/10.0.0.1:5000]][search/phase/query+fetch]];
nested: QueryPhaseExecutionException[[index-main-dev_ar-inmuebles-v0][0]:
query[filtered(random:[0.0 TO
1.0])->FilterCacheFilterWrapper(productos_tipo1:fruta)],from[0],size[100]:
Query Failed [Failed to execute main query]]; nested: }
at
org.elasticsearch.common.util.concurrent.AbstractFuture$Sync.complete(AbstractFuture.java:320)
at
org.elasticsearch.common.util.concurrent.AbstractFuture$Sync.setException(AbstractFuture.java:297)
at
org.elasticsearch.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:129)
at
org.elasticsearch.action.support.AdapterActionFuture.onFailure(AdapterActionFuture.java:87)
at
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.onFirstPhaseResult(TransportSearchTypeAction.java:248)
at
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.access$400(TransportSearchTypeAction.java:75)
at
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$3.onFailure(TransportSearchTypeAction.java:198)
at
org.elasticsearch.search.action.SearchServiceTransportAction$5.handleException(SearchServiceTransportAction.java:241)
at
org.elasticsearch.transport.netty.MessageChannelHandler.handleException(MessageChannelHandler.java:156)
at
org.elasticsearch.transport.netty.MessageChannelHandler.handlerResponseError(MessageChannelHandler.java:147)
at
org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:99)
at
org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:545)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:754)
at
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:302)
at
org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:317)
at
org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:299)
at
org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:214)
at
org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:545)
at
org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:540)
at
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:274)
at
org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:261)
at
org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:349)
at
org.elasticsearch.common.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:280)
at
org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:200)
at
org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at
org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
at java.lang.Thread.run(Thread.java:619)
Caused by: org.elasticsearch.action.search.SearchPhaseExecutionException:
Failed to execute phase [query_fetch], total failure; shardFailures
{[GhCJkjAoSe2Xc_SBliG7sw][index-main-dev_ar-servicios-v0][0]:
RemoteTransportException[[Assassin][inet[/10.0.0.1:5000]][search/phase/query+fetch]];
nested: QueryPhaseExecutionException[[index-main-dev_ar-servicios-v0][0]:
query[filtered(random:[0.0 TO
1.0])->FilterCacheFilterWrapper(productos_tipo1:fruta)],from[0],size[100]:
Query Failed [Failed to execute main query]]; nested:
}{[GhCJkjAoSe2Xc_SBliG7sw][index-main-dev_ar-empleos-v0][0]:
RemoteTransportException[[Assassin][inet[/10.0.0.1:5000]][search/phase/query+fetch]];
nested: QueryPhaseExecutionException[[index-main-dev_ar-empleos-v0][0]:
query[filtered(random:[0.0 TO
1.0])->FilterCacheFilterWrapper(productos_tipo1:fruta)],from[0],size[100]:
Query Failed [Failed to execute main query]]; nested:
}{[GhCJkjAoSe2Xc_SBliG7sw][index-main-dev_ar-productos-v0][0]:
RemoteTransportException[[Assassin][inet[/10.0.0.1:5000]][search/phase/query+fetch]];
nested: QueryPhaseExecutionException[[index-main-dev_ar-productos-v0][0]:
query[filtered(random:[0.0 TO
1.0])->FilterCacheFilterWrapper(productos_tipo1:fruta)],from[0],size[100]:
Query Failed [Failed to execute main query]]; nested:
}{[GhCJkjAoSe2Xc_SBliG7sw][index-main-dev_ar-vehiculos-v0][0]:
RemoteTransportException[[Assassin][inet[/10.0.0.1:5000]][search/phase/query+fetch]];
nested: QueryPhaseExecutionException[[index-main-dev_ar-vehiculos-v0][0]:
query[filtered(random:[0.0 TO
1.0])->FilterCacheFilterWrapper(productos_tipo1:fruta)],from[0],size[100]:
Query Failed [Failed to execute main query]]; nested:
}{[GhCJkjAoSe2Xc_SBliG7sw][index-main-dev_ar-inmuebles-v0][0]:
RemoteTransportException[[Assassin][inet[/10.0.0.1:5000]][search/phase/query+fetch]];
nested: QueryPhaseExecutionException[[index-main-dev_ar-inmuebles-v0][0]:
query[filtered(random:[0.0 TO
1.0])->FilterCacheFilterWrapper(productos_tipo1:fruta)],from[0],size[100]:
Query Failed [Failed to execute main query]]; nested: }
... 27 more
On Wed, May 18, 2011 at 5:36 PM, Shay Banon shay.banon@elasticsearch.comwrote:
Yes, simplifying things a bit will help, but first, can you give 0.16.1 a
go?
Are you running the server using the script as a standalone process? You
should see those failure logged (some logging was missing that was fixed in
0.16.1).
On Wednesday, May 18, 2011 at 11:34 PM, Sebastian Gavarini wrote:
0.16.0, forgot to tell.
If you want I could try to narrow a bit further the Java example, like
getting rid of alias, and trying to use a stock configuration, right now I
am using a lot of customized parameters in elasticsearch.yml. Most of them
are for indexing and analyzing anyway, and not searching, in this case the
search returns an empty result, but let me know if I can help.
On Wed, May 18, 2011 at 5:13 PM, Shay Banon shay.banon@elasticsearch.comwrote:
Which version are you using? Forgot to ask.
On Wednesday, May 18, 2011 at 11:09 PM, Sebastian Gavarini wrote:
Hi Shay,
Unfortunately I don't see stack traces, I added the server.log to the
previous gist, just some GC activity and the disconnect notification.
I could change some log parameters to debug if you hint me.
Thanks,
Sebastian.
On Wed, May 18, 2011 at 4:31 PM, Shay Banon shay.banon@elasticsearch.comwrote:
There should be a failure logged on the server side with a stack trace, can
you gist it?
On Wednesday, May 18, 2011 at 9:41 PM, Sebastian wrote:
Hi all,
I am having a problem with an exception being thrown in ES. It all
started with a loop that tries to get a fixed size page of data
(document titles) from an index, but in my case there isn't enough
data on the index so it keeps trying to refill an internal buffer,
launching a new search immediately after the previous one completed.
So far so good, my fault with the sizes, it should be a pointless CPU
hog, but instead after a couple of seconds it throws an Exception. I
simplified the code to this gist:
Loop causing the exception · GitHub
I managed to reproduce the exceptions with that code and a couple of
seconds of wait, a bit of context I am using JRockit:
java version "1.6.0_01"
Java(TM) SE Runtime Environment (build 1.6.0_01-b06)
BEA JRockit(R) (build R27.3.1-1-85830-1.6.0_01-20070716-1248-linux-
x86_64, compiled mode)
Two cores desktop 64 bits Linux. I am using an alias, consisting of 5
indices. The termFilter used does not match any document. I was
wandering if it could be some sort of bug. Any help would be welcomed.
Thanks,
Sebastian.