All threads hung and not responding anymore

Elasticsearch version:
2.4.1
Plugins installed: []

JVM version:
1.8
OS version:
Linux
Description of the problem including expected versus actual behavior:
I am running ES search queries using the transport client. After some time I run into a state where there are no responses. I have taken a thread dump and I can see that all the threads are parked, waiting for a signal.
Steps to reproduce:
1. The code is as below:

SearchRequestBuilder setQuery = dataStoreConnection.getTransportClient().prepareSearch()
        .setIndices(indexName)
        .setTypes(mappingName)
        .setQuery(filterBuilder)
        .setSize(0);
if (null != buildMetricsQuery) {
    for (AbstractAggregationBuilder metricsAggregationBuilder : buildMetricsQuery) {
        setQuery.addAggregation(metricsAggregationBuilder);
    }
}
logger.info("Executing query: " + setQuery);
SearchResponse searchResponse = setQuery.get();
logger.info("Response obtained is: " + searchResponse);
2. The thread dump shows the following parked threads:

"pool-1-thread-7" #31 prio=5 os_prio=0 tid=0x00007f75f80d2000 nid=0xbfb waiting on condition [0x00007f75da146000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000dad6dd18> (a org.elasticsearch.common.util.concurrent.BaseFuture$Sync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
        at org.elasticsearch.common.util.concurrent.BaseFuture$Sync.get(BaseFuture.java:280)
        at org.elasticsearch.common.util.concurrent.BaseFuture.get(BaseFuture.java:120)
        at org.elasticsearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:42)
        at org.elasticsearch.action.ActionRequestBuilder.get(ActionRequestBuilder.java:64)
        at net.appcito.reporting.engine.dao.DaoImpl.search(DaoImpl.java:43)
        at net.appcito.reporting.engine.repository.RepositoryImpl.handleBasedOnContext(RepositoryImpl.java:66)
        at net.appcito.reporting.engine.repository.RepositoryImpl.query(RepositoryImpl.java:50)
        at net.appcito.reporting.engine.service.ReportingEngineService.query(ReportingEngineService.java:97)
        at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205)
        at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:133)
        at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:116)
        at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:827)
        at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:738)
        at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85)
        at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:963)
        at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:897)
        at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970)
        at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:872)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
        at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:846)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
        at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:812)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:587)

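A side note first: a blocking get() with no timeout will park the calling thread indefinitely if a response never arrives. The same call can be given a bounded wait so it fails fast instead. A minimal sketch, reusing the setQuery builder and logger from step 1; the 30-second limit and the exception handling are only illustrative:

import org.elasticsearch.ElasticsearchTimeoutException;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.common.unit.TimeValue;

try {
    // Wait at most 30 seconds; if no response arrives in that time, the client
    // throws ElasticsearchTimeoutException instead of parking the thread forever.
    SearchResponse searchResponse = setQuery.get(TimeValue.timeValueSeconds(30));
    logger.info("Response obtained is: " + searchResponse);
} catch (ElasticsearchTimeoutException e) {
    // Illustrative handling: log the timeout and free the worker thread.
    logger.error("Search did not complete within 30 seconds", e);
}
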
There can be many reasons why the search API call is waiting on your side. Perhaps the searches are taking a long time to execute on the ES side.

How many requests are you sending concurrently?
How many nodes do you have in your cluster?
Can you share the output of the following API calls (there is also a Java-client sketch after the list):

  • GET host/_cat/thread_pool
  • GET host/_nodes/hotthreads
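
If it is easier to grab these numbers from the Java side, the nodes stats API of the same transport client exposes roughly the same information. A rough sketch, assuming an ES 2.x client on the classpath; the client and logging plumbing are your own, and method names may differ slightly between minor versions:

import org.elasticsearch.action.admin.cluster.node.stats.NodeStats;
import org.elasticsearch.action.admin.cluster.node.stats.NodesStatsResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.threadpool.ThreadPoolStats;

// Log per-node thread pool activity, queue sizes and rejection counts.
Client client = dataStoreConnection.getTransportClient();
NodesStatsResponse stats = client.admin().cluster()
        .prepareNodesStats()
        .clear()
        .setThreadPool(true)
        .get();
for (NodeStats node : stats.getNodes()) {
    for (ThreadPoolStats.Stats pool : node.getThreadPool()) {
        System.out.println(node.getNode().getName() + " " + pool.getName()
                + " active=" + pool.getActive()
                + " queue=" + pool.getQueue()
                + " rejected=" + pool.getRejected());
    }
}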

I am sending around 20 requests concurrently. There is only one node (it's my dev environment).
My question is:
I am doing a blocking get() on the client side; since I don't get a response from ES, my threads are blocked forever and there are no threads left to handle incoming requests.

But what is happening on the ES side? After some time I should at least get a failure response (that is my expectation), or does ES just drop requests when it is overloaded?

If I restart my client I am able to obtain responses for a while, and then it gets stuck again.

Below are the requested outputs.
10.0.0.17 10.0.0.17 0 0 0 0 0 0 0 0 0
10.0.0.14 10.0.0.14 0 0 0 0 0 0 0 0 40

  ::: {appcito}{OD003IEQTlaKiAIxMHvqlQ}{10.0.0.14}{10.0.0.14:9300}{max_local_storage_nodes=1, master=false}
   Hot threads at 2017-03-31T03:41:00.348Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:
   
    0.0% (133.7micros out of 500ms) cpu usage by thread 'elasticsearch[appcito][transport_client_timer][T#1]{Hashed wheel timer #1}'
     10/10 snapshots sharing following 5 elements
       java.lang.Thread.sleep(Native Method)
       org.jboss.netty.util.HashedWheelTimer$Worker.waitForNextTick(HashedWheelTimer.java:445)
       org.jboss.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:364)
       org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
       java.lang.Thread.run(Thread.java:745)

::: {appcito}{_uOpkRluQ7yo1dMR5Ulj_g}{10.0.0.17}{10.0.0.17:9300}{data=false, max_local_storage_nodes=1, master=true}
   Hot threads at 2017-03-31T03:41:00.349Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:
   
    0.0% (145.8micros out of 500ms) cpu usage by thread 'elasticsearch[appcito][transport_client_timer][T#1]{Hashed wheel timer #1}'
     10/10 snapshots sharing following 5 elements
       java.lang.Thread.sleep(Native Method)
       org.jboss.netty.util.HashedWheelTimer$Worker.waitForNextTick(HashedWheelTimer.java:445)
       org.jboss.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:364)
       org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
       java.lang.Thread.run(Thread.java:745)
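
If those are the default _cat/thread_pool columns, the trailing 40 for 10.0.0.14 would be a rejection count, which means that node has at some point refused work because a queue was full; a rejection is normally reported back to the caller as a failure rather than silently dropped. Independently of that, the client threads do not have to block on get() at all: the search can be executed asynchronously so that any failure is delivered to a listener instead of leaving a pool thread parked. A sketch only, reusing the setQuery builder and logger from the code above:

import org.elasticsearch.action.ActionListener;
import org.elasticsearch.action.search.SearchResponse;

// Non-blocking execution: no application thread parks on BaseFuture$Sync while waiting.
setQuery.execute(new ActionListener<SearchResponse>() {
    @Override
    public void onResponse(SearchResponse response) {
        logger.info("Response obtained is: " + response);
    }

    @Override
    public void onFailure(Throwable t) {
        // Rejected executions, timeouts and node disconnects arrive here instead of hanging a thread.
        logger.error("Search failed", t);
    }
});

The point of the sketch is simply that a rejection or timeout becomes an onFailure call instead of an indefinitely parked pool thread.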
