Elasticsearch: all search threads are waiting in the take() method


(Sekhar) #1

Hi Team,

Elasticsearch: 6.0.0
Operating system: CentOS 7
JVM arguments [-Xms4g, -Xmx4g, -XX:MaxMetaspaceSize=512m, -XX:+UseParNewGC, -XX:NewSize=128m, -XX:MaxNewSize=128m, -XX:SurvivorRatio=1024, -XX:MaxGCPauseMillis=1000, -XX:+UseCompressedOops, -XX:ParallelGCThreads=8, -XX:ConcGCThreads=8, -XX:+DisableExplicitGC, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=50, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=/home/somasekhar.k/softwares/ES6/logs/ESHeapDump.hprof, -XX:+PrintGCDetails, -XX:+PrintGCTimeStamps, -Xloggc:/home/somasekhar.k/softwares/ES6/logs/gc.log, -XX:+UseGCLogFileRotation, -XX:NumberOfGCLogFiles=32, -XX:GCLogFileSize=128M, -Des.path.home=/home/somasekhar.k/softwares/ES6, -Des.path.conf=/home/somasekhar.k/softwares/ES6/config

Index Mapping Details :
{"2_1024": {"aliases": {},"mappings": {"2_1024": {"properties": {1: {"type": "text"},
2: {"type": "text","fields": {"exactMatch": {"type": "keyword"},"phoneticMatch": {"type": "text","analyzer": "phonetic_analyzer"},"subStringMatch": {"type": "text","analyzer": "substring_analyzer"}}},
4: {"type": "text","fields": {"exactMatch": {"type": "keyword"},"phoneticMatch": {"type": "text","analyzer": "phonetic_analyzer"},"subStringMatch": {"type": "text","analyzer": "substring_analyzer"}}},
5: {"type": "text","fields": {"exactMatch": {"type": "keyword"},"phoneticMatch": {"type": "text","analyzer": "phonetic_analyzer"},"subStringMatch": {"type": "text","analyzer": "substring_analyzer"}}},
6: {"type": "text","fields": {"exactMatch": {"type": "keyword"},"phoneticMatch": {"type": "text","analyzer": "phonetic_analyzer"},"subStringMatch": {"type": "text","analyzer": "substring_analyzer"}}},
7: {"type": "text","fields": {"exactMatch": {"type": "keyword"},"phoneticMatch": {"type": "text","analyzer": "phonetic_analyzer"},"subStringMatch": {"type": "text","analyzer": "substring_analyzer"}}},
8: {"type": "text","fields": {"exactMatch": {"type": "keyword"},"phoneticMatch": {"type": "text","analyzer": "phonetic_analyzer"},"subStringMatch": {"type": "text","analyzer": "substring_analyzer"}}},
9: {"type": "text","fields": {"exactMatch": {"type": "keyword"},"phoneticMatch": {"type": "text","analyzer": "phonetic_analyzer"},"subStringMatch": {"type": "text","analyzer": "substring_analyzer"}}},
10: {"type": "text","fields": {"exactMatch": {"type": "keyword"},"phoneticMatch": {"type": "text","analyzer": "phonetic_analyzer"},"subStringMatch": {"type": "text","analyzer": "substring_analyzer"}}},
11: {"type": "text","fields": {"exactMatch": {"type": "keyword"},"phoneticMatch": {"type": "text","analyzer": "phonetic_analyzer"},"subStringMatch": {"type": "text","analyzer": "substring_analyzer"}}},
12: {"type": "text","fields": {"exactMatch": {"type": "keyword"},"phoneticMatch": {"type": "text","analyzer": "phonetic_analyzer"},"subStringMatch": {"type": "text","analyzer": "substring_analyzer"}}},
13: {"type": "text","fields": {"exactMatch": {"type": "keyword"},"phoneticMatch": {"type": "text","analyzer": "phonetic_analyzer"},"subStringMatch": {"type": "text","analyzer": "substring_analyzer"}}},
14: {"type": "text","fields": {"exactMatch": {"type": "keyword"},"phoneticMatch": {"type": "text","analyzer": "phonetic_analyzer"},"subStringMatch": {"type": "text","analyzer": "substring_analyzer"}}},
15: {"type": "text","fields": {"exactMatch": {"type": "keyword"},"phoneticMatch": {"type": "text","analyzer": "phonetic_analyzer"},"subStringMatch": {"type": "text","analyzer": "substring_analyzer"}}},
16: {"type": "text","fields": {"exactMatch": {"type": "keyword"},"phoneticMatch": {"type": "text","analyzer": "phonetic_analyzer"},"subStringMatch": {"type": "text","analyzer": "substring_analyzer"}}},
"@timestamp": {"type": "date"},"@version": {"type": "text","fields": {"keyword": {"type": "keyword","ignore_above": 256}}},
"query": {"properties": {"term": {"properties": {7: {"properties": {"exactMatch": {"type": "text","fields": {"keyword": {"type": "keyword","ignore_above": 256}}}}}}}}},"size": {"type": "long"},"type": {"type": "text","fields": {"keyword": {"type": "keyword","ignore_above": 256}}}}}},"settings": {"index": {"number_of_shards": "1","provided_name": "2_1024","creation_date": "1541681596965","analysis": {"filter": {"phonetic_filter": {"type": "phonetic","encoder": "double_metaphone"}},"analyzer": {"phonetic_analyzer": {"filter": ["phonetic_filter","lowercase"],"tokenizer": "standard"},"substring_analyzer": {"filter": "lowercase","tokenizer": "substring_tokenizer"}},"tokenizer": {"substring_tokenizer": {"token_chars": ["letter","digit"],"min_gram": "3","type": "ngram","max_gram": "3"}}},"number_of_replicas": "0","uuid": "aFpNgjSXQA2eaepVYbjylg","version": {"created": "6000099"}}}}}

All 13 search threads are waiting in the same place. I took jstack dumps multiple times, and every time they show the same place. Please refer to the jstack snapshot below.

Number of nodes: 1
Number of indices in ES: 2
Each index size: 1 GB
Number of shards per index: 1
Number of documents in each index: 1 million

Number of search requests: 1 million, sent from the low-level REST client, doing exact matches.
Query: {"size":0,"_source":false,"query":{"term":{"7.exactMatch":"Subex__www200"}}}
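For illustration, the request body above can be assembled like this (termQuery is a hypothetical helper; the real request is sent with the low-level REST client's performRequest against the index's _search endpoint):

```java
public class TermQueryBuilderSketch {
    // Builds the exact-match term-query body shown above for a given
    // field and value. Illustrative only: no JSON escaping is done here,
    // so it is safe only for plain alphanumeric values.
    public static String termQuery(String field, String value) {
        return "{\"size\":0,\"_source\":false,\"query\":{\"term\":{\""
                + field + "\":\"" + value + "\"}}}";
    }

    public static void main(String[] args) {
        System.out.println(termQuery("7.exactMatch", "Subex__www200"));
    }
}
```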

I can't add more nodes to the cluster. I increased the number of shards to 2, 3, and 5, but there was no improvement in speed.

I also checked the disk I/O statistics, but I/O wait is only 0.13%.
Network speed: 100M/sec
I tried the shard request cache, the indices query cache, decreasing the number of segments from 15 to 1, and pre-loading data into the file system cache, but none of it improved search speed.

I need to know why the search threads are waiting in the same place and how to overcome this problem.
I am trying to achieve good latency and throughput using a single client and a single node.


(David Turner) #2

Please don't post images of text, it makes them hard to read, impossible to search, and makes it hard for those of us using screenreaders.

A thread that's blocked on sun.misc.Unsafe.park like this is idle, waiting for some work to do.
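For illustration, here is a minimal standalone reproduction of the same situation (the names are invented, not Elasticsearch code): a thread-pool worker with no queued work parks inside its queue's take(), and a thread dump reports it as WAITING.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class IdleWorkerDemo {
    // Starts one pool worker with nothing to do and reports its state.
    // The idle worker blocks in the queue's take(), the same shape as the
    // jstack snapshot of the Elasticsearch search threads.
    public static Thread.State idleWorkerState() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>(),
                r -> new Thread(r, "demo-idle-worker"));
        pool.prestartAllCoreThreads();           // start the worker; queue is empty
        try {
            Thread.sleep(200);                   // let it reach take() and park
        } catch (InterruptedException ignored) { }
        Thread.State state = Thread.State.TERMINATED;
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if ("demo-idle-worker".equals(t.getName())) {
                state = t.getState();            // parked while waiting for work
            }
        }
        pool.shutdownNow();
        return state;
    }

    public static void main(String[] args) {
        System.out.println("idle worker state: " + idleWorkerState());
    }
}
```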


(Sekhar) #3

Thanks for the reply.

I tried to upload the jstack log file, but the site does not accept the .log format.

Currently I am processing 5,555 requests per second for the basic query, but my expectation is 10,000 requests per second.
Can you please explain what work the threads are waiting for? Can you please tell me how to overcome this problem? If you need any logs, I will share them.
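For context, here is the arithmetic for a single synchronous client (a rough sketch, not Elasticsearch code): at 5,555 requests/sec each round trip takes about 180 µs, so reaching 10,000 requests/sec serially would require every round trip to finish in 100 µs.

```java
public class ThroughputMath {
    // Microseconds available per request for ONE synchronous client
    // (one request in flight at a time) to reach the target rate.
    public static double microsPerRequest(double targetRps) {
        return 1_000_000.0 / targetRps;
    }

    public static void main(String[] args) {
        System.out.println(microsPerRequest(5555));  // ~180 µs per request today
        System.out.println(microsPerRequest(10000)); // 100 µs needed serially
    }
}
```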


(Sekhar) #4

One more thing:

The index.search.slowlog file shows the search query time in microseconds but the search fetch time in milliseconds.

Is there any way to reduce the search fetch time?


(David Turner) #5

Given that Elasticsearch itself seems to be mostly idle, my first guess would be that the client is not sending requests fast enough. I'd recommend using Rally to benchmark your cluster to rule out any client-side problems first.


(Sekhar) #6

Thanks for the reply.

I am using the low-level REST client to send requests in a loop with performRequest(), which is a synchronous method, so the client waits for each response before sending the next request. The client is sending requests, but ES is taking time to process them; if ES responded faster, the client could also send faster.

Note: I need to reduce the search processing time in ES. If I can do that, my client will be fast as well.

I tried to use Rally for benchmarking, but I get the error below at startup. My system already has Java 8 available, but Rally is asking for Java 11. I am unable to understand the root cause of the problem.

Logs :
2018-11-09 05:27:28,98 ActorAddr-(T|:20540)/PID:25348 esrally.actor INFO Received a benchmark failure from [ActorAddr-(T|:28682)] and will forward it now.
2018-11-09 05:27:28,94 ActorAddr-(T|:9331)/PID:25620 esrally.actor ERROR Cannot process message [<esrally.mechanic.mechanic.StartNodes object at 0x7efe74c26400>]
Traceback (most recent call last):

** File "/home/somasekhar.k/softwares/rally/esrally/mechanic/mechanic.py", line 545, in receiveMsg_StartNodes**
** msg.distribution, msg.external, msg.docker)**

** File "/home/somasekhar.k/softwares/rally/esrally/mechanic/mechanic.py", line 636, in create**
** s = supplier.create(cfg, sources, distribution, build, challenge_root_path, car, plugins)**

** File "/home/somasekhar.k/softwares/rally/esrally/mechanic/supplier.py", line 27, in create**
** java_home = _java_home(car)**

** File "/home/somasekhar.k/softwares/rally/esrally/mechanic/supplier.py", line 81, in _java_home**
** _, path = jvm.resolve_path(int(build_jdk))**

** File "/home/somasekhar.k/softwares/rally/esrally/utils/jvm.py", line 94, in resolve_path**
** return majors, _resolve_single_path(majors, sysprop_reader=sysprop_reader)**

** File "/home/somasekhar.k/softwares/rally/esrally/utils/jvm.py", line 139, in _resolve_single_path**
** format(specific_env_var, generic_env_var, major))**

esrally.exceptions.SystemSetupError: ('Neither JAVA11_HOME nor JAVA_HOME point to a JDK 11 installation.', None)

2018-11-09 05:27:28,100 -not-actor-/PID:25177 esrally.racecontrol ERROR A benchmark failure has occurred
2018-11-09 05:27:28,100 -not-actor-/PID:25177 esrally.racecontrol INFO Telling benchmark actor to exit.
2018-11-09 05:27:28,101 -not-actor-/PID:25177 esrally.rally INFO Attempting to shutdown internal actor system.
2018-11-09 05:27:28,101 ActorAddr-(T|:20540)/PID:25348 esrally.actor INFO BenchmarkActor received unknown message [ActorExitRequest] (ignoring).
2018-11-09 05:27:28,102 ActorAddr-(T|:28682)/PID:25485 esrally.actor INFO MechanicActor#receiveMessage unrecognized(msg = [<class 'thespian.actors.ActorExitRequest'>] sender = [ActorAddr-(T|:20540)])
2018-11-09 05:27:28,102 ActorAddr-(T|:20540)/PID:25348 esrally.actor INFO BenchmarkActor received unknown message [ActorExitRequest] (ignoring).
2018-11-09 05:27:28,103 ActorAddr-(T|:14120)/PID:25619 esrally.actor INFO mechanic.Dispatcher#receiveMessage unrecognized(msg = [<class 'thespian.actors.ActorExitRequest'>] sender = [ActorAddr-(T|:28682)])
2018-11-09 05:27:28,105 ActorAddr-(T|:14120)/PID:25619 esrally.actor INFO mechanic.Dispatcher#receiveMessage unrecognized(msg = [<class 'thespian.actors.ChildActorExited'>] sender = [ActorAddr-(T|:9331)])
2018-11-09 05:27:28,106 ActorAddr-(T|:28682)/PID:25485 esrally.actor ERROR Child actor exited with [ChildActorExited:ActorAddr-(T|:14120)] while in status [starting].
2018-11-09 05:27:28,109 ActorAddr-(T|:20540)/PID:25348 esrally.actor INFO BenchmarkActor received unknown message [ChildActorExited:ActorAddr-(T|:28682)] (ignoring).
2018-11-09 05:27:28,112 -not-actor-/PID:25347 root INFO ActorSystem Logging Shutdown
2018-11-09 05:27:28,115 -not-actor-/PID:25346 root INFO ---- Actor System shutdown
2018-11-09 05:27:28,116 -not-actor-/PID:25177 esrally.rally INFO Actor system is still running. Waiting...
2018-11-09 05:27:29,117 -not-actor-/PID:25177 esrally.rally INFO Shutdown completed.
2018-11-09 05:27:29,117 -not-actor-/PID:25177 esrally.rally ERROR Cannot run subcommand [race].
Traceback (most recent call last):
File "/home/somasekhar.k/softwares/rally/esrally/rally.py", line 454, in dispatch_sub_command
race(cfg)
File "/home/somasekhar.k/softwares/rally/esrally/rally.py", line 383, in race
with_actor_system(lambda c: racecontrol.run(c), cfg)
File "/home/somasekhar.k/softwares/rally/esrally/rally.py", line 404, in with_actor_system
runnable(cfg)
File "/home/somasekhar.k/softwares/rally/esrally/rally.py", line 383, in
with_actor_system(lambda c: racecontrol.run(c), cfg)
File "/home/somasekhar.k/softwares/rally/esrally/racecontrol.py", line 390, in run
raise e
File "/home/somasekhar.k/softwares/rally/esrally/racecontrol.py", line 387, in run
pipeline(cfg)
File "/home/somasekhar.k/softwares/rally/esrally/racecontrol.py", line 61, in call
self.target(cfg)
File "/home/somasekhar.k/softwares/rally/esrally/racecontrol.py", line 314, in from_sources_complete
return race(cfg, sources=True, build=True)
File "/home/somasekhar.k/softwares/rally/esrally/racecontrol.py", line 286, in race
raise exceptions.RallyError(result.message, result.cause)
esrally.exceptions.RallyError: (Neither JAVA11_HOME nor JAVA_HOME point to a JDK 11 installation., 'Traceback (most recent call last):\n File "/home/somasekhar.k/softwares/rally/esrally/mechanic/mechanic.py", line 545, in receiveMsg_StartNodes\n msg.distribution, msg.external, msg.docker)\n File "/home/somasekhar.k/softwares/rally/esrally/mechanic/mechanic.py", line 636, in create\n s = supplier.create(cfg, sources, distribution, build, challenge_root_path, car, plugins)\n File "/home/somasekhar.k/softwares/rally/esrally/mechanic/supplier.py", line 27, in create\n java_home = _java_home(car)\n File "/home/somasekhar.k/softwares/rally/esrally/mechanic/supplier.py", line 81, in _java_home\n _, path = jvm.resolve_path(int(build_jdk))\n File "/home/somasekhar.k/softwares/rally/esrally/utils/jvm.py", line 94, in resolve_path\n return majors, _resolve_single_path(majors, sysprop_reader=sysprop_reader)\n File "/home/somasekhar.k/softwares/rally/esrally/utils/jvm.py", line 139, in _resolve_single_path\n format(specific_env_var, generic_env_var, major))\nesrally.exceptions.SystemSetupError: ('Neither JAVA11_HOME nor JAVA_HOME point to a JDK 11 installation.', None)\n')

Note: I don't think the low-level REST client is the problem.

Please tell me how to upload log files to the discuss.elastic.co website.


(David Turner) #7

This means you're only sending one request at once, which seems unrealistic, and explains why Elasticsearch is hardly doing any work. I think you will get better performance sending requests in parallel.
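As a rough sketch of the idea (the search() stand-in below is hypothetical; in your code it would be a restClient.performRequest(...) call), a pool of client-side workers keeps many requests in flight at once instead of one blocking loop:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ParallelSearchSketch {
    // Stand-in for one synchronous search round trip.
    static void search() throws InterruptedException {
        Thread.sleep(1); // pretend ~1 ms of latency per request
    }

    // Issues `total` searches from `clients` concurrent workers and
    // returns how many completed. With several workers the server always
    // has requests in flight, instead of idling between serial calls.
    public static int runParallel(int clients, int total) {
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < total; i++) {
            pool.submit(() -> {
                try {
                    search();
                } catch (InterruptedException e) {
                    return;
                }
                done.incrementAndGet();
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException ignored) { }
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println(runParallel(8, 200)); // 200 searches, 8 at a time
    }
}
```

The same effect can be had with a single client instance by using its asynchronous request method and a bounded number of in-flight requests.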

I would ask this question in a separate thread with the Rally tag, because I am not a Rally expert. Java 11 is required for Elasticsearch development and perhaps this includes running Rally. The simplest thing to do would be to install Java 11.

Please could you format logs using the </> button and check they look ok with the preview before sending.


(Sekhar) #8

Thanks for your reply.

I sent 3 parallel requests from 3 low-level REST clients, but I did not get better performance. All my threads are still waiting in the same place. Please see the log:

```
12.0% (119.6ms out of 1s) cpu usage by thread 'elasticsearch[ESNode][search][T#2]'
 10/10 snapshots sharing following 10 elements
   sun.misc.Unsafe.park(Native Method)
   java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
   java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:737)
   java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:647)
   java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1269)
   org.elasticsearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:161)
   java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
   java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
   java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   java.lang.Thread.run(Thread.java:748)
```

(David Turner) #9

I do not understand the question. These threads are idle.


(system) #10

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.