Hello,
We are currently in the process of moving from an ES 0.9 cluster to an ES
1.4 cluster. Both clusters run on Amazon EC2.
Before doing so, we first need to index a large number of indices into the
ES 0.9 cluster. The nodes in this cluster are all m3.2xlarge machines (8
cores, 30G of memory). In general the nodes in this cluster have an average
processor load of about 3% (so no problems at all there). The nodes are
newly created from the image, so we can assume that they are clean.
The problem arises when we send bulk requests. Whenever the combined CPU
usage of the threads on one node reaches around 1/8 of the node's total
processor capacity, average latency on the cluster goes up from 3.5 ms to
hundreds of ms.
When I run top, the threads are spread over all the processors. Added up,
the 8 cores can carry 800% of load, but as soon as the sum of the
percentages across all cores reaches 100%, the node immediately starts
throttling (making other requests very slow).
Question
Does anybody have experience with this situation and, if so, is there an
easy way to fix it?
Example of what I see in top:
Cpu0 : 3.7%us, 0.3%sy, 0.0%ni, 96.0%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu1 : 1.0%us, 0.0%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu2 : 1.0%us, 0.3%sy, 0.0%ni, 98.7%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu3 : 0.7%us, 0.0%sy, 0.0%ni, 99.0%id, 0.3%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu4 : 1.7%us, 0.0%sy, 0.0%ni, 98.3%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu5 : 1.0%us, 0.0%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu6 : 3.0%us, 0.3%sy, 0.0%ni, 96.7%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu7 : 1.0%us, 0.0%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Mem: 30764132k total, 30613104k used, 151028k free, 129224k buffers
Swap: 0k total, 0k used, 0k free, 12410696k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
24584 elastics 20 0 19.5g 15g 119m S 14.6 53.2 4351:41 java
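For reference, this is roughly how I add up the per-core numbers. It is only a small Python sketch for illustration: the function name busy_per_core is made up, "busy" is taken to mean 100 minus the idle (%id) value, and the sample text is the per-core part of the top output from this post.

```python
import re

# Per-core lines copied from the top output in this post.
SAMPLE = """\
Cpu0 : 3.7%us, 0.3%sy, 0.0%ni, 96.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 1.0%us, 0.0%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 1.0%us, 0.3%sy, 0.0%ni, 98.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 0.7%us, 0.0%sy, 0.0%ni, 99.0%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4 : 1.7%us, 0.0%sy, 0.0%ni, 98.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 : 1.0%us, 0.0%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 3.0%us, 0.3%sy, 0.0%ni, 96.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 1.0%us, 0.0%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
"""

def busy_per_core(top_text):
    """Return {core_index: busy_percent}, with busy defined as 100 - %id."""
    pattern = re.compile(r"^Cpu(\d+)\s*:.*?([\d.]+)%id", re.MULTILINE)
    return {int(core): 100.0 - float(idle)
            for core, idle in pattern.findall(top_text)}

busy = busy_per_core(SAMPLE)
total_busy = sum(busy.values())   # summed over all cores
capacity = 100.0 * len(busy)      # 100% per core

print(f"{total_busy:.1f}% busy out of {capacity:.0f}%")
```

The sample above was taken while the node was quiet, so the sum is far below the 100% mark at which the slowdown starts.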
Other cases
This problem appears in exactly the same way on 4-core and 2-core
instances: latency becomes very high once the total processor load reaches
1/4 and 1/2 of capacity, respectively.
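Putting the three instance sizes side by side, the threshold works out to the same summed load every time. The snippet below is just that arithmetic in Python, using only the fractions mentioned in this post; it does not say anything about the cause.

```python
from fractions import Fraction

# Fraction of total CPU capacity at which latency goes up, per core count,
# as observed on the 8-, 4- and 2-core instances.
observed = {8: Fraction(1, 8), 4: Fraction(1, 4), 2: Fraction(1, 2)}

# Convert each fraction into a summed percentage (each core counts as 100%).
summed_percent = {cores: fraction * cores * 100
                  for cores, fraction in observed.items()}

for cores, percent in summed_percent.items():
    print(f"{cores} cores: slowdown at {observed[cores]} of capacity "
          f"= {percent}% summed over all cores")
```

In every case the slowdown starts at about 100% summed load, i.e. the equivalent of one fully busy core, regardless of how many cores the instance has.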