We have an application that uses the 5.1.2 Java client over the transport protocol against a 5.1.2 cluster. The app measures Elasticsearch response times, and some get operations take on the order of tens of seconds to return.
One example shows a response time of 38036.18004 ms for a simple document get on this index:
curl -s 'http://localhost:9200/_cat/shards?v' | grep 'accounts_2 '
accounts_2 0 r STARTED 2366203 745.4mb 10.6.145.198 fe05
accounts_2 0 p STARTED 2366203 738.2mb 10.6.145.195 fe02
accounts_2 0 r STARTED 2366203 740.4mb 10.6.145.197 fe04
The segments look like this (the index was created in 2.4 and later upgraded to 5.1):
curl -s 'http://localhost:9200/_cat/segments?v' | egrep '^index|accounts_2 '
I wonder: how can a document get take 38 seconds? The cluster was green at that time.
There are other queries running while this happens, and the overall statistical breakdown of response times (in ms) looks OK:
    N         Min        Max        Median     Avg         Stddev
x   2574735   0.328462   38036.18   1.101684   8.7738861   126.71908
That is, out of nearly 2.6M document gets the slowest took 38 seconds, while the median is a healthy 1.1 ms. So there is only a small, but steady, number of slow queries.
While running on ES 2.4, there weren't any such slow queries; the average was nearly the same as the median.
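To illustrate why the median stays low while a few outliers pull up the average, here is a minimal Python sketch of how such a breakdown could be computed from raw timings. It is not the tool that produced the numbers above; the function name summarize, the nearest-rank percentile and the 1000 ms "slow" cutoff are all assumptions chosen for illustration.

```python
import math
import statistics


def summarize(latencies_ms, slow_threshold_ms=1000.0):
    """Summarize response times given in milliseconds.

    slow_threshold_ms is an arbitrary, illustrative cutoff for "slow".
    """
    ordered = sorted(latencies_ms)
    n = len(ordered)

    def percentile(p):
        # Nearest-rank percentile: the value at rank ceil(p% of n), 1-based.
        rank = max(1, math.ceil(p / 100 * n))
        return ordered[rank - 1]

    return {
        "n": n,
        "min": ordered[0],
        "max": ordered[-1],
        "median": statistics.median(ordered),
        "avg": statistics.fmean(ordered),
        "stddev": statistics.stdev(ordered) if n > 1 else 0.0,
        "p99": percentile(99),
        # Number of requests at or above the cutoff.
        "slow_count": sum(1 for v in ordered if v >= slow_threshold_ms),
    }


if __name__ == "__main__":
    # Synthetic data: 999 fast gets and one very slow one.
    sample = [1.0] * 999 + [38000.0]
    for key, value in summarize(sample).items():
        print(key, value)
```

Counting the requests above a cutoff, or looking at a high percentile, shows the size of the slow tail more directly than min, max and average do.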