Unstable response time on an Elasticsearch cluster

I have a 3-node Elasticsearch cluster with 60 shards. Response time is quite volatile, ranging from 20 ms to 300 ms. How can I make it shorter and more stable?

What are you doing? What is your hardware like?
What is running on the machine?

Normal queries. The hardware is 3 nodes, each with 8 cores and 16 GB of RAM. The machines run only Elasticsearch, used as a search engine.

Elasticsearch stores many questions and answers. Given an input question, we search for similar questions.

Like the following?

GET /index/_search

Or a bit more complex than a match_all?

SSD drives? Or spinning disks?

What is the output of:

GET /
GET /_cat/nodes?v
GET /_cat/health?v
GET /_cat/indices?v

What is the typical response (the full one please) of a "slow query"? And what was the request?

Also have a look at this great blog post about Slow Queries. That might help.
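To make "slow" concrete before digging further, one option is to collect the `took` field from a batch of search responses and look at the spread. `took` is the server-side time in milliseconds that Elasticsearch reports in every search response, so it excludes network latency. The following is only a minimal sketch, assuming the responses have already been parsed into Python dicts; the function name `summarize_took` and the sample values are hypothetical, not taken from this cluster.

```python
import statistics

def summarize_took(responses):
    """Summarize the `took` field (ms) of parsed _search responses."""
    took = sorted(r["took"] for r in responses)
    # Nearest-rank 95th percentile: ceil(0.95 * n) - 1 as a zero-based index.
    idx = max(0, -(-95 * len(took) // 100) - 1)
    return {
        "min": took[0],
        "median": statistics.median(took),
        "p95": took[idx],
        "max": took[-1],
    }

# Hypothetical sample: mostly fast responses with a few slow outliers.
sample = [{"took": t} for t in (20, 25, 22, 300, 40, 21, 180, 23, 24, 26)]
print(summarize_took(sample))
```

If the median stays low while the p95 and max are high, the problem is a small number of outliers rather than uniformly slow queries, and those are the requests worth posting in full.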

curl -X GET "http://localhost:8080/_cat/nodes?v"

ip           heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.30.252.56           28          40   8    1.99    3.28     3.72 mdi       -      node-56
10.30.252.55           51          55  17    4.91    6.51     7.49 mdi       *      node-55
10.30.252.54           51          54  16    4.08    6.71     7.40 mdi       -      node-54

curl -X GET "http://localhost:8080/_cat/health?v"

epoch      timestamp cluster      status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1563276185 19:23:05  es yellow          3         3    750 250    0    0       10             0                  -                 98.7%

curl -X GET "http://localhost:8080/_cat/indices?v"

health status index                 uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   all_with_routing_36_0 -eZAYbBcRqKa6WnjTItJLg  60   2     210232            0    561.2mb          187mb
green  open   all_36_1              tsVykN1GQYe7tScV93ZA0A  60   2        473            0      4.1mb          1.3mb
green  open   all_36_0              F_1uZJ02SzKScZw9wQQMIw  60   2     210232            0    734.3mb        244.7mb
yellow open   tritalkdoctest0627    cDH2mEicQNGuVkJ-prgxjg   5   3        119            0      1.1mb          390kb
yellow open   tritalkdoctest0715    QgUYx6BKTvCdIJc1TiOQqg   5   3        119            0      1.1mb        393.6kb
green  open   all_126_1             tHkz0rr-SO2EMKPp8ffL8g  60   2        677            0      4.9mb          1.6mb

curl -X GET "http://localhost:8080/"

{
  "name" : "node-54",
  "cluster_name" : "es",
  "cluster_uuid" : "5pxKPv4dTsajsfYabzx_2Q",
  "version" : {
    "number" : "6.3.0",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "424e937",
    "build_date" : "2018-06-11T23:38:03.357887Z",
    "build_snapshot" : false,
    "lucene_version" : "7.3.1",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

Spinning disks. The query looks like GET /index/_search.

60 shards is too many IMO.
One shard is enough here.
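A rough way to see why: divide each index's primary store size by its primary count, using the numbers from the _cat/indices output above. The sketch below does that arithmetic and then builds request bodies for creating a one-shard index and copying data into it with the reindex API. The destination index name `all_36_0_v2` and the helper names are hypothetical, and the mappings would still have to be copied to the new index separately. The primary shard count is fixed when an index is created, which is why a new index is needed.

```python
import json

# (primaries, replicas, primary store size in MB), copied from _cat/indices.
indices = {
    "all_with_routing_36_0": (60, 2, 187.0),
    "all_36_1": (60, 2, 1.3),
    "all_36_0": (60, 2, 244.7),
    "all_126_1": (60, 2, 1.6),
}

def per_shard_mb(primaries, pri_store_mb):
    """Average size of one primary shard in MB."""
    return pri_store_mb / primaries

for name, (pri, rep, size) in indices.items():
    copies = pri * (1 + rep)  # primaries plus all replica copies
    print(f"{name}: {copies} shard copies, "
          f"~{per_shard_mb(pri, size):.2f} MB per primary")

def one_shard_requests(source, dest, replicas=2):
    """Bodies for PUT /<dest> and POST /_reindex (dest name is an assumption)."""
    create = {"settings": {"index": {"number_of_shards": 1,
                                     "number_of_replicas": replicas}}}
    reindex = {"source": {"index": source}, "dest": {"index": dest}}
    return create, reindex

create_body, reindex_body = one_shard_requests("all_36_0", "all_36_0_v2")
print("PUT /all_36_0_v2", json.dumps(create_body))
print("POST /_reindex", json.dumps(reindex_body))
```

Even the largest index averages only a few MB per primary shard, so every search fans out to 60 tiny shards and pays the coordination cost without any benefit from parallelism.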

spinning disk

Switch to SSD drives.

Also pasting again one of my questions:

What is the typical response (the full one please) of a "slow query"? And what was the request?

Please read this about how to format.

Also note that "6.3.0" was built more than a year ago.
A lot of improvements have been made in the meantime. Could you upgrade to 7.2.0?
