Elastic/Kibana performance issues

Hi!

I have set up my Elastic Stack (Logstash, Elasticsearch, Kibana) on a single node hosted on a "monster": a Red Hat 7.3 server with a 48-core CPU, 64 GB of RAM and 6 TB of disk.
You'll find the output of the cluster stats API below.

For the moment, I create two indices on a daily basis:

  • one of around 500 MB and 1 million documents
  • the other of around 1.5 GB and 5 million documents

When sending requests through the Kibana interface, I frequently run into performance issues:

  1. For example, when using the Kibana "Discover" page on one month of data (roughly 150 million documents), I may hit a time-out and receive the message "Discover: Request Timeout after 30000ms" (especially when the node is busy indexing).

This error appears even though I tried to customize some Kibana parameters (see below).
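For reference, the "30000ms" in the error message seems to match the default elasticsearch.requestTimeout in kibana.yml, a setting I have not changed:

```
# kibana.yml (default value, shown for reference)
# Time in milliseconds Kibana waits for responses from Elasticsearch:
elasticsearch.requestTimeout: 30000
```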

  2. Another kind of error shows up after loading a dashboard with multiple visualization elements (around 15: tables, Timelions, bar histograms ...): "Courier Fetch: 5 of 995 shards failed"

I didn't change the Kibana configuration file, nor the Elasticsearch one (apart from some values that have nothing to do with performance).

My questions are the following :

A) Is there any configuration parameter I could change to improve performance?
(I have a rather big server: while indexing and loading a dashboard, the 48 cores rarely exceed 900% in total, and the disk I/O does not seem to be the bottleneck.)

B) Do I need to set up multiple nodes on the same server to see a real improvement?
And if so, should I set up separate virtual machines to do this?

Thank you for your help!

Samia.

############## Modified Kibana parameters

  • visualization:loadingDelay : 2 s -> 5 s
  • notifications:lifetime:banner : 3 000 000 ms -> 3 000 000 000 ms
  • notifications:lifetime:error : 300 000 ms -> 300 000 000 ms
  • notifications:lifetime:warning : 10 000 ms -> 10 000 000 ms
  • notifications:lifetime:info : 5 000 ms -> 5 000 000 ms
  • timelion:max_buckets : 2 000 -> 86 400
  • discover:sampleSize : 500 -> 25

############## Cluster stats:

curl -XGET "http://localhost:9200/_cluster/stats?human&pretty"
{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "afnic-bigdata",
  "timestamp" : 1521559823361,
  "status" : "yellow",
  "indices" : {
    "count" : 334,
    "shards" : {
      "total" : 1662,
      "primaries" : 1662,
      "replication" : 0.0,
      "index" : {
        "shards" : {
          "min" : 1,
          "max" : 5,
          "avg" : 4.976047904191617
        },
        "primaries" : {
          "min" : 1,
          "max" : 5,
          "avg" : 4.976047904191617
        },
        "replication" : {
          "min" : 0.0,
          "max" : 0.0,
          "avg" : 0.0
        }
      }
    },
    "docs" : {
      "count" : 685606639,
      "deleted" : 4
    },
    "store" : {
      "size" : "248.7gb",
      "size_in_bytes" : 267056651055
    },
    "fielddata" : {
      "memory_size" : "74.6mb",
      "memory_size_in_bytes" : 78273712,
      "evictions" : 0
    },
    "query_cache" : {
      "memory_size" : "58.9mb",
      "memory_size_in_bytes" : 61849365,
      "total_count" : 267318,
      "hit_count" : 89993,
      "miss_count" : 177325,
      "cache_size" : 9518,
      "cache_count" : 9518,
      "evictions" : 0
    },
    "completion" : {
      "size" : "0b",
      "size_in_bytes" : 0
    },
    "segments" : {
      "count" : 19403,
      "memory" : "1gb",
      "memory_in_bytes" : 1139377503,
      "terms_memory" : "925.4mb",
      "terms_memory_in_bytes" : 970388727,
      "stored_fields_memory" : "100.5mb",
      "stored_fields_memory_in_bytes" : 105433776,
      "term_vectors_memory" : "0b",
      "term_vectors_memory_in_bytes" : 0,
      "norms_memory" : "7.7mb",
      "norms_memory_in_bytes" : 8077568,
      "points_memory" : "30.8mb",
      "points_memory_in_bytes" : 32325748,
      "doc_values_memory" : "22mb",
      "doc_values_memory_in_bytes" : 23151684,
      "index_writer_memory" : "0b",
      "index_writer_memory_in_bytes" : 0,
      "version_map_memory" : "0b",
      "version_map_memory_in_bytes" : 0,
      "fixed_bit_set" : "0b",
      "fixed_bit_set_memory_in_bytes" : 0,
      "max_unsafe_auto_id_timestamp" : 1521427810278,
      "file_sizes" : { }
    }
  },
  "nodes" : {
    "count" : {
      "total" : 1,
      "data" : 1,
      "coordinating_only" : 0,
      "master" : 1,
      "ingest" : 1
    },
    "versions" : [
      "6.1.1"
    ],
    "os" : {
      "available_processors" : 48,
      "allocated_processors" : 48,
      "names" : [
        {
          "name" : "Linux",
          "count" : 1
        }
      ],
      "mem" : {
        "total" : "62.6gb",
        "total_in_bytes" : 67266506752,
        "free" : "2.2gb",
        "free_in_bytes" : 2367606784,
        "used" : "60.4gb",
        "used_in_bytes" : 64898899968,
        "free_percent" : 4,
        "used_percent" : 96
      }
    },
    "process" : {
      "cpu" : {
        "percent" : 5
      },
      "open_file_descriptors" : {
        "min" : 4297,
        "max" : 4297,
        "avg" : 4297
      }
    },
    "jvm" : {
      "max_uptime" : "10.5h",
      "max_uptime_in_millis" : 37819978,
      "versions" : [
        {
          "version" : "1.8.0_131",
          "vm_name" : "OpenJDK 64-Bit Server VM",
          "vm_version" : "25.131-b12",
          "vm_vendor" : "Oracle Corporation",
          "count" : 1
        }
      ],
      "mem" : {
        "heap_used" : "3.4gb",
        "heap_used_in_bytes" : 3710324344,
        "heap_max" : "3.8gb",
        "heap_max_in_bytes" : 4151836672
      },
      "threads" : 458
    },
    "fs" : {
      "total" : "3.5tb",
      "total_in_bytes" : 3935713746944,
      "free" : "3.3tb",
      "free_in_bytes" : 3667577057280,
      "available" : "3.1tb",
      "available_in_bytes" : 3467629658112
    },
    "plugins" : [ ],
    "network_types" : {
      "transport_types" : {
        "netty4" : 1
      },
      "http_types" : {
        "netty4" : 1
      }
    }
  }
}

You probably have too many shards, which will be putting memory pressure on the node.
Look at using _shrink on them, then change your indexing strategy to use fewer shards, or move to weekly/monthly indices.
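As a sketch of those steps (the index name logstash-2018.03.01 and the template name are only examples; adapt them to your own daily indices):

```shell
# 1) Make the source index read-only; a shrink source must not receive writes.
#    (On a single node, all its shards are already on one node, so no
#    shard reallocation step is needed.)
curl -XPUT "http://localhost:9200/logstash-2018.03.01/_settings" \
  -H 'Content-Type: application/json' \
  -d '{ "index.blocks.write": true }'

# 2) Shrink the 5-shard index into a new single-shard index:
curl -XPOST "http://localhost:9200/logstash-2018.03.01/_shrink/logstash-2018.03.01-shrunk" \
  -H 'Content-Type: application/json' \
  -d '{ "settings": { "index.number_of_shards": 1, "index.number_of_replicas": 0 } }'

# 3) For future indices, an index template can create each new daily
#    (or weekly/monthly) index with a single primary shard from the start:
curl -XPUT "http://localhost:9200/_template/logstash_one_shard" \
  -H 'Content-Type: application/json' \
  -d '{ "index_patterns": ["logstash-*"], "settings": { "number_of_shards": 1, "number_of_replicas": 0 } }'
```

These requests assume the cluster is reachable at localhost:9200, as in your stats output.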

Hello,

Thank you for your reply.

I shrank one month of indices (from 5 shards to 1 each), but when viewing the results through Kibana's Discover page, I didn't see much change. I created a dedicated index pattern matching only the shrunk indices, and the response times are still long.

Will I only see the difference once all my indices have been shrunk, and not before? In other words, does the response time of a request depend on the total number of shards in the cluster, and/or on the number of shards of the indices matched by the index pattern?

Is there anything else I can do to improve search response time, other than reducing the number of shards?

By the way, I have a related question:
when using Kibana's Discover page with a particular index pattern, are all the indices matching the pattern searched, or is there a way to restrict the search very early on to 1) the indices matched by the index pattern and 2) those falling within the time window defined by Kibana's time picker?
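To illustrate, this is the kind of early restriction I have in mind, written by hand (the index wildcard and the @timestamp field are only examples, based on a default Logstash setup):

```shell
# Search only the March 2018 daily indices, with an explicit time-range filter:
curl -XGET "http://localhost:9200/logstash-2018.03.*/_search?pretty" \
  -H 'Content-Type: application/json' \
  -d '
{
  "size": 0,
  "query": {
    "range": { "@timestamp": { "gte": "2018-03-01", "lt": "2018-03-08" } }
  }
}'
```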

Thank you for your help.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.