Elastic/Kibana performance issues

samia · March 20, 2018, 4:31pm

Hi !

I have set up my Elastic Stack (Logstash, Elasticsearch, Kibana) on a single node hosted on a "monster": a Red Hat 7.3 server with a 48-core CPU, 64 GB of RAM and 6 TB of disk.
You'll find the output of the cluster stats API below.

For the moment, I create 2 indices on a daily basis :

one of around 500 MB and 1 million documents
the other of around 1.5 GB and 5 million documents

When sending requests through the Kibana interface, I frequently face performance issues:

For example, when using the Kibana "Discover" page on 1 month of data (about 150 million documents), I sometimes get a timeout with the message "Discover: Request Timeout after 30000ms" (especially when the node is busy indexing).

This error appears even though I tried to "customize" some Kibana parameters (see below).

Another kind of error, which I get after loading a dashboard with multiple visualization elements (around 15: tables, Timelion charts, bar histograms...), is this one: "Courier Fetch: 5 of 995 shards failed".

I changed neither the Kibana configuration file nor the Elasticsearch one (apart from some values that have nothing to do with performance).

My questions are the following:

A) Is there any configuration parameter I can change to improve performance?
(I have a rather big server. While indexing and loading a dashboard, total CPU usage across the 48 cores rarely exceeds 900%, and disk I/O does not seem to be the bottleneck.)

B) Do I need to set up multiple nodes on the same server to see a real improvement?
And if so, should I set up separate virtual machines for this?

Thank you for your help !

Samia.

############## KIBANA Modified params

visualization:loadingDelay : 2 s ---> 5 s
notifications:lifetime:banner : 3 000 000 ms ---> 3 000 000 000 ms
notifications:lifetime:error : 300 000 ms ---> 300 000 000 ms
notifications:lifetime:warning : 10 000 ms ---> 10 000 000 ms
notifications:lifetime:info : 5 000 ms ---> 5 000 000 ms
timelion:max_buckets : 2 000 ---> 86 400
discover:sampleSize : 500 ---> 25

############## Cluster stats :

curl -XGET "http://localhost:9200/_cluster/stats?human&pretty"
{
"_nodes" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"cluster_name" : "afnic-bigdata",
"timestamp" : 1521559823361,
"status" : "yellow",
"indices" : {
"count" : 334,
"shards" : {
"total" : 1662,
"primaries" : 1662,
"replication" : 0.0,
"index" : {
"shards" : {
"min" : 1,
"max" : 5,
"avg" : 4.976047904191617
},
"primaries" : {
"min" : 1,
"max" : 5,
"avg" : 4.976047904191617
},
"replication" : {
"min" : 0.0,
"max" : 0.0,
"avg" : 0.0
}
}
},
"docs" : {
"count" : 685606639,
"deleted" : 4
},
"store" : {
"size" : "248.7gb",
"size_in_bytes" : 267056651055
},
"fielddata" : {
"memory_size" : "74.6mb",
"memory_size_in_bytes" : 78273712,
"evictions" : 0
},
"query_cache" : {
"memory_size" : "58.9mb",
"memory_size_in_bytes" : 61849365,
"total_count" : 267318,
"hit_count" : 89993,
"miss_count" : 177325,
"cache_size" : 9518,
"cache_count" : 9518,
"evictions" : 0
},
"completion" : {
"size" : "0b",
"size_in_bytes" : 0
},
"segments" : {
"count" : 19403,
"memory" : "1gb",
"memory_in_bytes" : 1139377503,
"terms_memory" : "925.4mb",
"terms_memory_in_bytes" : 970388727,
"stored_fields_memory" : "100.5mb",
"stored_fields_memory_in_bytes" : 105433776,
"term_vectors_memory" : "0b",
"term_vectors_memory_in_bytes" : 0,
"norms_memory" : "7.7mb",
"norms_memory_in_bytes" : 8077568,
"points_memory" : "30.8mb",
"points_memory_in_bytes" : 32325748,
"doc_values_memory" : "22mb",
"doc_values_memory_in_bytes" : 23151684,
"index_writer_memory" : "0b",
"index_writer_memory_in_bytes" : 0,
"version_map_memory" : "0b",
"version_map_memory_in_bytes" : 0,
"fixed_bit_set" : "0b",
"fixed_bit_set_memory_in_bytes" : 0,
"max_unsafe_auto_id_timestamp" : 1521427810278,
"file_sizes" : { }
}
},
"nodes" : {
"count" : {
"total" : 1,
"data" : 1,
"coordinating_only" : 0,
"master" : 1,
"ingest" : 1
},
"versions" : [
"6.1.1"
],
"os" : {
"available_processors" : 48,
"allocated_processors" : 48,
"names" : [
{
"name" : "Linux",
"count" : 1
}
],
"mem" : {
"total" : "62.6gb",
"total_in_bytes" : 67266506752,
"free" : "2.2gb",
"free_in_bytes" : 2367606784,
"used" : "60.4gb",
"used_in_bytes" : 64898899968,
"free_percent" : 4,
"used_percent" : 96
}
},
"process" : {
"cpu" : {
"percent" : 5
},
"open_file_descriptors" : {
"min" : 4297,
"max" : 4297,
"avg" : 4297
}
},
"jvm" : {
"max_uptime" : "10.5h",
"max_uptime_in_millis" : 37819978,
"versions" : [
{
"version" : "1.8.0_131",
"vm_name" : "OpenJDK 64-Bit Server VM",
"vm_version" : "25.131-b12",
"vm_vendor" : "Oracle Corporation",
"count" : 1
}
],
"mem" : {
"heap_used" : "3.4gb",
"heap_used_in_bytes" : 3710324344,
"heap_max" : "3.8gb",
"heap_max_in_bytes" : 4151836672
},
"threads" : 458
},
"fs" : {
"total" : "3.5tb",
"total_in_bytes" : 3935713746944,
"free" : "3.3tb",
"free_in_bytes" : 3667577057280,
"available" : "3.1tb",
"available_in_bytes" : 3467629658112
},
"plugins" : [ ],
"network_types" : {
"transport_types" : {
"netty4" : 1
},
"http_types" : {
"netty4" : 1
}
}
}
}
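
A few ratios can be read off this output by simple division. The sketch below copies the relevant figures from the response into a dictionary (the key names and the `summarize` helper are made up for illustration, not part of any Elasticsearch client) and derives the average shard size and the heap usage. How to interpret the numbers is a judgment call.

```python
# Figures copied from the _cluster/stats output above.
# The dictionary keys are illustrative names, not Elasticsearch field names.
stats = {
    "shards": 1662,
    "docs": 685606639,
    "store_bytes": 267056651055,
    "segments": 19403,
    "heap_used_bytes": 3710324344,
    "heap_max_bytes": 4151836672,
}


def summarize(s):
    """Derive per-shard averages and heap usage from the raw figures."""
    mib = 1024 ** 2
    return {
        "avg_shard_mib": s["store_bytes"] / s["shards"] / mib,
        "avg_docs_per_shard": s["docs"] / s["shards"],
        "avg_segments_per_shard": s["segments"] / s["shards"],
        "heap_used_ratio": s["heap_used_bytes"] / s["heap_max_bytes"],
    }


summary = summarize(stats)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With these figures the average shard holds roughly 153 MiB, and the JVM heap (3.8 GB maximum on a machine with 62.6 GB of RAM) is about 89% full.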

warkolm · March 22, 2018, 5:56am

You probably have too many shards, which will be placing memory pressure on the node.
Look to _shrink them and then change your indexing strategy to use fewer shards, or move to weekly/monthly indices.
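
As a rough sketch of what that could look like, the helper below only builds the REST calls (method, path, JSON body) for shrinking one index and for an index template that gives new indices a single shard. It does not send anything. The index names, the `logs-*` pattern, the function names and the choice of zero replicas (assumed here because this is a single-node cluster) are all illustrative assumptions, not taken from the thread.

```python
import json


def shrink_requests(source, target, shards=1):
    """Return the (method, path, body) calls needed to shrink one index.

    The source index must be write-blocked before it can be shrunk, and
    the target shard count must be a factor of the source shard count.
    """
    return [
        # 1. Block writes on the source index.
        ("PUT", f"/{source}/_settings",
         {"settings": {"index.blocks.write": True}}),
        # 2. Shrink into a new index with fewer primary shards.
        ("POST", f"/{source}/_shrink/{target}",
         {"settings": {"index.number_of_shards": shards,
                       "index.number_of_replicas": 0}}),
    ]


def one_shard_template(pattern):
    """Body for PUT /_template/<name> so that new indices get one shard."""
    return {"index_patterns": [pattern],
            "settings": {"number_of_shards": 1, "number_of_replicas": 0}}


# Hypothetical index names, for illustration only.
for method, path, body in shrink_requests("logs-2018.03.01",
                                          "logs-2018.03.01-shrunk"):
    print(method, path, json.dumps(body))
print(json.dumps(one_shard_template("logs-*")))
```

The printed lines can be replayed with curl or the Kibana Dev Tools console. The template only affects indices created after it is installed, so existing daily indices still need to be shrunk or reindexed.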

samia · March 22, 2018, 1:55pm

Hello,

Thank you for your reply.

I shrank one month of indices (from 5 shards to 1 each). However, when viewing the results through the Discover page of Kibana, I didn't see much of a change. I created a dedicated index pattern matching only the shrunken indices, and the response times are still long.

Will I only see a difference once all my indices have been shrunk, and not before? That is, does the response time of a request depend on the total number of shards in the cluster, and/or on the number of shards in the indices matched by the index pattern?

Is there anything else I can do to improve search response time, other than reducing the number of shards?

By the way, I have another related question:
when using the Discover page of Kibana for a particular index pattern, are all the indices matching the pattern searched, or is there a way to restrict the search very early in the process to 1) the indices matched by the index pattern and 2) those that fall within the time window defined by the Kibana time picker?
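
Independently of what Kibana does internally, a search can be limited by hand to the daily indices that overlap a time window, because Elasticsearch accepts a comma-separated list of index names in the search path. A minimal sketch, assuming daily indices named with a hypothetical `logs-` prefix followed by `YYYY.MM.dd` (the real index names are not given in this thread):

```python
from datetime import date, timedelta


def daily_indices(prefix, start, end):
    """List the daily index names covering start..end, both inclusive."""
    days = (end - start).days
    return [f"{prefix}{(start + timedelta(days=d)):%Y.%m.%d}"
            for d in range(days + 1)]


# Hypothetical prefix and dates, for illustration only.
names = daily_indices("logs-", date(2018, 3, 18), date(2018, 3, 20))
search_path = "/" + ",".join(names) + "/_search"
print(search_path)
```

The resulting path targets three indices instead of every index matching `logs-*`, which makes it possible to compare response times with and without the restriction.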

Thank you for your help.

system · April 19, 2018, 1:55pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
