Hi!
I have set up my Elastic Stack (Logstash, Elasticsearch, Kibana) on a single node hosted on a "monster": a Red Hat 7.3 server with a 48-core CPU, 64 GB of RAM and 6 TB of disk.
You'll find the output of the cluster stats API below.
For the moment, I create 2 indices on a daily basis:
- one of around 500 MB and 1 million documents
- the other of around 1.5 GB and 5 million documents
When sending requests through the Kibana interface, I frequently face performance issues:
- For example, when using the Kibana "Discover" page on 1 month of data (+/- 150 million documents), I may get a timeout and receive this message: "Discover: Request Timeout after 30000ms" (especially when the node is busy indexing).
This error appears even though I tried to "customize" some Kibana parameters (see below).
- Another error I see, after loading a dashboard with multiple visualization elements (around 15: tables, Timelion charts, bar histograms...), is this one: "Courier Fetch: 5 of 995 shards failed".
I didn't change the Kibana configuration file, nor the Elasticsearch one (apart from some values that have nothing to do with performance).
My questions are the following:
A) Is there any configuration parameter I should change to improve performance?
(I have a rather big server: while indexing and loading a dashboard, total CPU usage across the 48 cores rarely exceeds 900%, and disk I/O does not seem to be the bottleneck.)
B) Do I need to set up multiple nodes on the same server to see real improvements?
And if so, should I set up separate virtual machines for this?
Thank you for your help !
Samia.
############## KIBANA Modified params
- visualization:loadingDelay : 2 s --------> 5 s
- notifications:lifetime:banner : 3 000 000 ms --------> 3 000 000 000 ms
- notifications:lifetime:error : 300 000 ms --------> 300 000 000 ms
- notifications:lifetime:warning : 10 000 ms --------> 10 000 000 ms
- notifications:lifetime:info : 5 000 ms --------> 5 000 000 ms
- timelion:max_buckets : 2 000 --------> 86 400
- discover:sampleSize : 500 --------> 25
############## Cluster stats :
curl -XGET "http://localhost:9200/_cluster/stats?human&pretty"
{
"_nodes" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"cluster_name" : "afnic-bigdata",
"timestamp" : 1521559823361,
"status" : "yellow",
"indices" : {
"count" : 334,
"shards" : {
"total" : 1662,
"primaries" : 1662,
"replication" : 0.0,
"index" : {
"shards" : {
"min" : 1,
"max" : 5,
"avg" : 4.976047904191617
},
"primaries" : {
"min" : 1,
"max" : 5,
"avg" : 4.976047904191617
},
"replication" : {
"min" : 0.0,
"max" : 0.0,
"avg" : 0.0
}
}
},
"docs" : {
"count" : 685606639,
"deleted" : 4
},
"store" : {
"size" : "248.7gb",
"size_in_bytes" : 267056651055
},
"fielddata" : {
"memory_size" : "74.6mb",
"memory_size_in_bytes" : 78273712,
"evictions" : 0
},
"query_cache" : {
"memory_size" : "58.9mb",
"memory_size_in_bytes" : 61849365,
"total_count" : 267318,
"hit_count" : 89993,
"miss_count" : 177325,
"cache_size" : 9518,
"cache_count" : 9518,
"evictions" : 0
},
"completion" : {
"size" : "0b",
"size_in_bytes" : 0
},
"segments" : {
"count" : 19403,
"memory" : "1gb",
"memory_in_bytes" : 1139377503,
"terms_memory" : "925.4mb",
"terms_memory_in_bytes" : 970388727,
"stored_fields_memory" : "100.5mb",
"stored_fields_memory_in_bytes" : 105433776,
"term_vectors_memory" : "0b",
"term_vectors_memory_in_bytes" : 0,
"norms_memory" : "7.7mb",
"norms_memory_in_bytes" : 8077568,
"points_memory" : "30.8mb",
"points_memory_in_bytes" : 32325748,
"doc_values_memory" : "22mb",
"doc_values_memory_in_bytes" : 23151684,
"index_writer_memory" : "0b",
"index_writer_memory_in_bytes" : 0,
"version_map_memory" : "0b",
"version_map_memory_in_bytes" : 0,
"fixed_bit_set" : "0b",
"fixed_bit_set_memory_in_bytes" : 0,
"max_unsafe_auto_id_timestamp" : 1521427810278,
"file_sizes" : { }
}
},
"nodes" : {
"count" : {
"total" : 1,
"data" : 1,
"coordinating_only" : 0,
"master" : 1,
"ingest" : 1
},
"versions" : [
"6.1.1"
],
"os" : {
"available_processors" : 48,
"allocated_processors" : 48,
"names" : [
{
"name" : "Linux",
"count" : 1
}
],
"mem" : {
"total" : "62.6gb",
"total_in_bytes" : 67266506752,
"free" : "2.2gb",
"free_in_bytes" : 2367606784,
"used" : "60.4gb",
"used_in_bytes" : 64898899968,
"free_percent" : 4,
"used_percent" : 96
}
},
"process" : {
"cpu" : {
"percent" : 5
},
"open_file_descriptors" : {
"min" : 4297,
"max" : 4297,
"avg" : 4297
}
},
"jvm" : {
"max_uptime" : "10.5h",
"max_uptime_in_millis" : 37819978,
"versions" : [
{
"version" : "1.8.0_131",
"vm_name" : "OpenJDK 64-Bit Server VM",
"vm_version" : "25.131-b12",
"vm_vendor" : "Oracle Corporation",
"count" : 1
}
],
"mem" : {
"heap_used" : "3.4gb",
"heap_used_in_bytes" : 3710324344,
"heap_max" : "3.8gb",
"heap_max_in_bytes" : 4151836672
},
"threads" : 458
},
"fs" : {
"total" : "3.5tb",
"total_in_bytes" : 3935713746944,
"free" : "3.3tb",
"free_in_bytes" : 3667577057280,
"available" : "3.1tb",
"available_in_bytes" : 3467629658112
},
"plugins" : [ ],
"network_types" : {
"transport_types" : {
"netty4" : 1
},
"http_types" : {
"netty4" : 1
}
}
}
}
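
To make the output above easier to read at a glance, here is a minimal Python sketch that derives a few ratios from it: heap usage, heap size relative to total RAM, average shard size and query cache hit ratio. The numbers are copied from the stats above; the flat dictionary keys and the `derive_ratios` helper are only illustrative names of mine, not part of any Elasticsearch API.

```python
# Figures copied by hand from the _cluster/stats output above.
# The flat key names are hypothetical; they are not Elasticsearch field paths.
stats = {
    "shards_total": 1662,
    "store_size_in_bytes": 267056651055,
    "heap_used_in_bytes": 3710324344,
    "heap_max_in_bytes": 4151836672,
    "os_mem_total_in_bytes": 67266506752,
    "query_cache_hit_count": 89993,
    "query_cache_total_count": 267318,
}


def derive_ratios(s):
    """Return a few simple ratios computed from the copied figures."""
    return {
        # Fraction of the JVM heap in use when the stats were taken.
        "heap_used_ratio": s["heap_used_in_bytes"] / s["heap_max_in_bytes"],
        # JVM heap size as a fraction of the server's total RAM.
        "heap_to_ram_ratio": s["heap_max_in_bytes"] / s["os_mem_total_in_bytes"],
        # Average on-disk size per shard, in MiB.
        "avg_shard_size_mib": s["store_size_in_bytes"] / s["shards_total"] / 2**20,
        # Share of query cache lookups that were hits.
        "query_cache_hit_ratio": s["query_cache_hit_count"] / s["query_cache_total_count"],
    }


if __name__ == "__main__":
    for name, value in derive_ratios(stats).items():
        print(f"{name}: {value:.3f}")
```

This only restates what the stats already contain (for instance a heap_max of 3.8gb on a server with 62.6gb of RAM, and 1662 shards for 248.7gb of data); it does not measure anything new.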