I have a single-node Elasticsearch cluster with replication turned off and a significant amount of data in it (about 300M records). My visualizations load, although slowly. When I open dashboards, however, they not only take much longer (understandably, since a dashboard fans out queries to all of its underlying visualizations), they also error out and put Elasticsearch into a red status. The machine is fairly powerful: 8C/64GB/4.8TB SSD. What part of the configuration would I have to change to fix this? I am assuming 300M records is not too much for Elasticsearch to handle on one node, so something else must be wrong. Any advice is welcome.
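The stats below came from the cluster stats API; for reference, the call looks roughly like this (a sketch, assuming a local node on the default port 9200; the human flag is what adds the readable size fields next to the raw byte counts):

curl -s 'http://localhost:9200/_cluster/stats?human&pretty'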
{
  "_nodes": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "cluster_name": "dataseers",
  "timestamp": 1531601516167,
  "status": "yellow",
  "indices": {
    "count": 33,
    "shards": {
      "total": 45,
      "primaries": 45,
      "replication": 0,
      "index": {
        "shards": {
          "min": 1,
          "max": 5,
          "avg": 1.3636363636363635
        },
        "primaries": {
          "min": 1,
          "max": 5,
          "avg": 1.3636363636363635
        },
        "replication": {
          "min": 0,
          "max": 0,
          "avg": 0
        }
      }
    },
    "docs": {
      "count": 390889190,
      "deleted": 61556174
    },
    "store": {
      "size": "337.8gb",
      "size_in_bytes": 362790191389
    },
    "fielddata": {
      "memory_size": "5.5mb",
      "memory_size_in_bytes": 5794136,
      "evictions": 0
    },
    "query_cache": {
      "memory_size": "0b",
      "memory_size_in_bytes": 0,
      "total_count": 0,
      "hit_count": 0,
      "miss_count": 0,
      "cache_size": 0,
      "cache_count": 0,
      "evictions": 0
    },
    "completion": {
      "size": "0b",
      "size_in_bytes": 0
    },
    "segments": {
      "count": 566,
      "memory": "809.3mb",
      "memory_in_bytes": 848656687,
      "terms_memory": "650mb",
      "terms_memory_in_bytes": 681591121,
      "stored_fields_memory": "118.8mb",
      "stored_fields_memory_in_bytes": 124644680,
      "term_vectors_memory": "0b",
      "term_vectors_memory_in_bytes": 0,
      "norms_memory": "1mb",
      "norms_memory_in_bytes": 1053824,
      "points_memory": "34.9mb",
      "points_memory_in_bytes": 36686758,
      "doc_values_memory": "4.4mb",
      "doc_values_memory_in_bytes": 4680304,
      "index_writer_memory": "4.9mb",
      "index_writer_memory_in_bytes": 5165080,
      "version_map_memory": "1.5mb",
      "version_map_memory_in_bytes": 1654897,
      "fixed_bit_set": "12.6kb",
      "fixed_bit_set_memory_in_bytes": 12960,
      "max_unsafe_auto_id_timestamp": 1531531162502,
      "file_sizes": {}
    }
  },
  "nodes": {
    "count": {
      "total": 1,
      "data": 1,
      "coordinating_only": 0,
      "master": 1,
      "ingest": 1
    },
    "versions": [
      "6.3.1"
    ],
    "os": {
      "available_processors": 16,
      "allocated_processors": 16,
      "names": [
        {
          "name": "Linux",
          "count": 1
        }
      ],
      "mem": {
        "total": "62.5gb",
        "total_in_bytes": 67126927360,
        "free": "37.9gb",
        "free_in_bytes": 40721190912,
        "used": "24.5gb",
        "used_in_bytes": 26405736448,
        "free_percent": 61,
        "used_percent": 39
      }
    },
    "process": {
      "cpu": {
        "percent": 0
      },
      "open_file_descriptors": {
        "min": 927,
        "max": 927,
        "avg": 927
      }
    },
    "jvm": {
      "max_uptime": "19.5h",
      "max_uptime_in_millis": 70368444,
      "versions": [
        {
          "version": "1.8.0_171",
          "vm_name": "OpenJDK 64-Bit Server VM",
          "vm_version": "25.171-b10",
          "vm_vendor": "Oracle Corporation",
          "count": 1
        }
      ],
      "mem": {
        "heap_used": "4.6gb",
        "heap_used_in_bytes": 4962910312,
        "heap_max": "7.8gb",
        "heap_max_in_bytes": 8476557312
      },
      "threads": 229
    },
    "fs": {
      "total": "4.2tb",
      "total_in_bytes": 4708418715648,
      "free": "3.7tb",
      "free_in_bytes": 4165199818752,
      "available": "3.7tb",
      "available_in_bytes": 4165199818752
    },
    "plugins": [],
    "network_types": {
      "transport_types": {
        "security4": 1
      },
      "http_types": {
        "security4": 1
      }
    }
  }
}
That looks fine. Is there anything in the Elasticsearch logs?
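On a default deb/rpm package install the log file is named after the cluster under /var/log/elasticsearch/, so given the cluster name above it would be something like:

tail -f /var/log/elasticsearch/dataseers.log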
Unfortunately it has not broken since I posted this. I am trying to see whether multiple users hitting the system at the same time is what triggers it. I will post the logs as soon as it fails.
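For when it happens again, two endpoints can show whether concurrent searches are piling up (a sketch, assuming the same local node; both exist in 6.x):

# Search thread pool pressure: active, queued and rejected requests
curl -s 'http://localhost:9200/_cat/thread_pool/search?v&h=node_name,active,queue,rejected'

# A dump of what the node's busiest threads are doing right now
curl -s 'http://localhost:9200/_nodes/hot_threads'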
It broke again. Logs below:
[2018-07-25T10:02:46,072][ERROR][o.e.x.w.i.s.ExecutableSearchInput] [node-1] failed to execute [search] input for watch [J8OVNfqLSe2NW0TOJPOhzw_logstash_version_mismatch], reason [java.util.concurrent.TimeoutException: Timeout waiting for task.]
[2018-07-25T10:02:46,072][ERROR][o.e.x.w.i.s.ExecutableSearchInput] [node-1] failed to execute [search] input for watch [J8OVNfqLSe2NW0TOJPOhzw_kibana_version_mismatch], reason [java.util.concurrent.TimeoutException: Timeout waiting for task.]
[2018-07-25T10:02:46,072][ERROR][o.e.x.w.i.s.ExecutableSearchInput] [node-1] failed to execute [search] input for watch [J8OVNfqLSe2NW0TOJPOhzw_elasticsearch_nodes], reason [java.util.concurrent.TimeoutException: Timeout waiting for task.]
[2018-07-25T10:02:46,072][ERROR][o.e.x.w.i.s.ExecutableSearchInput] [node-1] failed to execute [search] input for watch [J8OVNfqLSe2NW0TOJPOhzw_xpack_license_expiration], reason [java.util.concurrent.TimeoutException: Timeout waiting for task.]
[2018-07-25T10:02:46,072][ERROR][o.e.x.w.i.s.ExecutableSearchInput] [node-1] failed to execute [search] input for watch [J8OVNfqLSe2NW0TOJPOhzw_elasticsearch_cluster_status], reason [java.util.concurrent.TimeoutException: Timeout waiting for task.]
[2018-07-25T10:02:46,072][ERROR][o.e.x.w.i.s.ExecutableSearchInput] [node-1] failed to execute [search] input for watch [J8OVNfqLSe2NW0TOJPOhzw_elasticsearch_version_mismatch], reason [java.util.concurrent.TimeoutException: Timeout waiting for task.]
[2018-07-25T10:02:46,072][ERROR][o.e.x.m.c.n.NodeStatsCollector] [node-1] collector [node_stats] timed out when collecting data
[2018-07-25T10:02:46,076][WARN ][o.e.x.w.e.ExecutionService] [node-1] failed to execute watch [J8OVNfqLSe2NW0TOJPOhzw_logstash_version_mismatch]
[2018-07-25T10:02:46,086][WARN ][o.e.m.j.JvmGcMonitorService] [node-1] [gc][old][1050][14] duration [26.5s], collections [1]/[27.1s], total [26.5s]/[4.8m], memory [15.8gb]->[15.8gb]/[15.8gb], all_pools {[young] [865.3mb]->[865.3mb]/[865.3mb]}{[survivor] [51.3mb]->[57.2mb]/[108.1mb]}{[old] [14.9gb]->[14.9gb]/[14.9gb]}
[2018-07-25T10:02:46,086][WARN ][o.e.m.j.JvmGcMonitorService] [node-1] [gc][1050] overhead, spent [26.5s] collecting in the last [27.1s]
[2018-07-25T10:03:09,939][ERROR][o.e.x.m.c.i.IndexStatsCollector] [node-1] collector [index-stats] timed out when collecting data
[2018-07-25T10:03:09,946][WARN ][o.e.m.j.JvmGcMonitorService] [node-1] [gc][old][1051][15] duration [23.5s], collections [1]/[23.8s], total [23.5s]/[5.2m], memory [15.8gb]->[15.8gb]/[15.8gb], all_pools {[young] [865.3mb]->[865.3mb]/[865.3mb]}{[survivor] [57.2mb]->[74.2mb]/[108.1mb]}{[old] [14.9gb]->[14.9gb]/[14.9gb]}
[2018-07-25T10:03:09,946][WARN ][o.e.m.j.JvmGcMonitorService] [node-1] [gc][1051] overhead, spent [23.5s] collecting in the last [23.8s]
[2018-07-25T10:03:36,701][ERROR][o.e.x.w.i.s.ExecutableSearchInput] [node-1] failed to execute [search] input for watch [J8OVNfqLSe2NW0TOJPOhzw_kibana_version_mismatch], reason [java.util.concurrent.TimeoutException: Timeout waiting for task.]
[2018-07-25T10:03:36,701][ERROR][o.e.x.w.i.s.ExecutableSearchInput] [node-1] failed to execute [search] input for watch [J8OVNfqLSe2NW0TOJPOhzw_logstash_version_mismatch], reason [java.util.concurrent.TimeoutException: Timeout waiting for task.]
[2018-07-25T10:03:36,701][ERROR][o.e.x.w.i.s.ExecutableSearchInput] [node-1] failed to execute [search] input for watch [J8OVNfqLSe2NW0TOJPOhzw_xpack_license_expiration], reason [java.util.concurrent.TimeoutException: Timeout waiting for task.]
[2018-07-25T10:03:36,702][ERROR][o.e.x.w.i.s.ExecutableSearchInput] [node-1] failed to execute [search] input for watch [J8OVNfqLSe2NW0TOJPOhzw_elasticsearch_nodes], reason [java.util.concurrent.TimeoutException: Timeout waiting for task.]
[2018-07-25T10:03:36,702][ERROR][o.e.x.w.i.s.ExecutableSearchInput] [node-1] failed to execute [search] input for watch [J8OVNfqLSe2NW0TOJPOhzw_elasticsearch_version_mismatch], reason [java.util.concurrent.TimeoutException: Timeout waiting for task.]
[2018-07-25T10:03:36,703][ERROR][o.e.x.m.c.c.ClusterStatsCollector] [node-1] collector [cluster_stats] timed out when collecting data
[2018-07-25T10:03:36,702][ERROR][o.e.x.w.i.s.ExecutableSearchInput] [node-1] failed to execute [search] input for watch [J8OVNfqLSe2NW0TOJPOhzw_elasticsearch_cluster_status], reason [java.util.concurrent.TimeoutException: Timeout waiting for task.]
[2018-07-25T10:03:36,704][WARN ][o.e.m.j.JvmGcMonitorService] [node-1] [gc][old][1052][16] duration [26.5s], collections [1]/[26.7s], total [26.5s]/[5.6m], memory [15.8gb]->[15.8gb]/[15.8gb], all_pools {[young] [865.3mb]->[865.3mb]/[865.3mb]}{[survivor] [74.2mb]->[84.8mb]/[108.1mb]}{[old] [14.9gb]->[14.9gb]/[14.9gb]}
[2018-07-25T10:03:36,705][WARN ][o.e.m.j.JvmGcMonitorService] [node-1] [gc][1052] overhead, spent [26.5s] collecting in the last [26.7s]
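The GC lines explain the outage: the old generation is pinned at [14.9gb]->[14.9gb]/[14.9gb] and the JVM spent 26.5s of a 27.1s window collecting, i.e. the heap is completely full and the node is stalled in garbage collection, which is why the watches and monitoring collectors time out. (Note the heap here is 15.8gb, up from the 7.8gb in the stats above, so it had apparently already been raised once.) With 64GB of RAM, the usual guidance is to give the heap up to about half of RAM while staying below the ~32GB compressed-oops cutoff, set min and max equal, and leave the rest to the filesystem cache. A sketch of the relevant jvm.options settings (path and values are illustrative assumptions, not taken from this cluster):

# /etc/elasticsearch/jvm.options
# Equal min and max heap, below the ~32GB compressed-oops threshold;
# the remaining RAM stays available for the OS page cache.
-Xms26g
-Xmx26g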