Hi @jdbcplusnet,
Take a look in Stack Monitoring | Overview. That should show you your search rate and your search latency, which should give you an idea of whether your searches are the bottleneck. Then jump over to the Nodes tab and look at the CPU, load average, and JVM heap usage for all the nodes, and see if anything looks off. You can click into a node for more detailed information. Finally, hopefully your Kibana instances are in Stack Monitoring as well; if they are, click into them and see if you spot any performance problems. You might need to increase resources at any of the points mentioned above. A few thoughts:
1. If your shards are too large (>100GB) they can slow down query performance. Ideally shoot for around 50GB per shard (a quick way to check shard sizes is shown after this list).
2. Sometimes the client that is being used can be the bottleneck. We had a Raspberry Pi reloading a dashboard a long time ago and eventually had to stop because it just couldn't handle the rendering load.
3. Also, as you know, 6.8.7 is EOL (end of life), so upgrading to the latest supported version you can is recommended.
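For the shard-size point, here is a rough sketch of how you could check on-disk size per shard from the command line. It assumes curl on a host that can reach the cluster on port 9200; adjust the URL and add authentication as needed for your setup:

```sh
# List shards sorted by store size (largest first), in GB.
# Anything much over ~50GB per shard is worth a closer look.
curl -s "http://localhost:9200/_cat/shards?v&h=index,shard,prirep,store&s=store:desc&bytes=gb" | head -n 20
```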
Could you explain point 2 further ("Sometimes the client that is being used can be the bottleneck")?
We have Kibana dashboards / visualizations refreshing Elasticsearch data every 30 seconds and several Spring Boot microservices writing to and reading from Elasticsearch.
Sure, this is more anecdotal, but about 4 years ago we had a Raspberry Pi (Model A) loading a pretty big dashboard every 30 seconds or so as well. As you can imagine, the little computer wasn't really powerful enough to handle that. "Client" in this case means the web browser. It had nothing really to do with the dashboard itself (which returned results quickly when viewed on a more traditional computer). I'm just trying to say that if you have a crazy big dashboard and are using an older machine like a Raspberry Pi (Model A), you might see degraded performance, but that has to do with the client machine more than anything else.
I can't really speak to the Node.js idea, but I am curious what the numbers look like in Stack Monitoring when these slow queries are running. Could you provide that information? A final thought would be to look at what Elastic says about running Kibana in a production environment.
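If pulling those numbers out of the Stack Monitoring UI is awkward, Kibana also exposes similar metrics on its status endpoint. A minimal sketch, assuming Kibana is reachable on localhost:5601 and that jq is installed (adjust host, port, and authentication for your setup):

```sh
# Kibana's status API returns process memory, response times,
# request counts, and concurrent connections.
curl -s "http://localhost:5601/api/status" | jq '.metrics'
```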
Nice, that's good information, thanks! It seems like there is a clear relationship between HTTP connections and client response time, and it doesn't look like Kibana memory is a factor. The next step, I think, would be to see what the Cluster Overview looks like during the same time period. I'd like to see what the four big metrics look like: Search Rate, Search Latency, Indexing Rate, Indexing Latency. If there are no smoking guns there, we'll then want to look at each node in the cluster for that time period (from the Nodes tab in Stack Monitoring). Basically we want to make sure all the nodes in the cluster have the resources they need. Maybe there is one node that has a constraint and is slowing things down when the load goes up.
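For what it's worth, here is a sketch of a quick command-line check of the same per-node numbers (CPU, load, heap) outside of the Stack Monitoring UI. Again this assumes curl against localhost:9200, so adjust the URL and authentication for your cluster:

```sh
# One line per node: roles, CPU %, 1/5/15-minute load averages, and JVM heap %.
# A single node running much hotter than the others is the kind of constraint to look for.
curl -s "http://localhost:9200/_cat/nodes?v&h=name,node.role,master,cpu,load_1m,load_5m,load_15m,heap.percent"
```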
Ok, so I don't see anything too off about those statistics, but a few thoughts:
It looks like you are running 3 nodes, so your master and data nodes are running on the same hosts. If so, you should consider separating those out and letting your master nodes be fully dedicated to that purpose (see the config sketch below).
Besides the advice above, I'd take a look at the slow logs as well to see if that turns anything up (sketch below).
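On the dedicated-master point, here is a minimal elasticsearch.yml sketch of what a dedicated master node could look like. The exact setting names depend on your version, so treat this as an illustration rather than a drop-in config:

```yaml
# 6.x / early 7.x style: a master-eligible node that holds no data and does no ingest.
node.master: true
node.data: false
node.ingest: false

# On 7.9+ the equivalent is the node.roles setting instead:
# node.roles: [ master ]
```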
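And for the slow logs, a sketch of turning on the search slow log for one index via the settings API. The index name and thresholds here are just placeholders; pick values a bit above your normal query times:

```sh
# Log queries and fetches that exceed the thresholds below into the search slow log.
curl -s -X PUT "http://localhost:9200/my-index/_settings" \
  -H 'Content-Type: application/json' \
  -d '{
    "index.search.slowlog.threshold.query.warn": "10s",
    "index.search.slowlog.threshold.query.info": "2s",
    "index.search.slowlog.threshold.fetch.warn": "1s"
  }'
```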