I have 2 physical machines on which I run Elasticsearch and Kibana:
120 - Master Node and a Data Node
121 - Data Node
I have 8 cores, 64GB RAM and 4TB of hard drive space on them.
The cluster is set up and the indices are replicated on both 120 and 121.
My issue is with Kibana. Should Kibana run on both 120 and 121, or on just one of them? I also get 30000ms timeout errors, because I have approximately 3 billion transactions in Elasticsearch that people run searches and dashboards on. Quite often, especially if 2 or more users are doing something at the same time, the system times out. How do I solve this issue?
So yours is mainly a read-only use case. Merge your segments to a reasonable degree after loading; merging down to a single segment will probably take too long with 64GB RAM and an HDD.
You could increase the search timeout (if acceptable)
If not, then you need to either optimise your mappings/indices or add more hardware
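As a rough illustration of the merge suggestion, the force merge API takes a `max_num_segments` parameter, so you can stop at a handful of segments rather than going all the way to one. The sketch below only builds the request line; the index name `seed-index` and the value 5 are made up for the example and would need tuning for your data.

```python
from typing import Tuple
from urllib.parse import quote


def forcemerge_request(index: str, max_num_segments: int) -> Tuple[str, str]:
    """Return the (HTTP method, path) pair for a force merge call."""
    if max_num_segments < 1:
        raise ValueError("max_num_segments must be at least 1")
    # Percent-encode the index name so it is safe to place in a URL path.
    path = f"/{quote(index, safe='')}/_forcemerge?max_num_segments={max_num_segments}"
    return "POST", path


# "seed-index" is a hypothetical index name; 5 segments is an illustrative target.
method, path = forcemerge_request("seed-index", 5)
print(method, path)
```

Send the resulting request with whatever client you normally use (Kibana Dev Tools, curl, and so on), and only once the index is no longer being written to.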
My seed index is about 800GB. It's the largest one and has 1.1 billion transactions. The rest are around 10 million transactions each per day and are about 10GB in size.
At the moment I only have 2 indices in total. The shard settings are at the defaults, so I'm not entirely sure.
I can see that you have one index with a single primary shard of 894.1GB. The replica is the same size. This is likely going to be very slow to query, as each query is single-threaded against each shard. I would recommend you reindex this into a new index with e.g. 30 primary shards. When you do so, I would recommend you also set number_of_routing_shards to e.g. 120 so that you can later use the split index API if needed. This assumes you are on a reasonably recent version.
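A minimal sketch of what that reindex could look like, written as the JSON bodies you would send for `PUT /<new-index>` and `POST /_reindex`. The index names `seed-index` and `seed-index-v2` are hypothetical, and one replica is an assumption based on the cluster having two nodes. The `split_targets` helper reflects my understanding that a split target must be a multiple of the current shard count and a factor of number_of_routing_shards; check that against the docs for your version.

```python
import json
from typing import Dict, List


def create_index_body(primaries: int, routing_shards: int, replicas: int = 1) -> Dict:
    """Settings body for creating the new, sharded index."""
    if routing_shards % primaries != 0:
        raise ValueError("number_of_routing_shards must be a multiple of number_of_shards")
    return {
        "settings": {
            "index": {
                "number_of_shards": primaries,
                "number_of_routing_shards": routing_shards,
                "number_of_replicas": replicas,
            }
        }
    }


def reindex_body(source: str, dest: str) -> Dict:
    """Body for POST /_reindex, copying documents from source to dest."""
    return {"source": {"index": source}, "dest": {"index": dest}}


def split_targets(primaries: int, routing_shards: int) -> List[int]:
    """Shard counts the index could later be split into (assumed rule, see lead-in)."""
    return [
        n
        for n in range(primaries * 2, routing_shards + 1, primaries)
        if routing_shards % n == 0
    ]


# Hypothetical index names; 30 and 120 are the values suggested above.
print(json.dumps(create_index_body(30, 120), indent=2))
print(json.dumps(reindex_body("seed-index", "seed-index-v2")))
print(split_targets(30, 120))
```

With 30 primaries and 120 routing shards, the helper reports 60 and 120 as the possible later split targets.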
Will post in a minute. Just to play with some settings, I am trying to reindex and change the default to 5 shards and 2 replicas. I have 2 physical nodes. I want to see if the performance changes once the index is sharded. I understand that 1TB is fairly large and has to be reindexed.
The split is for the really large index only. You want to avoid having lots of really small indices so try to keep the shard size above 10GB for time-based indices.
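To put numbers on that, here is a quick back-of-the-envelope calculation using the sizes mentioned in this thread (894.1GB for the large index, roughly 10GB for the others). The helper names are made up for the example, and the 10GB floor is just the guideline from the post above.

```python
def shard_size_gb(index_size_gb: float, primaries: int) -> float:
    """Average primary shard size if data is spread evenly across shards."""
    return index_size_gb / primaries


def max_primaries(index_size_gb: float, min_shard_gb: float = 10.0) -> int:
    """Largest primary count that keeps each shard at or above min_shard_gb."""
    return max(1, int(index_size_gb // min_shard_gb))


# Large index: 30 primaries gives shards of roughly 30GB each.
print(round(shard_size_gb(894.1, 30), 1))

# A 10GB index with 5 primaries ends up with 2GB shards, below the 10GB guideline.
print(shard_size_gb(10, 5))

# A single primary keeps a 10GB index at the guideline.
print(max_primaries(10))
```

So 5 shards makes sense to try on the big index, but for the ~10GB indices a single primary shard is enough.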