Elastic and Kibana Load Balancing

I have 2 physical machines that I run Elasticsearch and Kibana on:

120 - Master Node and a Data Node
121 - Data Node

Each machine has 8 cores, 64GB of RAM, and 4TB of hard drive space.

The cluster is set up and the indices have replicated to both 120 and 121.

My issue is with Kibana. Should Kibana run on both 120 and 121, or on just one of them? I also get 30000ms timeout errors because I have approximately 3 billion transactions in Elasticsearch that people run searches and dashboards against. Quite often, especially when 2 or more users are doing something at the same time, the system times out. How do I solve this?

I'd run it on 121, as you have the master using memory on 120

What size is your heap?

Heap is set at 28GB. Also note that data ONLY gets added to Elastic once a day! About 10 million records inserted.

So you're mainly an RO use case - merge your segments to a reasonable degree after each load. Merging down to a single segment will probably take too long with 64GB RAM and an HDD.
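For example, something along these lines after the daily load (the index name is just a placeholder, and max_num_segments is a value to tune - pick one that finishes within your load window rather than forcing a single segment):

```
# Merge the freshly loaded index down to at most 10 segments per shard
POST /my-index/_forcemerge?max_num_segments=10
```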

You could increase the search timeout (if acceptable)
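If the errors you see are Kibana's "Request Timeout after 30000ms", that limit is Kibana's elasticsearch.requestTimeout setting (default 30000 ms), so a minimal sketch of this option is raising it in kibana.yml (the value is just an example):

```
# kibana.yml - give long-running dashboard queries more than the default 30 s
elasticsearch.requestTimeout: 120000
```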

If not, then you need to either optimise your mappings/indices or add more hardware

How large are your indices? How many indices and shards do you have in the cluster?

What's an RO? And I do have SSDs, sorry for not clarifying that earlier.

My seed index is about 800GB. It's the largest one and has 1.1 billion transactions. The rest hold around 10 million records each per day and are about 10GB in size.

At the moment I only have 2 indices in total. The shards setting is at the default, so I'm not entirely sure.

Can you provide the output of the _cat/indices API? It sounds like you have quite large shards, which can impact query performance.

What type of queries are you running? What does CPU usage, disk I/O and iowait look like while you are querying?
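For reference, something like the following gives that overview from the cluster itself (the column lists are only suggestions; for iowait you would still need to watch the OS, e.g. with iostat, while a query runs):

```
# Index, shard and size overview
GET _cat/indices?v&h=index,pri,rep,docs.count,pri.store.size,store.size

# Per-node heap, CPU and load while queries are running
GET _cat/nodes?v&h=name,heap.percent,cpu,load_1m,load_5m,load_15m
```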

Please don't post images of text as they are hardly readable and not searchable.

Instead, paste the text and format it with the </> icon. Check the preview window.

I can see that you have one index with a single primary shard of 894.1GB; the replica is the same size. This is likely going to be very slow to query, as each query is single-threaded against each shard. I would recommend you reindex this into a new index with e.g. 30 primary shards. When you do so, I would also recommend setting number_of_routing_shards to e.g. 120 so that you can later use the split index API if needed. This assumes you are on a reasonably recent version.
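A rough sketch of that, with placeholder index names (you would also copy your existing mappings into the new index before starting the reindex):

```
# New index with 30 primaries; number_of_routing_shards allows later splits
PUT /seed-index-v2
{
  "settings": {
    "number_of_shards": 30,
    "number_of_routing_shards": 120,
    "number_of_replicas": 1
  }
}

# Copy the data across as a background task
POST _reindex?wait_for_completion=false
{
  "source": { "index": "seed-index" },
  "dest": { "index": "seed-index-v2" }
}
```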

Will post in a minute. Just to play with some settings, I am trying to reindex and change the default to 5 shards and 2 replicas. I have 2 physical nodes and want to see if the performance changes once the index is sharded. I understand that 1TB is fairly large and has to be reindexed.

We are on the latest and greatest 7.2!

Then you MAY be able to use the split index API without reindexing.
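With an index created on 7.x the flow is roughly this (index names are placeholders; the source has to be made read-only first, and with the default number_of_routing_shards you can only split by factors of 2, so e.g. 32 rather than exactly 30):

```
# Block writes on the source index so it can be split
PUT /seed-index/_settings
{
  "settings": {
    "index.blocks.write": true
  }
}

# Split the single primary shard into 32
POST /seed-index/_split/seed-index-split
{
  "settings": {
    "index.number_of_shards": 32
  }
}
```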

Wow 30 primaries?

Yes, that will get you down to 30GB per shard. You could go a bit bigger, but that will allow you to grow for a while.

So you would just change "number_of_shards" : 50 and "number_of_replicas" : 2 (is 2 enough?)

We split daily indexes into smaller bits and they are generally 10GB per day, so 30 or 50 shards would mean ~500MB shards - would that be an issue?

First check what the index settings of the large index are.
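i.e. something like this (index name is a placeholder), then look at number_of_shards, number_of_replicas and, if set, number_of_routing_shards in the output:

```
GET /seed-index/_settings
```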

The split is for the really large index only. You want to avoid having lots of really small indices, so try to keep the shard size above 10GB for time-based indices.
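If it helps, a sketch of keeping the daily indices at a single primary shard via a legacy index template (template name and pattern are just examples):

```
# Keep the ~10GB daily indices at one primary shard each
PUT _template/daily-transactions
{
  "index_patterns": ["daily-transactions-*"],
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}
```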