Routing performance tuning

I have a 10-node Elasticsearch cluster plus 3 dedicated masters, running 2.0 in a virtual environment (8 CPU, 64 GB memory). We have about a billion records in the cluster. We load in the low hundreds of millions of records per day and keep about 5 days of data. Each day is its own index with about 250 million records, and we create an alias over the last 5 days. We have about 4 million unique groups (group_id) across this 5-day set, which we use to route records. 90% of our queries use the group_id, so it's a good routing key for us. The problem is that our queries still take about 4-5 seconds to run, even though the average result set is only about 250 records. We have 100 shards per index, so the 5-day alias covers 500 shards.
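To make that concrete, here's roughly what the daily index, routing, and alias setup looks like (the index, alias, type, and field names here are just placeholders; the shard count and routing-required mapping mirror what's described above):

```
# Daily index with 100 primary shards; routing is required so every
# index/get/search call must carry a routing value (the group_id).
curl -XPUT 'localhost:9200/records-2016.01.05' -d '{
  "settings": { "number_of_shards": 100 },
  "mappings": {
    "record": { "_routing": { "required": true } }
  }
}'

# Index a record routed by its group_id so it lands on a single shard.
curl -XPUT 'localhost:9200/records-2016.01.05/record/1?routing=GROUP_42' -d '{
  "group_id": "GROUP_42",
  "field1": "..."
}'

# Keep a rolling alias over the last 5 daily indices.
curl -XPOST 'localhost:9200/_aliases' -d '{
  "actions": [
    { "add":    { "index": "records-2016.01.05", "alias": "records-last-5" } },
    { "remove": { "index": "records-2015.12.31", "alias": "records-last-5" } }
  ]
}'
```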

I was thinking about increasing the shard count to 200 per index so that each shard is cut in half, on the theory that, combined with routing, smaller shards might reduce response time. We also have 10 additional nodes that we plan to add to the cluster, which should help.

However, I just wanted to check on the best design for what we're doing. Is there a way to "tune" the shard routing table? Could there be contention here, or is it more likely that, with 250 million records per index, the data volume itself is the problem? It seems that if we route successfully and go to a single shard, the query shouldn't take 5 seconds to return. Would it make sense to increase the shard count to something like 500 per index, on the assumption that most/all queries are routed, so that each shard is a lot smaller? The risk is that queries without a route could really pound the cluster because of the sheer number of shards.
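For what it's worth, the way I understand routing: Elasticsearch picks the shard as hash(routing) % number_of_primary_shards, so a routed search against the 5-day alias should only touch one shard per daily index, i.e. 5 of the 500 shards. Something like this (placeholder names again), and the search-shards API can confirm which shards a routed request would actually hit:

```
# Routed search: only the shard that hash(GROUP_42) maps to in each of
# the 5 indices behind the alias gets queried.
curl -XGET 'localhost:9200/records-last-5/_search?routing=GROUP_42' -d '{
  "query": { "term": { "group_id": "GROUP_42" } }
}'

# Show exactly which shards that routing value resolves to.
curl -XGET 'localhost:9200/records-last-5/_search_shards?routing=GROUP_42'
```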

Any help/thoughts would be appreciated.

This sounds a bit excessive. How many GB is a single shard?

Can you share typical queries, and try to get the node hot threads while you are running queries, to get an idea of where the bottleneck is? Nodes hot threads API | Elasticsearch Guide [8.11] | Elastic
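For example, something along these lines while the slow queries are in flight (the node name is just an example):

```
# Hot threads across all nodes while a slow query is running.
curl -XGET 'localhost:9200/_nodes/hot_threads'

# Or a single node, sampling a few threads with a short interval.
curl -XGET 'localhost:9200/_nodes/node-1/hot_threads?threads=5&interval=500ms'
```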

Adrien is being nice here; that's outright crazy excessive.

Thanks guys. When you say excessive, you're talking about the total number of shards?
We're trying to find the sweet spot.

The thinking here is that currently 100% of our queries have a route associated with them (the group ID is required), so we always know which shard to go to. So why not have lots of shards, so that when a query is directed to its shard, that shard is smaller and the response time is better? I think of it in database terms as basically partitioning the index.

Is that not good thinking here? I understand that other queries that don't use a route might have performance/contention issues, but right now we don't have those requirements yet.

The shard size is only about 2 GB apiece. I'm going to try to find the sweet spot, but if you were setting this up knowing that ALL queries would have a route, how many shards would you start with? (20-node cluster, 2 GB shard size based on 100 shards, 1 billion total records, 5 million unique group IDs.)
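(The shard sizes come from the cat shards API, e.g. for one daily index; the index name is a placeholder:)

```
# Per-shard store size for one daily index.
curl -XGET 'localhost:9200/_cat/shards/records-2016.01.05?v&h=index,shard,prirep,store,node'
```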

I'd have 20 shards, one per node.
Shard actions are multithreaded after all.
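A sketch of that suggestion for the next daily index (names are placeholders; at the sizes quoted above, 20 shards would put each one at roughly 10 GB instead of 2 GB):

```
# 20 primary shards per daily index -- one primary per data node on a
# 20-node cluster -- with routing still required for every request.
curl -XPUT 'localhost:9200/records-2016.01.06' -d '{
  "settings": { "number_of_shards": 20 },
  "mappings": {
    "record": { "_routing": { "required": true } }
  }
}'
```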