Field Cache or best approach for selecting on a single value

I have a 10-node Elasticsearch cluster with about 150 million records per day going into a daily index. I use an alias to point to the last 2 weeks of data (a couple of weeks' worth is close to 2 billion records). Each index has 10 shards, and each server (virtualized) has 64 GB of memory and 8 CPUs.

I have parallel loaders running on a separate server that make bulk-load connections to all 10 servers. I also have 3 dedicated master nodes, but I'm not really sure how to make use of them.

Each record has about 60 fields, or approximately 1.5 KB per record.

The data has a groupUUID field that typically pulls back a few hundred records per group, and I need to query across the full 2 weeks of data.

90% of my queries are simply the equivalent of

SELECT * FROM table WHERE groupUUID = 'xyz';

I'm finding that each of my queries takes about 20 seconds to run.

Is there a performance change I could make that would help, given that my basic query is an exact match on a groupUUID? Can I cache the groupUUID field/index? How would I do this?

Currently, I have set all my fields to "not_analyzed" because I don't really need full-text search or anything. My query basically looks like this:

{
  "query": {
    "filtered": {
      "filter": {
        "term": { "groupUUID": "xyz" }
      }
    }
  }
}

I'm hoping to get "sub-second" response times. Any advice on how to tune/cache my cluster?

Approximately how many unique groupUUIDs do you have in the cluster? If it is a large number, you might benefit from using the groupUUID as a routing key when indexing and querying, so that a minimal number of shards needs to be queried for the type of query you described. A routing key ensures that all records belonging to a single groupUUID land on the same shard within the index. If you have few groupUUIDs, or an uneven distribution of documents per groupUUID, you can end up with unbalanced shards, which is generally bad. Note that you can still run queries against all shards without specifying a routing key.
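As a rough sketch of what that looks like (the index name, type name, and document below are made up for illustration), the routing value goes on both the index and search requests:

# index a record using its groupUUID as the routing key
PUT /events-2016.05.01/event/1?routing=xyz
{ "groupUUID": "xyz" }

# search with the same routing value, so only the shard holding "xyz" is queried
GET /events-2016.05.01/event/_search?routing=xyz
{
  "query": {
    "filtered": {
      "filter": { "term": { "groupUUID": "xyz" } }
    }
  }
}

With 10 shards per index, that turns each query from a 10-shard fan-out into a single-shard lookup per daily index.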

You have immutable data, so the request cache could help.
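If you are on 2.x, here is a sketch of how to turn it on (the index name is made up; note that by default the shard request cache only caches size=0 requests, i.e. aggregation-style queries that return no hits):

# enable the shard request cache on an existing index
PUT /events-2016.05.01/_settings
{ "index.requests.cache.enable": true }

# or opt in on a per-request basis
GET /events-2016.05.01/_search?request_cache=true
{
  "size": 0,
  "query": {
    "filtered": {
      "filter": { "term": { "groupUUID": "xyz" } }
    }
  }
}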

Like @Christian_Dahlqvist says, routing could be a thing.

How much data are you pulling back at a time when you do this? Pulling back lots of _source documents is slow because they are stored in compressed chunks, and you have to decompress a chunk to get at the documents inside it. If you are pulling back a lot of data at once, the best thing is usually to rethink what you want as an aggregation: set size=0 so no actual documents come back, skipping the chunked, compressed storage entirely. Aggregations use column-wise storage instead.
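For instance, a sketch of the size=0 pattern (the alias name and the timestamp field are assumptions, not from your mapping):

GET /last2weeks/_search
{
  "size": 0,
  "query": {
    "filtered": {
      "filter": { "term": { "groupUUID": "xyz" } }
    }
  },
  "aggs": {
    "per_day": {
      "date_histogram": { "field": "timestamp", "interval": "day" }
    }
  }
}

Because size is 0, no _source documents are fetched and the compressed stored-field chunks are never touched.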

Thanks for the response. The average number of records coming back is a few hundred. For the couple of billion records we will probably have, there could be maybe 5-6 million unique groupUUIDs.

A few hundred per response means stored-field loading is probably not a huge problem. It is probably worth looking at hot threads while running the query, or at the profile API if your version has it (I forget, it might only be in the 5.0 alphas).
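Both are easy to try (the alias name below is an assumption; profile output is verbose, so run it on one representative query):

# capture hot threads on all nodes while the slow query is running
GET /_nodes/hot_threads

# have Elasticsearch profile the query, if your version supports it
GET /last2weeks/_search
{
  "profile": true,
  "query": {
    "filtered": {
      "filter": { "term": { "groupUUID": "xyz" } }
    }
  }
}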