Architecting a cluster for fast searching


I am trying to architect a cluster optimized for search speed. Right now we have three dedicated data nodes on bare metal, each with 64 cores, 755GB RAM, and 15TB SSDs, along with three dedicated master nodes on VMs (4 cores, 32GB RAM each). We index ~1600GB of data into a single index with 6 shards and 1 replica (~130GB per shard). We are willing to compromise indexing speed to maximize search speed.
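
For reference, here is the arithmetic behind the per-shard figure as a small Python sketch. It assumes the ~1600GB is the total on-disk size including replicas (which is what makes ~130GB per shard work out) and that shard copies are spread evenly across the data nodes. The function name `shard_layout` is just for illustration.

```python
def shard_layout(total_gb, primary_shards, replicas, data_nodes):
    """Return (total shard copies, GB per shard copy, shard copies per node).

    Assumes total_gb already includes replica copies and that shard
    copies are balanced evenly across the data nodes.
    """
    shard_copies = primary_shards * (1 + replicas)
    gb_per_shard = total_gb / shard_copies
    copies_per_node = shard_copies / data_nodes
    return shard_copies, gb_per_shard, copies_per_node


copies, gb_per_shard, per_node = shard_layout(
    total_gb=1600, primary_shards=6, replicas=1, data_nodes=3
)
print(f"{copies} shard copies, ~{gb_per_shard:.0f}GB each, {per_node:.0f} per data node")
```

So each 64-core data node holds only about 4 shard copies of this index, if the even-distribution assumption holds.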

We are currently seeing response times up to 5000ms at the 99th percentile on our queries. We are wondering if there are any immediate red flags in our architecture. Please let me know if more information is required to give proper advice.

Would it be best to eliminate the dedicated master nodes on VMs and just run the 3 bare metal servers as combined master/data nodes?

How can we maximize CPU and memory usage on these massive data nodes? Per the Elasticsearch documentation, we are only allocating 31GB of heap to Elasticsearch. Our telemetry indicates that the data nodes are not coming close to being maxed out on CPU, memory, or I/O.
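
To put rough numbers on the memory question, here is a back-of-the-envelope Python sketch. It assumes that all RAM outside the 31GB heap is available to the OS filesystem cache (ignoring the OS itself and any other processes) and that the ~1600GB of index data is spread evenly over the three data nodes. The helper name `memory_split` is only illustrative.

```python
def memory_split(ram_gb, heap_gb, total_index_gb, data_nodes):
    """Return (GB left for filesystem cache, index GB per node, fits in cache?).

    Simplification: treats all non-heap RAM as filesystem cache and
    assumes index data is balanced evenly across data nodes.
    """
    page_cache_gb = ram_gb - heap_gb
    index_per_node_gb = total_index_gb / data_nodes
    fits = index_per_node_gb <= page_cache_gb
    return page_cache_gb, index_per_node_gb, fits


cache_gb, index_gb, fits = memory_split(
    ram_gb=755, heap_gb=31, total_index_gb=1600, data_nodes=3
)
print(f"~{cache_gb}GB for filesystem cache vs ~{index_gb:.0f}GB of index per node; fits: {fits}")
```

Under those assumptions, each node's share of the index should fit in the filesystem cache with room to spare, which may be part of why our telemetry shows so little disk I/O.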

Any and all advice is highly appreciated!