Slow searches on a cluster

So I'm building a benchmark for performance tests. I have a cluster of 4 machines, all virtual, each with 64GB RAM and 1.5TB of disk. Only one of those 4 machines is a master node.
I indexed 2 billion records (each record is 1KB) into an index with 20 shards and 1 replica (so I now have 40 shards);
each shard is 31.1GB.
The records I indexed have many fields; two of them are id and name.
Now I'm trying to aggregate on id and sub-aggregate on the name field.
This query takes 30 seconds, and in Kibana, on the nodes screen, I see that 2 nodes are at 0% CPU usage and the other nodes are at 10-15%.
It seems strange to me that the query takes so much time while the machines aren't working very hard.
I guess I could add a 5th machine and get better performance, but I think that with the current hardware I should also be able to get better search performance (a single term query takes 10 seconds).

Am I doing something wrong? Or is it an Elastic limitation?

Are you saying a single term query without the aggregation takes 10 seconds, or that a single term query including the aggregation takes 10 seconds?

Can you share the query you're sending? Or one that recreates the problem?

Also, how much RAM is going to the JVM heap? Is enough being left for the OS's filesystem cache?
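
If you want to double-check, the nodes stats API should show heap usage and OS memory on each node, e.g.:

GET _nodes/stats/jvm,os

and something like GET _cat/nodes?v&h=name,heap.percent,ram.percent gives a quick per-node overview (column names from memory, so verify against the 2.x cat docs).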

Hey,

Each machine has 30GB of RAM for ES_HEAP_SIZE, and the rest is left for the OS.

Here is an example of a query without aggregation:

GET records/_search
{
  "size": 1000,
  "query": {
    "filtered": {
      "filter": {
        "term": {
          "phoneNumber": "05801001590"
        }
      }
    }
  }
}

phoneNumber is of type string; this query took 6000 ms.
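
For reference, the mapping for that field can be pulled with the request below, in case it matters whether the string is analyzed or not_analyzed (an exact term lookup like this would normally expect a not_analyzed field):

GET records/_mapping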

This is the query with the two aggregations:

GET records/_search
{
  "aggs": {
    "by_id": {
      "terms": {
        "field": "entityId",
        "size": 5,
        "order": {
          "_count": "desc"
        }
      },
      "aggs": {
        "by_name": {
          "terms": {
            "field": "name",
            "size": 1
          }
        }
      }
    }
  }
}

This query took 30000 ms.
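
If it's relevant, the shard distribution across the 4 nodes (which might explain the 2 idle ones) can be checked with something like the cat shards API:

GET _cat/shards/records?v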

Each record has an ID, name, phone and some other fields (this is version 2.3.1, so by default all fields use doc_values).
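
Assuming entityId and name are not_analyzed strings (doc_values are on by default for those in 2.x, but not for analyzed strings), the fielddata cat API should report close to zero heap used for them; large numbers there would mean the aggregation is falling back to fielddata. Something like this, with the fields parameter from memory, so double-check the 2.x docs:

GET _cat/fielddata?v&fields=entityId,name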