I have a two-node cluster hosted on Elastic Cloud.
| Setting | Value |
|----------|----------------------|
| Host | Elastic Cloud |
| Platform | Google Cloud |
| Region | US Central 1 (Iowa) |
| Memory | 8 GB |
| Storage | 192 GB |
| SSD | Yes |
| HA | Yes |
Each node has:
| Setting | Value |
|----------------------|--------|
| Allocated processors | 2 |
| Number of processors | 2 |
| Number of indices | 4* |
| Shards per index | 5* |
| Number of replicas | 1 |
| Number of documents | 150M |
| Allocated disk | 150 GB |

\* Main indices only; Kibana and Watcher also create a number of small indices.
My documents are mostly text. There are a few other fields (no more than 5 per index) and no nested objects. Index specs:
| Index | Avg Doc Length | # Docs | Disk |
|---------|----------------|--------|------|
| index-1 | 300 | 80M | 70GB |
| index-2 | 500 | 5M | 5GB |
| index-3 | 3000 | 2M | 10GB |
| index-4 | 2500 | 18M | 54GB |
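
For reference, the shard layout implied by these numbers can be worked out with a few lines of Python. This is only back-of-the-envelope arithmetic, not something read from the cluster: it assumes the Disk column counts primary data only, that shards are spread evenly over both nodes, and it ignores the small Kibana and Watcher indices.

```python
# Figures copied from the table above: (docs in millions, disk in GB).
INDICES = {
    "index-1": (80, 70),
    "index-2": (5, 5),
    "index-3": (2, 10),
    "index-4": (18, 54),
}
PRIMARY_SHARDS_PER_INDEX = 5
REPLICAS = 1
NODES = 2


def shard_layout():
    """Count primary shards, total shard copies, and copies per node."""
    primaries = len(INDICES) * PRIMARY_SHARDS_PER_INDEX
    total = primaries * (1 + REPLICAS)
    # Assumption: shard copies are balanced evenly across the nodes.
    return {
        "primaries": primaries,
        "total_shards": total,
        "shards_per_node": total // NODES,
    }


def gb_per_primary_shard():
    """Average primary shard size per index, in GB."""
    # Assumption: the Disk column covers primary data only.
    return {
        name: disk_gb / PRIMARY_SHARDS_PER_INDEX
        for name, (_, disk_gb) in INDICES.items()
    }


if __name__ == "__main__":
    print(shard_layout())
    print(gb_per_primary_shard())
```

Under those assumptions, a search that targets all four main indices has to visit 20 shards per request.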
When the system is idle, response time (load time) is typically a few seconds. But when I simulate 10 concurrent users, I start to get timeouts in my application. The timeout was originally 10 s; I raised it to 60 s and I am still having issues. Below is a chart from a simulation of 10 concurrent users using the Search API.
The red line is the total request time in seconds and the dotted pink line is my 60-second timeout. So I'd say my users will hit a timeout most of the time. The query I used is quite simple:
{
  "size": 500,
  "from": ${FROM},
  "query": {
    "query_string": {
      "query": "good OR bad"
    }
  }
}
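
To make the test setup concrete, here is a minimal Python sketch of a simulation of this kind. It is not my actual test harness: `simulate`, `simulate_user`, and `fake_send` are hypothetical names, the way `${FROM}` advances (one page of 500 per request) is an assumption, and `fake_send` is a stand-in that only sleeps. To hit a real cluster it would have to be replaced by a function that POSTs the body to the index's `_search` endpoint.

```python
import json
import time
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 500  # "size" from the query above


def build_query(offset):
    """Return the search body with ${FROM} replaced by `offset`."""
    return {
        "size": PAGE_SIZE,
        "from": offset,
        "query": {"query_string": {"query": "good OR bad"}},
    }


def simulate_user(send, pages):
    """Request `pages` pages one after another; return each duration in seconds."""
    durations = []
    for page in range(pages):
        # Assumption: each request advances FROM by one full page.
        body = json.dumps(build_query(page * PAGE_SIZE))
        start = time.perf_counter()
        send(body)
        durations.append(time.perf_counter() - start)
    return durations


def simulate(send, users=10, pages=3):
    """Run `users` simulated users concurrently, one thread per user."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        futures = [pool.submit(simulate_user, send, pages) for _ in range(users)]
        return [future.result() for future in futures]


def fake_send(body):
    """Hypothetical stand-in for the HTTP call to the Search API."""
    time.sleep(0.01)


if __name__ == "__main__":
    results = simulate(fake_send)
    slowest = max(max(durations) for durations in results)
    print(f"slowest request: {slowest:.3f}s")
```

Passing the sender in as a parameter keeps the timing logic independent of whichever HTTP client is used.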
I've tried every tweak I know of. I don't know whether this is simply the performance Elasticsearch delivers on this setup, in which case I would have to accept it and upgrade my plan.