I am using Elasticsearch 6.1.1 and creating a single shard per index. My cluster settings are as follows -
    {
        "persistent": {
            "cluster": {
                "routing": {
                    "allocation": {
                        "enable": "all"
                    }
                }
            },
            "indices": {
                "breaker": {
                    "request": {
                        "limit": "80%"
                    }
                }
            }
        },
        "transient": {}
    }
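For completeness, the request breaker limit was raised to 80% through the cluster settings API, roughly like this (a sketch reconstructed from the persistent settings above, not the exact request I issued):

    PUT _cluster/settings
    {
        "persistent": {
            "indices.breaker.request.limit": "80%"
        }
    }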
System RAM : 16 GB
JVM heap size : 4 GB
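The heap size is set in config/jvm.options in the usual way for Elasticsearch 6.x:

    -Xms4g
    -Xmx4g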
The reason for using one shard per index is to avoid the approximate results that Elasticsearch can return in certain cases when an aggregation is spread across multiple shards.
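To illustrate what I mean by approximation: on a multi-shard index, a terms aggregation can report approximate doc counts, which Elasticsearch surfaces via doc_count_error_upper_bound. A minimal sketch (the field name is just a placeholder, not my real mapping):

    GET atcc_summary_201707_5/_search
    {
        "size": 0,
        "aggs": {
            "by_type": {
                "terms": {
                    "field": "vehicle_type",
                    "size": 10,
                    "show_term_doc_count_error": true
                }
            }
        }
    }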
I have an index 'atcc_summary_201707_5' containing nearly 0.14 million documents (some with nested fields), about 250 MB in size. I am trying to run an aggregation query over a subset of those documents. The query involves nested bucketing (up to 3 levels) and some metric aggregations; a sketch of its shape is shown below the error. Every time I run the query, it throws a circuit_breaking_exception with the following reason -
"[parent] Data too large, data for [<agg [count_2wh]>] would be [2982073327/2.7gb], which is larger than the limit of [2982071500/2.7gb]"
I'm stuck here, as the whole point of using Elasticsearch was to be able to query a huge data set quickly. Please shed some light on why it is consuming so much memory. Is it because of having only one shard per index? Please advise how to get around this.