How many shards can help deliver the best performance?

Hi All,
Here is my environment:
CPU: 8 cores
Memory: 16 GB
OS: CentOS 7
Elasticsearch 2.1, single node

I ran the same query under the following conditions:
40W documents in 1 shard: the search takes 3 seconds.
90W documents in 5 shards: the search takes 3 seconds.
90W documents in 10 shards: the search takes 3 seconds.

I also tried 20 shards.
As the number of shards increases, the search time remains unchanged.

This is my query:

curl -XPOST "127.0.0.1:9200/sflow_1452009600/sflow/_search?pretty" -d '
{
    "size": 0,
    "query": {
        "filtered": {
            "filter": {
                "range": { "@timestamp": { "gte": 1450256300, "lte": 1464258400 } }
            }
        }
    },
    "aggs": {
        "SRC_DST": {
            "terms": {
                "script": "[doc.SRC_IP.value, doc.DST_IP.value].join(\"-\")",
                "size": 2,
                "shard_size": 0,
                "order": { "sum_bits": "desc" }
            },
            "aggs": { "sum_bits": { "sum": { "field": "BYTES" } } }
        }
    }
}'
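
For anyone reading along, here is a rough plain-Python sketch of what the query above computes: keep documents whose @timestamp falls in the range, build a "SRC-DST" key per document, sum BYTES per key, and return the top 2 keys by that sum. The function name top_pairs and the sample documents are made up for illustration; this only mimics the semantics and says nothing about how Elasticsearch executes it.

```python
from collections import defaultdict


def top_pairs(docs, gte, lte, size=2):
    """Emulate the query: range filter, group by "SRC-DST", sum BYTES."""
    totals = defaultdict(int)
    for doc in docs:
        # Range filter on @timestamp (both bounds inclusive).
        if gte <= doc["@timestamp"] <= lte:
            # Same key the terms script builds for every document.
            key = "-".join([doc["SRC_IP"], doc["DST_IP"]])
            totals[key] += doc["BYTES"]
    # Order buckets by the summed bytes, descending, and keep `size` of them.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:size]


# Invented sample data, not taken from the real index.
sample_docs = [
    {"@timestamp": 1450256400, "SRC_IP": "10.0.0.1", "DST_IP": "10.0.0.2", "BYTES": 500},
    {"@timestamp": 1450256500, "SRC_IP": "10.0.0.1", "DST_IP": "10.0.0.2", "BYTES": 200},
    {"@timestamp": 1450256600, "SRC_IP": "10.0.0.3", "DST_IP": "10.0.0.1", "BYTES": 300},
    {"@timestamp": 1450256700, "SRC_IP": "10.0.0.2", "DST_IP": "10.0.0.3", "BYTES": 100},
    {"@timestamp": 1400000000, "SRC_IP": "10.0.0.2", "DST_IP": "10.0.0.3", "BYTES": 9999},
]

print(top_pairs(sample_docs, 1450256300, 1464258400))
```

The key is built once for every document that passes the filter, which is the per-document work the script adds.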

Has anyone tested with the same configuration?
Is this normal, or do I need to change something?

Thanks,
Judy

What's the W meant to represent?

Also, given that you have a script in your query, it's not going to be super fast.
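
One thing that might be worth trying (a sketch only, not tested against this setup) is to build the source-destination pair once at index time and aggregate on that field instead of running a script per document. The field name SRC_DST and the helpers with_pair_key and build_search_body are hypothetical. In Elasticsearch 2.x the new field would presumably need to be mapped as not_analyzed so the terms aggregation buckets on the whole value.

```python
import json


def with_pair_key(doc):
    """Return a copy of the document with a precomputed "SRC-DST" field."""
    enriched = dict(doc)
    # Hypothetical field name; holds the value the script would otherwise build.
    enriched["SRC_DST"] = "{}-{}".format(doc["SRC_IP"], doc["DST_IP"])
    return enriched


def build_search_body(gte, lte, size=2):
    """Same request as the original query, but with "field" instead of "script"."""
    return {
        "size": 0,
        "query": {
            "filtered": {
                "filter": {"range": {"@timestamp": {"gte": gte, "lte": lte}}}
            }
        },
        "aggs": {
            "SRC_DST": {
                "terms": {
                    "field": "SRC_DST",  # precomputed at index time (assumption)
                    "size": size,
                    "shard_size": 0,
                    "order": {"sum_bits": "desc"},
                },
                "aggs": {"sum_bits": {"sum": {"field": "BYTES"}}},
            }
        },
    }


doc = {"@timestamp": 1450256400, "SRC_IP": "10.0.0.1", "DST_IP": "10.0.0.2", "BYTES": 500}
print(json.dumps(with_pair_key(doc), sort_keys=True))
print(json.dumps(build_search_body(1450256300, 1464258400), indent=2))
```

The printed body can be sent with the same curl command as in the question. Comparing the "took" value in the two responses would show how much of the 3 seconds comes from the script.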

40W means 400,000 (W stands for 10,000).