High CPU usage

I've set up a Kibana visualization starting from here:

I replaced the sample Apache web log data with a month's worth of my own live data and tweaked the Logstash ingest pipeline to handle a customized log format. When I fired up Kibana, 30 seconds wasn't enough time for the visualization to load, so I increased the timeout to 60 seconds. Now the visualization usually loads, but it takes roughly 45 seconds to do so.
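
For reference, the timeout I raised is, I believe, the Elasticsearch request timeout in kibana.yml, which for Kibana 4.5 should look something like this:

# kibana.yml (Kibana 4.5): wait up to 60s for Elasticsearch responses instead of the 30000ms default
elasticsearch.requestTimeout: 60000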

During that load time, top shows the CPU pegged at nearly 200% on the dual-core CPU.

I've run some curl commands to capture hot threads during the Kibana load but don't know what to look for in the output:

curl 'localhost:9200/_nodes/hot_threads' | grep "cpu usage"
59.6% (297.8ms out of 500ms) cpu usage by thread 'elasticsearch[Ringmaster][search][T#3]'
45.8% (228.8ms out of 500ms) cpu usage by thread 'elasticsearch[Ringmaster][search][T#4]'
45.5% (227.2ms out of 500ms) cpu usage by thread 'elasticsearch[Ringmaster][search][T#2]'
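
In case it helps, the hot threads API takes a few optional parameters, so I can grab a fuller dump (including stack traces, not just the summary lines I grepped above) with something like:

# sample the ten hottest CPU-bound threads over a 1-second interval and keep the full output
curl 'localhost:9200/_nodes/hot_threads?type=cpu&interval=1s&threads=10' > hot_threads.txt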

Nothing is logged in the ES log. The last time the load timed out, Kibana displayed this:

Error: Request Timeout after 60000ms

at http://.../bundles/kibana.bundle.js?v=9889:76425:16

And, this was logged by Kibana:

{"type":"response","@timestamp":"2016-06-28T21:38:02+00:00","tags":[],"pid":21585,"method":"post","statusCode":504,"req":{"url":"/elasticsearch/_msearch?timeout=0&ignore_unavailable=true&preference=1467149238306","method":"post","headers":{"host":"myhost:5601","accept":"application/json, text/plain, /","origin":"http://...","kbn-version":"4.5.0","user-agent":"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36","content-type":"application/json;charset=UTF-8","referer":"http://.../app/kibana","accept-encoding":"gzip, deflate","accept-language":"en-US,en;q=0.8","x-forwarded-for":"172.16.22.139","x-forwarded-host":"...","x-forwarded-server":"...","connection":"Keep-Alive","content-length":"5312"},"remoteAddress":"198.125.128.129","userAgent":"198.125.128.129","referer":"http://.../app/kibana"},"res":{"statusCode":504,"responseTime":60006,"contentLength":9},"message":"POST /elasticsearch/_msearch?timeout=0&ignore_unavailable=true&preference=1467149238306 504 60006ms - 9.0B"}

I've pulled out a handful of the queries that Kibana runs and executed them individually; each completes in about a second.

I would very much appreciate suggestions on how to debug this performance problem. Thanks.

I figured out how to log slow queries; the settings are sketched below, followed by the four slowest queries, each of which took more than 9 seconds.
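
In case it's useful to anyone else, the slow query log can be enabled per index with dynamic threshold settings along these lines; the index pattern and thresholds below are illustrative rather than exactly what I set:

# dynamic index settings, so no restart is needed; "logstash-*" is a placeholder for my index
curl -XPUT 'localhost:9200/logstash-*/_settings' -d '{
  "index.search.slowlog.threshold.query.warn": "10s",
  "index.search.slowlog.threshold.query.trace": "1s"
}'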

[2016-06-30 14:32:36,271][TRACE][index.search.slowlog.query] took[9.5s], took_millis[9599], types[], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"query":{"filtered":{"query":{"query_string":{"query":"*","analyze_wildcard":true}},"filter":{"bool":{"must":[{"query":{"query_string":{"analyze_wildcard":true,"query":"*"}}},{"range":{"@timestamp":{"gte":1462127545921,"lte":1467311545921,"format":"epoch_millis"}}}],"must_not":[]}}}},"size":0,"aggs":{"2":{"geohash_grid":{"field":"geoip.location","precision":2}}}}], extra_source[], 

[2016-06-30 14:34:13,024][TRACE][index.search.slowlog.query] took[9.3s], took_millis[9333], types[], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"query":{"filtered":{"query":{"query_string":{"query":"*","analyze_wildcard":true}},"filter":{"bool":{"must":[{"query":{"query_string":{"analyze_wildcard":true,"query":"*"}}},{"range":{"@timestamp":{"gte":1462127635928,"lte":1467311635928,"format":"epoch_millis"}}}],"must_not":[]}}}},"size":0,"aggs":{"2":{"cardinality":{"field":"geoip.ip.raw"}}}}], extra_source[],

[2016-06-30 14:34:19,232][TRACE][index.search.slowlog.query] took[9.4s], took_millis[9436], types[], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"query":{"filtered":{"query":{"query_string":{"query":"*","analyze_wildcard":true}},"filter":{"bool":{"must":[{"query":{"query_string":{"analyze_wildcard":true,"query":"*"}}},{"range":{"@timestamp":{"gte":1462127635928,"lte":1467311635928,"format":"epoch_millis"}}}],"must_not":[]}}}},"size":0,"aggs":{"2":{"terms":{"field":"geoip.city_name.raw","size":10,"order":{"1":"desc"}},"aggs":{"1":{"cardinality":{"field":"clientip.raw"}}}}}}], extra_source[],

[2016-06-30 14:34:20,721][TRACE][index.search.slowlog.query] took[9.1s], took_millis[9135], types[], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"query":{"filtered":{"query":{"query_string":{"query":"*","analyze_wildcard":true}},"filter":{"bool":{"must":[{"query":{"query_string":{"analyze_wildcard":true,"query":"*"}}},{"range":{"@timestamp":{"gte":1462127635928,"lte":1467311635928,"format":"epoch_millis"}}}],"must_not":[]}}}},"size":0,"aggs":{"2":{"terms":{"field":"geoip.city_name.raw","size":10,"order":{"1":"desc"}},"aggs":{"1":{"cardinality":{"field":"clientip.raw"}}}}}}], extra_source[],

These slow searches all have aggregations on geoip fields in common.
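
To confirm where the time goes, the geo aggregation can be run on its own; this is a trimmed-down version of the first slow-log entry above, and "logstash-*" stands in for my actual index:

# time just the geohash_grid aggregation; the "took" field in the response shows the server-side cost
curl -XPOST 'localhost:9200/logstash-*/_search?pretty' -d '{
  "size": 0,
  "query": {
    "filtered": {
      "query":  { "query_string": { "query": "*", "analyze_wildcard": true } },
      "filter": { "range": { "@timestamp": { "gte": "now-30d", "lte": "now" } } }
    }
  },
  "aggs": {
    "2": { "geohash_grid": { "field": "geoip.location", "precision": 2 } }
  }
}'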

How can I approach optimizing these queries?

Also, I currently have all 30 days of log data in a single index; each day has about 2 million records. Would it matter if I split the data into 30 daily indices instead?
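
If daily indices are worth trying, I assume the change on my side would be something like this in the Logstash elasticsearch output (a sketch; the host is a placeholder):

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    # write one index per day instead of one big 30-day index
    index => "logstash-%{+YYYY.MM.dd}"
  }
}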

I should note that I have four of these cores:

Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz

Thanks.