Hi,
Thanks for the comprehensive feedback. I'm attacking the problem this way because I've been frustrated going in the other directions.
I'm running two nodes with ~32GB of RAM apiece, with the default heap allocation.
I've used doc_values where possible, although I'll certainly look into updating Logstash. I'm using Amazon Elasticsearch Service (for now), which limits my choice of version and severely restricts what configuration I can change.
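For anyone curious what I mean by "doc_values where possible", this is roughly the shape of it. It's a minimal sketch rather than my actual setup: the endpoint, template name, and field matching are made up, and it assumes a pre-2.x cluster where doc_values aren't enabled by default.

```python
# Sketch: push an index template that turns on doc_values for not_analyzed
# string fields via a dynamic template. Assumes a pre-2.x cluster (where
# doc_values are off by default) and an access policy that allows plain HTTP
# calls; the endpoint and names below are illustrative, not my real ones.
import json
import requests

ES = "https://my-domain.us-east-1.es.amazonaws.com"  # hypothetical Amazon ES endpoint

template = {
    "template": "logstash-*",
    "mappings": {
        "_default_": {
            "dynamic_templates": [
                {
                    "strings_as_doc_values": {
                        "match_mapping_type": "string",
                        "mapping": {
                            "type": "string",
                            "index": "not_analyzed",
                            "doc_values": True,
                        },
                    }
                }
            ]
        }
    },
}

resp = requests.put(
    ES + "/_template/logstash_doc_values",
    data=json.dumps(template),
    headers={"Content-Type": "application/json"},
)
print(resp.status_code, resp.text)
```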
In this topic I lay out some of my difficulty with heap size; the tl;dr is that I think I'm running into a GC problem with the larger full-day indices, i.e. the same number of documents (200M or so, mostly Apache access logs) fails in daily indices where hourly indices are fine.
I'm aware of why the query queue is overflowing; that's why I'm trying to find middle ground. Apart from throwing more nodes at the problem, hourly indices yield 24 x 5 x (days) shards (24 indices a day at the default 5 primary shards each), so a Kibana dashboard with, say, 6 widgets is into thousands of queries right away. (As an aside: Kibana's really not helping by refreshing all the widgets simultaneously.)
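To put rough numbers on that shard math, here's a back-of-the-envelope sketch. It assumes the default 5 primary shards per index and an illustrative 30-day retention window (replicas would double every figure):

```python
# Back-of-the-envelope shard counts for different index granularities,
# assuming the default 5 primary shards per index (replicas would double these).
DAYS = 30            # retention window (illustrative)
SHARDS_PER_INDEX = 5

for label, indices_per_day in [("hourly", 24), ("6-hour", 4), ("4-hour", 6), ("daily", 1)]:
    total = DAYS * indices_per_day * SHARDS_PER_INDEX
    print(f"{label:>7}: {indices_per_day} indices/day -> {total} primary shards over {DAYS} days")

# Output:
#  hourly: 24 indices/day -> 3600 primary shards over 30 days
#  6-hour: 4 indices/day -> 600 primary shards over 30 days
#  4-hour: 6 indices/day -> 900 primary shards over 30 days
#   daily: 1 indices/day -> 150 primary shards over 30 days
```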
Luckily I'm typically only active on one or two indices, and I'm managing by limiting the number of days of data I keep before archiving & deleting. Ideally I'd like to keep the data longer and feed more data into the indices each day. Right now I'm doing ~14-20M docs a day, and I wouldn't mind doubling that, but I've got to get through these growing pains first.
Seems like each time I approach 150M-250M documents, it becomes very hard to convince ES to work well with my limitations.
So, to come full circle: with 6-hour or 4-hour indices I hope to find a middle ground where I still avoid the field data circuit breaker but cut my shard count down by a factor of 6 or 4, respectively.
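In case it's useful, this is the kind of bucketing I have in mind, sketched in Python rather than my actual Logstash config: floor each event's timestamp to a 6-hour boundary and build the index name from that. In practice the equivalent logic would sit in a Logstash filter feeding the elasticsearch output's index setting; the names below are illustrative.

```python
# Rough sketch of 6-hour index naming: floor the event timestamp to a 6-hour
# boundary and derive the index name from it. Prefix and format are illustrative.
from datetime import datetime, timezone

BUCKET_HOURS = 6  # or 4 for 4-hour indices

def bucket_index_name(ts: datetime, prefix: str = "logstash") -> str:
    bucket_hour = (ts.hour // BUCKET_HOURS) * BUCKET_HOURS
    return f"{prefix}-{ts:%Y.%m.%d}.{bucket_hour:02d}"

# Example: an event at 17:42 UTC lands in the 12:00-18:00 bucket.
print(bucket_index_name(datetime(2016, 3, 14, 17, 42, tzinfo=timezone.utc)))
# -> logstash-2016.03.14.12
```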
Thanks,
Jeff