YAOOMP (yet another out-of-memory post) - kibana's histogram

I'm working in a testing environment that currently holds about 3.11TB of data,
spread out over 128 indices (one index per day for four months, each index
about 12-18GB). The cluster has 3 master-only nodes and 6 data-only
nodes. Each index has 5 primary shards and 1 set of replicas, for a total of
10 shards per index.
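
For scale, the shard arithmetic implied by those numbers can be sketched as follows. This is back-of-the-envelope only: the even spread of shards across the data nodes is an assumption, not something measured on the cluster.

```python
# Rough shard math from the figures quoted above.
indices = 128             # one index per day
primaries_per_index = 5
replica_sets = 1          # one full set of replicas
data_nodes = 6

shards_per_index = primaries_per_index * (1 + replica_sets)
total_shards = indices * shards_per_index

# Assumption: shards are balanced evenly across the data-only nodes.
shards_per_node = total_shards / data_nodes

print("shards per index:", shards_per_index)
print("total shards:", total_shards)
print("approx. shards per data node:", round(shards_per_node))
```

A search across all four months touches every one of those shards, so each data node serves a couple of hundred shard-level requests per dashboard panel.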

I'm playing with a lot of settings, trying to tweak GC etc., but
I'm seeing pretty consistently that when loading data in Kibana, I can load
all four months' worth of data if I hide any objects that visualize data
(lists, summaries, etc. all seem to load fine, albeit slowly). But once I
un-hide the histogram, I see lots of 500 errors in the calls to the node,
and occasionally (once, so far) the entire cluster stops responding and
I have to force-kill the java processes and restart elasticsearch. I've also
tried loading only the histogram object, and with smaller chunks of
data, but the behavior remains the same: if the heap on the data nodes is
already loaded up, the calls to draw the histogram mostly fail; they only
succeed when the heap was almost completely empty. Also, if I don't hide the
histogram and have several objects on my Kibana dashboard loading, the
histogram fails and takes every other object on the page down with it, each
complaining about failing to load global facets.

With all the other objects on the Kibana page, the query calls seem to
handle heap memory fine: I see usage fluctuate up and down. But the
histogram ends up loading lots of data into heap and nothing seems to
clear it out. Is there anything I can do to tweak this behavior?

I'm including the curl commands that Kibana generates for a couple of things on
the page. Please call out anything you think is suspicious, or ways to
tweak the servers to handle these kinds of queries more effectively.

Histogram object:

curl -XGET 'http://10.6.13.25:9200/accesslogging_awseast-2014-09-30,accesslogging_awseast-2014-09-29,accesslogging_awseast-2014-09-28,accesslogging_awseast-2014-09-27,accesslogging_awseast-2014-09-26,accesslogging_awseast-2014-09-25,accesslogging_awseast-2014-09-24,accesslogging_awseast-2014-09-23,accesslogging_awseast-2014-09-22,accesslogging_awseast-2014-09-21,accesslogging_awseast-2014-09-20,accesslogging_awseast-2014-09-19,accesslogging_awseast-2014-09-18,accesslogging_awseast-2014-09-17,accesslogging_awseast-2014-09-16,accesslogging_awseast-2014-09-15,accesslogging_awseast-2014-09-14,accesslogging_awseast-2014-09-13,accesslogging_awseast-2014-09-12,accesslogging_awseast-2014-09-11,accesslogging_awseast-2014-09-10,accesslogging_awseast-2014-09-09,accesslogging_awseast-2014-09-08,accesslogging_awseast-2014-09-07,accesslogging_awseast-2014-09-06,accesslogging_awseast-2014-09-05,accesslogging_awseast-2014-09-04,accesslogging_awseast-2014-09-03,accesslogging_awseast-2014-09-02,accesslogging_awseast-2014-09-01,accesslogging_awseast-2014-08-31,accesslogging_awseast-2014-08-30,accesslogging_awseast-2014-08-29,accesslogging_awseast-2014-08-28,accesslogging_awseast-2014-08-27,accesslogging_awseast-2014-08-26,accesslogging_awseast-2014-08-25,accesslogging_awseast-2014-08-24,accesslogging_awseast-2014-08-23,accesslogging_awseast-2014-08-22,accesslogging_awseast-2014-08-21,accesslogging_awseast-2014-08-20,accesslogging_awseast-2014-08-19,accesslogging_awseast-2014-08-18,accesslogging_awseast-2014-08-17,accesslogging_awseast-2014-08-16,accesslogging_awseast-2014-08-15,accesslogging_awseast-2014-08-14,accesslogging_awseast-2014-08-13,accesslogging_awseast-2014-08-12,accesslogging_awseast-2014-08-11,accesslogging_awseast-2014-08-10,accesslogging_awseast-2014-08-09,accesslogging_awseast-2014-08-08,accesslogging_awseast-2014-08-07,accesslogging_awseast-2014-08-06,accesslogging_awseast-2014-08-05,accesslogging_awseast-2014-08-04,accesslogging_awseast-2014-08-03,accesslogging_awseast-2014-08-02,accesslogging_awseast-2014-08-01,accesslogging_awseast-2014-07-31,accesslogging_awseast-2014-07-30,accesslogging_awseast-2014-07-29,accesslogging_awseast-2014-07-28,accesslogging_awseast-2014-07-27,accesslogging_awseast-2014-07-26,accesslogging_awseast-2014-07-25,accesslogging_awseast-2014-07-24,accesslogging_awseast-2014-07-23,accesslogging_awseast-2014-07-22,accesslogging_awseast-2014-07-21,accesslogging_awseast-2014-07-20,accesslogging_awseast-2014-07-19,accesslogging_awseast-2014-07-18,accesslogging_awseast-2014-07-17,accesslogging_awseast-2014-07-16,accesslogging_awseast-2014-07-15,accesslogging_awseast-2014-07-14,accesslogging_awseast-2014-07-13,accesslogging_awseast-2014-07-12,accesslogging_awseast-2014-07-11,accesslogging_awseast-2014-07-10,accesslogging_awseast-2014-07-09,accesslogging_awseast-2014-07-08,accesslogging_awseast-2014-07-07,accesslogging_awseast-2014-07-05,accesslogging_awseast-2014-07-04,accesslogging_awseast-2014-07-03,accesslogging_awseast-2014-07-02,accesslogging_awseast-2014-07-01,accesslogging_awseast-2014-06-30,accesslogging_awseast-2014-06-29,accesslogging_awseast-2014-06-28,accesslogging_awseast-2014-06-27,accesslogging_awseast-2014-06-26,accesslogging_awseast-2014-06-25,accesslogging_awseast-2014-06-24,accesslogging_awseast-2014-06-23,accesslogging_awseast-2014-06-22,accesslogging_awseast-2014-06-21,accesslogging_awseast-2014-06-20,accesslogging_awseast-2014-06-19,accesslogging_awseast-2014-06-18,accesslogging_awseast-2014-06-17,accesslogging_awseast-2014-06-16,accesslogging_awseast-2014-06-15,accesslogging_awseast-2014-06-14,accesslogging_awseast-2014-06-13,accesslogging_awseast-2014-06-12,accesslogging_awseast-2014-06-11,accesslogging_awseast-2014-06-10,accesslogging_awseast-2014-06-09,accesslogging_awseast-2014-06-08,accesslogging_awseast-2014-06-07,accesslogging_awseast-2014-06-06,accesslogging_awseast-2014-06-05,accesslogging_awseast-2014-06-04,accesslogging_awseast-2014-06-03,accesslogging_awseast-2014-06-02,accesslogging_awseast-2014-06-01/_search?pretty' \
-d '{
  "facets": {
    "0": {
      "date_histogram": {
        "field": "timestamp",
        "interval": "1d"
      },
      "global": true,
      "facet_filter": {
        "fquery": {
          "query": {
            "filtered": {
              "query": {
                "query_string": {
                  "query": "-uri:\"*javascript:void(0)/\""
                }
              },
              "filter": {
                "bool": {
                  "must": [
                    {
                      "range": {
                        "timestamp": {
                          "from": 1401657333127,
                          "to": 1412115333128
                        }
                      }
                    },
                    {
                      "terms": {
                        "website": [
                          "www.mustangandfords.com"
                        ]
                      }
                    },
                    {
                      "terms": {
                        "status": [
                          404
                        ]
                      }
                    }
                  ],
                  "must_not": [
                    {
                      "terms": {
                        "user_agent.typePhrase": [
                          "\"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:23.0) Gecko/20130406 Firefox/23.0\""
                        ]
                      }
                    }
                  ]
                }
              }
            }
          }
        }
      }
    }
  },
  "size": 0
}'
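
To make the request above easier to read: the `from`/`to` values are epoch milliseconds, and the long index list is one index name per day in that range. Below is a small sketch that decodes the range and rebuilds the names. It assumes the `accesslogging_awseast-YYYY-MM-DD` pattern seen in the URL and UTC day boundaries; the helper names are made up for illustration and are not part of Kibana.

```python
from datetime import datetime, timedelta, timezone

FROM_MS = 1401657333127   # "from" in the range filter
TO_MS = 1412115333128     # "to" in the range filter
PREFIX = "accesslogging_awseast-"  # index naming pattern taken from the URL


def to_utc(ms):
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def daily_indices(from_ms, to_ms, prefix=PREFIX):
    """List one index name per calendar day (UTC) covered by the range."""
    day = to_utc(from_ms).date()
    last = to_utc(to_ms).date()
    names = []
    while day <= last:
        names.append(prefix + day.isoformat())
        day += timedelta(days=1)
    return names


names = daily_indices(FROM_MS, TO_MS)
print("range:", to_utc(FROM_MS).date(), "to", to_utc(TO_MS).date())
print("indices in range:", len(names))
print("comma-joined length:", len(",".join(names)), "characters")
```

The generated list includes `accesslogging_awseast-2014-07-06`, which the posted URL skips, so the two will not match name for name.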

Data Fields:

curl -XGET 'http://10.6.13.25:9200/accesslogging_awseast-2014-09-30,accesslogging_awseast-2014-09-29,accesslogging_awseast-2014-09-28,accesslogging_awseast-2014-09-27,accesslogging_awseast-2014-09-26,accesslogging_awseast-2014-09-25,accesslogging_awseast-2014-09-24,accesslogging_awseast-2014-09-23,accesslogging_awseast-2014-09-22,accesslogging_awseast-2014-09-21,accesslogging_awseast-2014-09-20,accesslogging_awseast-2014-09-19,accesslogging_awseast-2014-09-18,accesslogging_awseast-2014-09-17,accesslogging_awseast-2014-09-16,accesslogging_awseast-2014-09-15,accesslogging_awseast-2014-09-14,accesslogging_awseast-2014-09-13,accesslogging_awseast-2014-09-12,accesslogging_awseast-2014-09-11,accesslogging_awseast-2014-09-10,accesslogging_awseast-2014-09-09,accesslogging_awseast-2014-09-08,accesslogging_awseast-2014-09-07,accesslogging_awseast-2014-09-06,accesslogging_awseast-2014-09-05,accesslogging_awseast-2014-09-04,accesslogging_awseast-2014-09-03,accesslogging_awseast-2014-09-02,accesslogging_awseast-2014-09-01,accesslogging_awseast-2014-08-31,accesslogging_awseast-2014-08-30,accesslogging_awseast-2014-08-29,accesslogging_awseast-2014-08-28,accesslogging_awseast-2014-08-27,accesslogging_awseast-2014-08-26,accesslogging_awseast-2014-08-25,accesslogging_awseast-2014-08-24,accesslogging_awseast-2014-08-23,accesslogging_awseast-2014-08-22,accesslogging_awseast-2014-08-21,accesslogging_awseast-2014-08-20,accesslogging_awseast-2014-08-19,accesslogging_awseast-2014-08-18,accesslogging_awseast-2014-08-17,accesslogging_awseast-2014-08-16,accesslogging_awseast-2014-08-15,accesslogging_awseast-2014-08-14,accesslogging_awseast-2014-08-13,accesslogging_awseast-2014-08-12,accesslogging_awseast-2014-08-11,accesslogging_awseast-2014-08-10,accesslogging_awseast-2014-08-09,accesslogging_awseast-2014-08-08,accesslogging_awseast-2014-08-07,accesslogging_awseast-2014-08-06,accesslogging_awseast-2014-08-05,accesslogging_awseast-2014-08-04,accesslogging_awseast-2014-08-03,accesslogging_awseast-2014-08-02,accesslogging_awseast-2014-08-01,accesslogging_awseast-2014-07-31,accesslogging_awseast-2014-07-30,accesslogging_awseast-2014-07-29,accesslogging_awseast-2014-07-28,accesslogging_awseast-2014-07-27,accesslogging_awseast-2014-07-26,accesslogging_awseast-2014-07-25,accesslogging_awseast-2014-07-24,accesslogging_awseast-2014-07-23,accesslogging_awseast-2014-07-22,accesslogging_awseast-2014-07-21,accesslogging_awseast-2014-07-20,accesslogging_awseast-2014-07-19,accesslogging_awseast-2014-07-18,accesslogging_awseast-2014-07-17,accesslogging_awseast-2014-07-16,accesslogging_awseast-2014-07-15,accesslogging_awseast-2014-07-14,accesslogging_awseast-2014-07-13,accesslogging_awseast-2014-07-12,accesslogging_awseast-2014-07-11,accesslogging_awseast-2014-07-10,accesslogging_awseast-2014-07-09,accesslogging_awseast-2014-07-08,accesslogging_awseast-2014-07-07,accesslogging_awseast-2014-07-05,accesslogging_awseast-2014-07-04,accesslogging_awseast-2014-07-03,accesslogging_awseast-2014-07-02,accesslogging_awseast-2014-07-01,accesslogging_awseast-2014-06-30,accesslogging_awseast-2014-06-29,accesslogging_awseast-2014-06-28,accesslogging_awseast-2014-06-27,accesslogging_awseast-2014-06-26,accesslogging_awseast-2014-06-25,accesslogging_awseast-2014-06-24,accesslogging_awseast-2014-06-23,accesslogging_awseast-2014-06-22,accesslogging_awseast-2014-06-21,accesslogging_awseast-2014-06-20,accesslogging_awseast-2014-06-19,accesslogging_awseast-2014-06-18,accesslogging_awseast-2014-06-17,accesslogging_awseast-2014-06-16,accesslogging_awseast-2014-06-15,accesslogging_awseast-2014-06-14,accesslogging_awseast-2014-06-13,accesslogging_awseast-2014-06-12,accesslogging_awseast-2014-06-11,accesslogging_awseast-2014-06-10,accesslogging_awseast-2014-06-09,accesslogging_awseast-2014-06-08,accesslogging_awseast-2014-06-07,accesslogging_awseast-2014-06-06,accesslogging_awseast-2014-06-05,accesslogging_awseast-2014-06-04,accesslogging_awseast-2014-06-03,accesslogging_awseast-2014-06-02,accesslogging_awseast-2014-06-01/_search?pretty' \
-d '{
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "should": [
            {
              "query_string": {
                "query": "-uri:\"*javascript:void(0)/\""
              }
            }
          ]
        }
      },
      "filter": {
        "bool": {
          "must": [
            {
              "range": {
                "timestamp": {
                  "from": 1401657333127,
                  "to": 1412115333128
                }
              }
            },
            {
              "terms": {
                "website": [
                  "www.mustangandfords.com"
                ]
              }
            },
            {
              "terms": {
                "status": [
                  404
                ]
              }
            }
          ],
          "must_not": [
            {
              "terms": {
                "user_agent.typePhrase": [
                  "\"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:23.0) Gecko/20130406 Firefox/23.0\""
                ]
              }
            }
          ]
        }
      }
    }
  },
  "highlight": {
    "fields": {},
    "fragment_size": 2147483647,
    "pre_tags": [
      "@start-highlight@"
    ],
    "post_tags": [
      "@end-highlight@"
    ]
  },
  "size": 500,
  "sort": [
    {
      "_score": {
        "order": "desc",
        "ignore_unmapped": true
      }
    }
  ]
}'

Thanks in advance to anyone who can shed some light on this topic! If
you require more information, please let me know. I've tried to include
all the info I can think of, but I'm sure I left something out.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/439c3e15-0a70-474e-a642-3a06a6b8b7c6%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

As a followup, it seems like I can load the histogram without trouble if
heap is relatively empty, and then crash my cluster by trying to load any
other data on the page (the reverse of what I described above). So it seems
to me that one dataset does not push the previous one out of heap.

Running ES 1.2.1, btw.
