Can I print out bitset cache sizes by filter (or what's eating my heap)?

After we started making liberal use of nested properties to simulate (denormalized) many-to-many relations, our ES server began routinely throwing OutOfMemory errors several hours after startup under routine, moderate usage (both upserts and queries). Our app issues a fair number of nested filter queries with inner term filters.

I'm trying to track down the root cause, and I wonder whether bitsets of nested / term filters are being cached and never evicted, eventually causing the OOM. With that in mind, a few questions:

  1. can I print out bitset memory alloc by filter? (else - profiling time)
  2. are cached filter bitsets ever evicted? on what basis?

thanks,

-nikita

Since you started using nested documents, how large are your documents now? I'd recommend taking a look at:

GET /_nodes/stats?human

Look at segments, field data memory, and filter_cache. The filter cache is set to 10% of the heap by default and evicts the least recently used data:

https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-cache.html#filter
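
To make those numbers easier to compare across nodes, the stats response can be reduced to just the memory-related fields. The following is a minimal sketch, not an official client: it assumes a node reachable at http://localhost:9200 and assumes the 1.x-style field names (`filter_cache.memory_size_in_bytes`, `fielddata.memory_size_in_bytes`, `segments.memory_in_bytes`, `segments.fixed_bit_set_memory_in_bytes`). Field names differ between versions, so missing fields simply show up as 0. The function names are made up for illustration.

```python
import json
from urllib.request import urlopen


def summarize_cache_stats(nodes_stats):
    """Reduce a parsed /_nodes/stats response to per-node memory figures (bytes).

    Field names are assumptions based on 1.x-style responses; anything
    missing from the response is reported as 0.
    """
    summary = {}
    for node_id, node in nodes_stats.get("nodes", {}).items():
        indices = node.get("indices", {})
        filter_cache = indices.get("filter_cache", {})
        fielddata = indices.get("fielddata", {})
        segments = indices.get("segments", {})
        summary[node.get("name", node_id)] = {
            "filter_cache_bytes": filter_cache.get("memory_size_in_bytes", 0),
            "filter_cache_evictions": filter_cache.get("evictions", 0),
            "fielddata_bytes": fielddata.get("memory_size_in_bytes", 0),
            "segments_bytes": segments.get("memory_in_bytes", 0),
            "fixed_bit_set_bytes": segments.get("fixed_bit_set_memory_in_bytes", 0),
        }
    return summary


def fetch_nodes_stats(base_url="http://localhost:9200"):
    """Fetch /_nodes/stats from a node; base_url is an assumed local default."""
    with urlopen(base_url + "/_nodes/stats") as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Requires a running node at the assumed address.
    for name, stats in summarize_cache_stats(fetch_nodes_stats()).items():
        print(name, stats)
```

Note that this only gives totals per node. It does not break the cache down by individual filter.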

  1. No. The bitset filter cache caches bitsets per nested field (only nested fields that have a parent nested field get cached). The stats API only exposes how much memory the entire cache is taking.
  2. If segments are removed by Lucene, any cache entries associated with them are removed too. This also applies to the bitset filter cache. Other than that, there is no mechanism that purges the bitset filter cache.

But other than that there is no other mechanism that purges the bitset filter cache.

@mvg, thanks for your reply. However, this doc says the (node) filter cache employs an LRU eviction policy, so it does get purged, no? Or am I misunderstanding something?

just to close this out:

  • I came to the conclusion that ES_HEAP_SIZE was just too small given our index size, so we increased it (1.2G -> 3G) and are no longer seeing OOMs.
  • index doc count: 13M
  • index store size at rest: 1.5GB; periodically goes up to 2.9G, presumably due to retained deleted docs
  • filter cache size: ~ 26M
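
For a rough sanity check of these numbers, the filter cache cap implied by each heap size can be worked out and compared with the observed cache size. This is only back-of-the-envelope arithmetic: it assumes the 10% default mentioned earlier in the thread applies to the full heap and that sizes use binary units (1G = 1024^3 bytes). The helper names are made up for illustration.

```python
def parse_size(text):
    """Convert a size like '1.2G', '26M' or '512' to bytes (binary units assumed)."""
    units = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    text = text.strip().lower().rstrip("b")
    if text[-1] in units:
        return int(float(text[:-1]) * units[text[-1]])
    return int(text)


def filter_cache_budget(heap, fraction=0.10):
    """Bytes the filter cache may use, assuming it is capped at `fraction` of the heap."""
    return int(parse_size(heap) * fraction)


if __name__ == "__main__":
    observed = parse_size("26M")
    for heap in ("1.2G", "3G"):
        budget = filter_cache_budget(heap)
        print(f"heap {heap}: cap ~{budget / 1024 ** 2:.0f} MB, "
              f"observed cache uses {observed / budget:.0%} of it")
```

Under these assumptions the observed ~26M sits below the cap for both heap sizes, which would suggest the filter cache on its own was not close to its limit.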

Thanks to all for the help. Nice work on ES, and kudos for the UX of Kibana/Marvel.

-nikita

Hi Martin, I'm interested in what you wrote: that only nested fields that have a parent nested field get cached. Is there another source where I can find this information? I'm studying Elasticsearch for my master's degree, and I think I will need a source for this other than this forum. Can you help me? I have looked at the ES definitive guide and didn't find it there.