Open index management and segment memory exhaustion

I have a single-node cluster on a beefy machine with a lot of memory, and I give ES a 30GB heap. I'm dealing with a logging application and want to target around 100,000 EPS. My indices are time-based, named logfile-YYYY-MM-DD, and each log entry has a timestamp. Indices roll over daily regardless of size; each index ends up around 800GB, although I expect to need about twice that in the future. I'm using the default of 5 shards per index. I'm on ES 1.7.3, and my mappings use doc_values wherever possible.
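
For reference, my index template looks roughly like the sketch below (field names are placeholders, the real mapping is much larger). Analyzed string fields can't use doc_values in 1.x, which is why the message field is the exception:

```python
# Rough shape of my index template, applied with the requests library.
# Field names below are placeholders, not my real mapping.
import json
import requests

template = {
    "template": "logfile-*",
    "settings": {
        "number_of_shards": 5,       # the default I'm currently using
        "number_of_replicas": 0      # single node, so no replicas
    },
    "mappings": {
        "log": {
            "properties": {
                "@timestamp": {"type": "date", "doc_values": True},
                "source":     {"type": "string", "index": "not_analyzed", "doc_values": True},
                "message":    {"type": "string"}  # analyzed string, so no doc_values in 1.x
            }
        }
    }
}

resp = requests.put("http://localhost:9200/_template/logfile", data=json.dumps(template))
resp.raise_for_status()
```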

In my use case, scaling out is not an option. I am limited to this machine for this application.

I encountered a problem after indexing data for over two weeks: once that much data is in the cluster, larger aggregations and queries hit unacceptably long garbage collection pauses. I presume this memory pressure means I've reached the node's capacity, and in fact in most cases the segment memory size is around 20GB at that point. (Does segment memory count against the heap, or is it non-heap memory?)
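
For what it's worth, here is how I'm reading that 20GB figure: summing segments.memory_in_bytes from the indices stats API (endpoint and JSON paths as I understand them for 1.7):

```python
# Sum segment memory (segments.memory_in_bytes) across all indices,
# pulled from the indices stats API on 1.7.
import requests

resp = requests.get("http://localhost:9200/_stats/segments")
resp.raise_for_status()
stats = resp.json()

total = 0
for name, index_stats in sorted(stats["indices"].items()):
    mem = index_stats["primaries"]["segments"]["memory_in_bytes"]
    total += mem
    print("%-30s %7.2f GB" % (name, mem / 1024.0 ** 3))

print("total segment memory: %.2f GB" % (total / 1024.0 ** 3))
```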

If my diagnosis is correct, I'm faced with a new problem: how do I prevent this situation while keeping as much recent data as possible available for search, without prematurely closing old indices? I think the real question is: how many indices can I keep open?

To recap my questions:

  • How does segment memory size relate to cluster memory pressure? Is this a good indicator for node capacity?
  • How can I best estimate how many indices I can keep open, given known index sizes? (A rough estimation sketch follows this list.)
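
To make that second question concrete, here is the rough estimation I have in mind: measure segment memory per index and assume indices can stay open until the running total approaches some heap budget. The 50% budget below is my own guess, not an official figure:

```python
# Rough capacity estimate: how many daily indices can stay open before
# their combined segment memory reaches a chosen heap budget?
# The 50% budget is my own assumption, not an official recommendation.
import requests

HEAP_BYTES = 30 * 1024 ** 3
BUDGET = 0.5 * HEAP_BYTES        # assumed ceiling for total segment memory

resp = requests.get("http://localhost:9200/_stats/segments")
resp.raise_for_status()
indices = resp.json()["indices"]

# Newest index first, relying on the logfile-YYYY-MM-DD naming convention.
per_index = sorted(
    ((name, s["primaries"]["segments"]["memory_in_bytes"])
     for name, s in indices.items()),
    key=lambda item: item[0],
    reverse=True,
)

running = 0
keep_open = []
for name, mem in per_index:
    if running + mem > BUDGET:
        break
    running += mem
    keep_open.append(name)

print("could keep %d indices open (%.1f GB segment memory)"
      % (len(keep_open), running / 1024.0 ** 3))
```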

Thanks for any help you can offer.

Given that you have a single node, you are heavily limited.
The biggest problem you are going to face is the per-shard Lucene document limit of 2^31-1 (each shard is a single Lucene index, so the limit applies to every shard individually).
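
To put that limit in perspective with the numbers you posted (100k EPS, daily rollover, 5 primary shards), here's a quick back-of-the-envelope calculation, assuming events are spread evenly across the shards:

```python
# Back-of-the-envelope: docs per shard per day at the stated ingest rate,
# assuming events are routed evenly across the 5 primary shards.
EPS = 100000
SHARDS = 5
SECONDS_PER_DAY = 86400
LUCENE_LIMIT = 2 ** 31 - 1       # per-shard document limit

docs_per_day = EPS * SECONDS_PER_DAY       # 8,640,000,000 per daily index
docs_per_shard = docs_per_day // SHARDS    # 1,728,000,000 per shard

print("docs per shard per day: %d" % docs_per_shard)
print("fraction of limit: %.0f%%" % (100.0 * docs_per_shard / LUCENE_LIMIT))
```

At a sustained 100k EPS each daily shard is already at roughly 80% of that limit, and if the 2x growth you mentioned means twice the event rate, a daily index would go over it.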

I'd suggest you upgrade to 2.1 as well; there are a bunch of improvements there, particularly around heap usage.