Memory usage constantly increasing?

Hi,

We use Elasticsearch for the classic ELK stack. We're indexing on average about 18 million docs (log file entries) per day. ES is doing a fantastic job, but I was wondering about memory usage. It seems to be constantly increasing. We've only run this stack for about a month, so we haven't expired (removed) any old data yet.

We're at about 600M docs now, with an index size of about 375 GB. Query speed is not important for us. ES is running with a 5 GB heap (upped from the initial 3 GB), and usage is constantly increasing. Here are some graphs. It seems quite clear that memory usage is proportional to document count. I have read a lot about memory usage, and much of the information out there is for old versions of ES, but I've read about the "circuit breakers" and had the impression that it is somehow possible to limit ES memory usage and force it to read data from disk when RAM runs low.
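
For anyone wanting to reproduce the numbers, something like this (a rough sketch, assuming ES listens on the default localhost:9200) pulls the heap usage and circuit-breaker figures from the node stats API:

```python
# Rough sketch (assuming ES on localhost:9200) for checking heap usage
# and circuit-breaker stats on a single node.
import requests

BASE = "http://localhost:9200"  # assumption: default host/port

# JVM heap usage per node
jvm = requests.get(BASE + "/_nodes/stats/jvm").json()
for node_id, node in jvm["nodes"].items():
    mem = node["jvm"]["mem"]
    print(node["name"],
          "heap used:", mem["heap_used_in_bytes"],
          "of", mem["heap_max_in_bytes"])

# Circuit-breaker state: configured limits vs. current estimates
breakers = requests.get(BASE + "/_nodes/stats/breaker").json()
for node_id, node in breakers["nodes"].items():
    for name, b in node["breakers"].items():
        print(node["name"], name,
              "limit:", b["limit_size"],
              "estimated:", b["estimated_size"],
              "tripped:", b["tripped"])
```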

I don't know if I need to change any indexing params. We have unique strings for session IDs, so those will definitely keep piling up. Are there any definitive guides on limiting ES memory usage if I don't care about query speed? Or is it unavoidable?
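
To illustrate what I mean by indexing params, this is roughly the kind of mapping tweak I'm wondering about (session_id and the logstash-* pattern are just placeholder names): keeping the field not_analyzed with doc_values, so the per-document values sit on disk instead of in fielddata on the heap.

```python
# Illustrative ES 2.x index template for a high-cardinality string field.
# "session_id" and "logstash-*" are placeholder names, not our real ones.
import json
import requests

BASE = "http://localhost:9200"  # assumption: default host/port

template = {
    "template": "logstash-*",            # assumed index pattern
    "mappings": {
        "_default_": {
            "properties": {
                "session_id": {
                    "type": "string",
                    "index": "not_analyzed",  # keep the raw value, no analysis
                    "doc_values": True        # column store on disk, not heap fielddata
                }
            }
        }
    }
}

resp = requests.put(BASE + "/_template/session_id_example",
                    data=json.dumps(template))
print(resp.json())
```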

Here are the graphs (they start about halfway into our use of ELK): http://imgur.com/a/6HaRZ

Thanks for any hints,

Stefan

Which version of Elasticsearch are you using? How many indices and shards do you have in the cluster?
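
If you're not sure, something along these lines (assuming ES is reachable on the default localhost:9200) will show the per-index and total shard counts:

```python
# Quick check of how many shards each index actually has
# (assuming ES on localhost:9200).
import requests

BASE = "http://localhost:9200"  # assumption: default host/port

# Primary shard count per index, straight from the index settings
settings = requests.get(BASE + "/_settings").json()
for index in sorted(settings):
    n = settings[index]["settings"]["index"]["number_of_shards"]
    print(index, "primary shards:", n)

# Cluster-wide totals
health = requests.get(BASE + "/_cluster/health").json()
print("active primary shards:", health["active_primary_shards"])
print("active shards total:  ", health["active_shards"])
```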

ES 2.2.1 from the APT repository. We have just one VM running ES and one index per day, so that makes about 30 indices holding about 99% of the data, plus some others (Windows and Linux log data), totalling about 290 indices. Shards: I think it's using the default setting of 5 shards per index?

Each shard has some overhead in terms of memory usage and file handles associated with it. Close to 1500 shards is a lot on a node with that amount of heap, so I would recommend switching to a single shard per index and/or consolidating into weekly or monthly indices, at least for the types of data with very small volumes. Aiming for an average shard size of between a few GB and a few tens of GB is reasonable for a lot of use cases involving time-based indices.
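
As a sketch (the template name and index pattern are examples; adjust them to your naming scheme), an index template along these lines would give new daily indices a single primary shard:

```python
# Sketch of an index template that gives new logstash-* indices a single
# primary shard. Template name and pattern are examples.
import json
import requests

BASE = "http://localhost:9200"  # assumption: default host/port

template = {
    "template": "logstash-*",   # matches the daily log indices
    "order": 1,                 # take precedence over lower-order templates
    "settings": {
        "index": {
            "number_of_shards": 1
        }
    }
}

resp = requests.put(BASE + "/_template/single_shard_logs",
                    data=json.dumps(template))
print(resp.json())
```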

Thanks for the tips. I have changed the default number of shards to 1 and consolidated the older, smaller indices into monthly indices, bringing the total number of shards down to about 500. It has made a small dent in memory usage, maybe about 500 MB (after a manual full GC). The biggest item that ES itself reports (terms_memory_in_bytes) is unchanged.
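
For anyone curious, a per-index breakdown of that number can be pulled from the segments stats with something like this (again assuming the default localhost:9200):

```python
# Per-index breakdown of segment memory, including terms_memory_in_bytes,
# to see which indices hold the bulk of it (assuming ES on localhost:9200).
import requests

BASE = "http://localhost:9200"  # assumption: default host/port

stats = requests.get(BASE + "/_stats/segments").json()
rows = []
for index, data in stats["indices"].items():
    seg = data["total"]["segments"]
    rows.append((seg["terms_memory_in_bytes"], seg["memory_in_bytes"], index))

# Top 20 indices by terms memory
for terms_mem, total_mem, index in sorted(rows, reverse=True)[:20]:
    print("%12d  %12d  %s" % (terms_mem, total_mem, index))
```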

If you really want to be sure, you can use concurrent load testing to determine whether a memory leak is present. I have done so with JMeter. Contact me if you would like to have the script.
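
The general shape of such a test, independent of JMeter, is simple: fire concurrent searches for a while and watch whether heap usage keeps climbing across GC cycles. A very rough illustration in Python (not the script mentioned above; it assumes ES on localhost:9200 and logstash-* indices):

```python
# Rough illustration of a concurrent load test: query the node from several
# threads while periodically sampling heap usage. Heap that keeps climbing
# and never comes back down after load hints at a leak.
import threading
import time
import requests

BASE = "http://localhost:9200"   # assumption: default host/port
STOP = time.time() + 300         # run for five minutes

def query_loop():
    while time.time() < STOP:
        requests.get(BASE + "/logstash-*/_search",
                     params={"q": "*", "size": 10})

threads = [threading.Thread(target=query_loop) for _ in range(8)]
for t in threads:
    t.start()

while time.time() < STOP:
    jvm = requests.get(BASE + "/_nodes/stats/jvm").json()
    for node in jvm["nodes"].values():
        print(int(time.time()), node["name"],
              "heap used:", node["jvm"]["mem"]["heap_used_in_bytes"])
    time.sleep(10)

for t in threads:
    t.join()
```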