I'm load testing a reporting service that will let users run arbitrarily
generated facets against a large data set (~100M new records per day, with
daily index rotation). We have a 3-node ES cluster with
72GB ram each (half allocated to ES, half reserved for the OS and disk
cache). We're running ES 0.18.7.
I'm seeing regular out of memory errors that result in a particular node
locking up and dropping out of the cluster. Flushing the field cache or
fully restarting the node brings everything back into a good state, but I'd
rather avoid getting into this state in the first place. I tried setting
index.cache.field.type: soft in /etc/elasticsearch.yml, but it isn't
having the desired effect. Is this not where the setting belongs? Do I
need to set it per-index via the index settings API?
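For reference, here's roughly what I have in the node config. The file
location and the commented-out alternatives are my reading of the docs, so
treat them as assumptions rather than known-good settings:

```yaml
# Node-level config. Depending on how ES was installed, it may read
# $ES_HOME/config/elasticsearch.yml rather than /etc/elasticsearch.yml,
# which could explain why the setting appears to be ignored.
index:
  cache:
    field:
      type: soft        # the setting I'm trying to enable
      # Alternatives I've seen mentioned for bounding the cache instead
      # of relying on soft references (unverified on 0.18.7):
      # max_size: 50000 # cap on cache entries
      # expire: 10m     # evict entries idle longer than this
```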
I've read elsewhere that regularly evicting cache entries is undesirable,
but I don't see an alternative: our data set will always be too large to
fit in memory. I'm willing to accept the performance trade-off, since most
of our faceted queries are relatively unique, and I'd rather query slowly
than crash ES nodes by running out of memory.
I'd rather not write a cron job to flush the caches when memory usage
creeps up. It definitely sounds like there's a better way, though, and I'm
just missing something semi-obvious.
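If it does come down to automating the flush, the decision logic itself is
trivial. Here's a minimal sketch of what I'd rather avoid writing; the 75%
threshold is an arbitrary assumption, and the actual stats polling and
cache-clear calls against the cluster are left as comments:

```python
def should_flush(heap_used_bytes: int, heap_max_bytes: int,
                 threshold: float = 0.75) -> bool:
    """Return True when heap usage crosses the (assumed) threshold.

    heap_used_bytes / heap_max_bytes would come from polling each
    node's stats; the flush itself would then hit the cache-clear
    endpoint. Both HTTP calls are omitted here on purpose.
    """
    return heap_used_bytes >= threshold * heap_max_bytes


# Example: 36 GB heap with 30 GB used crosses a 75% (27 GB) threshold.
print(should_flush(30 * 2**30, 36 * 2**30))  # True
```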