I am trying to roll out a system for storing logs (similar to Logstash and
Graylog), where the amount of indexed log data has no upper bound. In my
experiments so far, I have observed that as I index more and more log data
(no searching yet), the number of open file descriptors and the memory
usage gradually grow, until they hit their limits and errors occur.
I have been experimenting on a fixed number of nodes (3 nodes, 8GB each for
Elasticsearch) with index rotation, different shard counts, segment sizes
and merging schemes. I saw an impact on resource usage, but the general
tendency of constant growth in the number of FDs and memory usage stayed
the same.
Is there any way to make Elasticsearch release excess FDs and memory, in a
similar fashion to an LRU cache, even if it comes at the expense of poorer
performance?
Both Logstash and Graylog simply suggest that you estimate the required
resources for the given amount of data and delete any excess (by deleting
old date-rotated indices). I would like to avoid removing this old data,
but don't mind if the data is not cached and always loaded from disk
on-demand so that it does not hold on to any resources. I don't even mind
if the whole system becomes 10 times slower, as long as it doesn't throw an
OOM or "Too many open files".
Fitting a quart in a pint pot, eh?
I suspect you should probably look into closing old (rarely-used) indices. This means that they'll continue to take up disk space, but won't consume file handles etc.
Note that you can't read or write to a closed index, you have to open it again - your application will have to manage that side of things, opening an index before querying it. Clint warns that this process can take a few seconds to a couple of minutes, so you'll need to manage users' expectations. But - it should probably help you avoid OOMs or other badness.
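For illustration, a minimal sketch of that close-when-idle / open-before-query flow (assumptions: a node on localhost:9200 and a made-up date-rotated index name; only the _close, _open and _search endpoints are the standard index APIs):

    import requests

    ES = "http://localhost:9200"
    index = "logs-2014.01.01"   # hypothetical date-rotated index name

    # Close an old index so it stops consuming file handles and heap.
    requests.post(ES + "/" + index + "/_close")

    # Before querying it again, the application has to reopen it first.
    requests.post(ES + "/" + index + "/_open")
    response = requests.post(ES + "/" + index + "/_search",
                             json={"query": {"match_all": {}}})
    print(response.json())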
Note that you can't read or write to a closed index, you have to open it
again ...
I have considered it, and it will definitely work for the indexing.
As for searching, it sounds a bit more tricky, especially in a multi-user
environment ... but probably doable. Opening an old index for searching may
load most of its data into memory, depending on the search query, so I'd
only be able to have one (or a limited number) of old indices open at a
time, making other search clients wait. Well, I guess that is the price I
have to pay, sigh.
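Something like this is roughly what I have in mind, sketch only (the limit of 2, the helper name and localhost:9200 are made up for illustration; a semaphore makes other search clients wait for a slot):

    import threading
    import requests

    ES = "http://localhost:9200"
    open_slots = threading.Semaphore(2)   # at most 2 old indices open at once

    def search_old_index(index, query):
        open_slots.acquire()               # other clients block here until a slot frees
        try:
            requests.post(ES + "/" + index + "/_open")
            r = requests.post(ES + "/" + index + "/_search", json={"query": query})
            return r.json()
        finally:
            requests.post(ES + "/" + index + "/_close")
            open_slots.release()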
Do your old-data queries need to be realtime?
You can always redirect those old indices to separate servers, which take
their time loading the needed data, and then lazy-load it / push it to your
frontend (and then close these indices again if they're not used within a
TTL, or if other indices need loading)... i.e., separate your unbounded
long tail from your current-data search.
Yes, you will (maybe) need more resources than your 3 machines, but you
shouldn't need as much as keeping everything in memory.
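Rough sketch of the TTL part (the one-hour TTL, the index names and the idea of tracking last access in the application are just assumptions for illustration; only the _close call is the actual Elasticsearch API):

    import time
    import requests

    ES = "http://localhost:9200"
    TTL = 3600                         # close indices idle for more than an hour
    last_used = {}                     # index name -> last access timestamp

    def touch(index):
        # call this whenever one of the long-tail indices is queried
        last_used[index] = time.time()

    def close_idle_indices():
        now = time.time()
        for index, ts in list(last_used.items()):
            if now - ts > TTL:
                requests.post(ES + "/" + index + "/_close")
                del last_used[index]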
No, realtimeness is not a requirement. I am not sure I understand how I
would set up those separate servers, though. What do you mean by
"redirect"? Do you mean moving the data for closed indices to a backup
storage of some sort?
Thanks.