Elasticsearch - Disk writes increasing over time

Hi all!

First-time poster here. I'm fairly new to ES, so please be gentle with me :smile: I am seeing some strange behaviour in the amount ES writes to disk, and would appreciate any suggestions as to what might be happening.

Here are the facts:

  • I am running ES 1.7.3 on a CentOS 7 VM
  • The VM has 2 cores, and 8GB memory
  • ES has a 3GB heapsize
  • I am monitoring 7 servers with the ELK stack (mostly collectd metrics)
  • The rate of documents being stored stays constant over time (yesterday, ~700k documents were written, totalling 90MB).
  • Looking at the collectd metrics for the ES server itself, the amount of data being written to disk is increasing over time, before dropping to nearly zero again (this seems to have a 12 hour cycle: 00:00-12:00 and 12:00-00:00)
  • I am graphing this in Kibana using the collectd "write" value and a derivative (Kibana 4.1.2).
  • I have performed additional verification using a vmstat script, which gives a pure write/sec metric (i.e. the same pattern is observed when graphing the vmstat output without needing a derivative)
  • Using iotop I can see that Elasticsearch is responsible for virtually all writes.
  • To give an idea of the amount of data being written - at 00:00, virtually no bytes per second are being written. This increases gradually until 12:00, when around 2MB/s is being written - given that 90MB of documents were stored in the entire 24 hour period yesterday, this is massive!
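
To put the figures above side by side, here is a rough Python sketch. It shows how a cumulative "bytes written" counter becomes a per-second rate (which is what the derivative graph does), and how the 2MB/s peak compares with the average ingest rate implied by 90MB per day. The function names and the choice to average over a full 24 hours are assumptions made for illustration, not anything taken from collectd or Kibana.

```python
# Figures quoted in the post; averaging ingest over 24 hours is an assumption.
MB = 1024 * 1024
SECONDS_PER_DAY = 24 * 60 * 60


def rates_from_counter(samples):
    """Turn (timestamp_seconds, cumulative_bytes_written) samples into
    per-interval write rates in bytes/sec, as a derivative graph would."""
    rates = []
    for (t0, b0), (t1, b1) in zip(samples, samples[1:]):
        rates.append((b1 - b0) / (t1 - t0))
    return rates


def write_amplification(peak_write_bytes_per_sec, stored_bytes_per_day):
    """Ratio of the observed disk write rate to the average ingest rate."""
    avg_ingest = stored_bytes_per_day / SECONDS_PER_DAY
    return peak_write_bytes_per_sec / avg_ingest


ratio = write_amplification(2 * MB, 90 * MB)
print(f"peak writes are about {ratio:.0f}x the average ingest rate")
```

With the numbers from the post, 90MB per day averages out to roughly 1KB/s of stored data, so a 2MB/s peak is on the order of 1900 times that.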

Hopefully this all makes sense. Please let me know what additional information is needed to take this further. I'm a little confused right now, because an ES installation used by a different team here (on CentOS 6, with a very similar out-of-the-box configuration, and collecting a lot more data at ~1GB per day) is not exhibiting the same behaviour!

Thanks in advance!

If you are updating indices with newly indexed documents, then there are things like merges that happen under the hood.
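
One way to check whether merges account for the writes is to compare two snapshots of the merge counters from the indices stats API. The sketch below is a minimal example under some assumptions: the node is reachable at `http://localhost:9200`, and the response contains `_all.total.merges.total_size_in_bytes`, which is how I understand the stats output to be laid out. Verify the field names against the actual response before relying on them. The helper names are made up for illustration.

```python
import json
import urllib.request


def fetch_merge_stats(base_url="http://localhost:9200"):
    """Fetch index stats restricted to the merge metric.
    The base URL is an assumption; point it at the node being checked."""
    with urllib.request.urlopen(base_url + "/_stats/merge") as resp:
        return json.loads(resp.read().decode("utf-8"))


def merged_bytes(stats):
    """Cumulative size of merged segments across all indices
    (field path assumed; check it against a real response)."""
    return stats["_all"]["total"]["merges"]["total_size_in_bytes"]


def merge_rate(before, after, interval_seconds):
    """Average merge volume in bytes/sec between two stats snapshots."""
    return (merged_bytes(after) - merged_bytes(before)) / interval_seconds
```

Taking one snapshot with `fetch_merge_stats()`, waiting a few minutes, taking another, and passing both to `merge_rate` gives a figure that can be set against the write rate reported by vmstat over the same window.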

Is this causing a problem or are you just interested?

Hi Mark,

Thanks for your reply. It's not causing a problem right now; however, I am trying to understand if it will become a problem for us later when the volume of data increases.

I'll be using ELK to pull in real-time results from performance tests, so a lot of documents (potentially 1m+ transactions per hour) will be stored when tests are running. What I don't want is my real-time monitoring to perform badly under load!

(I'm aware that I may need to scale my ES cluster to support this sort of volume)
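
For a sense of the jump in load, a quick back-of-the-envelope comparison of the current and planned indexing rates, using only the figures quoted in this thread (~700k documents per day now, 1m per hour during tests). The helper name is hypothetical, and the result says nothing about document size or how ES will behave at that rate.

```python
def docs_per_second(doc_count, period_seconds):
    """Average indexing rate for a document count over a period."""
    return doc_count / period_seconds


current = docs_per_second(700_000, 24 * 60 * 60)  # ~700k documents per day
planned = docs_per_second(1_000_000, 60 * 60)     # 1m documents per hour

print(f"current: {current:.1f} docs/s, planned: {planned:.1f} docs/s, "
      f"increase: {planned / current:.1f}x")
```

That works out to roughly 8 documents per second today against about 278 per second during a test run, around a 34-fold increase.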