Flush to disk

(Roi Wexler) #1

Hi,
The ES documentation says: "By default, Elasticsearch uses heuristics in order to automatically trigger flushes as required."

Does anyone know when ES triggers a flush to disk, and whether there is a way to configure it?

We saw that when running a heavy indexing process, once the RAM fills up, a flush to disk happens every few seconds, and that slows our indexing speed.

With Solr you can control when this flush runs (it is called a hard commit).
Any ideas what we can do to control it?

(Bernt Rostad) #2

Elasticsearch allows you to specify the refresh_interval for an index. Setting it to -1 is something I do before bulk indexing or reindexing to speed things up:

PUT /my_index/_settings
{
    "index" : {
        "refresh_interval" : "-1"
    }
}

Then, when I'm done indexing, I set "refresh_interval" : null (the JSON null literal, without quotes), which resets the refresh interval to its default value (1 second).
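
As a minimal sketch, the reset request would look like the following. It reuses the placeholder index name my_index from the request above; note that null is the unquoted JSON literal, not a string.

```
PUT /my_index/_settings
{
    "index" : {
        "refresh_interval" : null
    }
}
```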

If you need to force a flush on an index, to persist it to disk, use the Flush API.
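
A manual flush of a single index can be sketched as below, again assuming the placeholder index name my_index:

```
POST /my_index/_flush
```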

(Roi Wexler) #3

Hi, we know about refresh_interval, and setting it to -1 is our default configuration in tests. The scenario I described above already has this set.
this affect the

But coming back to my original question: do you know when ES triggers a flush to disk, and how we can control it?

(Bernt Rostad) #4

No, I'm sorry. I only know what the documentation says, that ES uses heuristics to trigger flushes and that the user can trigger this manually in order to reduce the recovery time when restarting nodes.

But are you sure flushing is the cause of the slowness you experience every few seconds? If you've turned off refreshing, it could still be the garbage collector kicking in. Perhaps you could check that?
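
One way to check, as a rough sketch: the node stats API reports garbage-collection counters per node. Comparing collection_count and collection_time_in_millis (under jvm.gc.collectors in the response) before and after a slow period would indicate whether GC is involved.

```
GET /_nodes/stats/jvm
```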

I regularly re-index hundreds of millions of documents and terabytes of data, and the Reindex API uses bulk indexing underneath, but I've never experienced the problem you mention. All I ever do is turn off index refreshing and shard replication; after that it's all smooth sailing.
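
As a sketch of that setup, assuming that "turning off shard replication" means setting number_of_replicas to 0 and that the index is the placeholder my_index, both settings can be changed in one request. The replica count would need to be restored to its previous value once indexing is finished.

```
PUT /my_index/_settings
{
    "index" : {
        "refresh_interval" : "-1",
        "number_of_replicas" : 0
    }
}
```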

I hope you can figure it out or that someone else, who knows more about the flush mechanism, can help you.

(Christian Dahlqvist) #5

What type of storage/disk do you have?