Flush to disk

roiwexler · May 16, 2019, 8:08am

Hi,
The ES documentation says: "By default, Elasticsearch uses heuristics in order to automatically trigger flushes as required."

Does anyone know when ES triggers a flush to disk, and whether there is a way to configure it?

We saw that when running a heavy indexing process, as the RAM fills up, a flush to disk happens every few seconds, and that slows our indexing speed.

With Solr you can control when this flush runs (it is called a hard commit).
Any ideas what we can do to control it?

Bernt_Rostad · May 16, 2019, 8:25am

Elasticsearch allows you to specify the refresh_interval for an index. Setting this to -1 is something I do before bulk indexing or reindexing to speed things up:

PUT /my_index/_settings
{
    "index" : {
        "refresh_interval" : "-1"
    }
}

Then when I'm done indexing I set "refresh_interval" : null (the JSON null value, not a quoted string), which resets the refresh interval to its default value (1 second).

If you need to force a flush on an index, to persist it to disk, use the Flush API.
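
For reference, resetting the interval and forcing a flush could look like the sketch below. It assumes the same example index name, my_index, used above.

```
# Reset refresh_interval to its default; null is unquoted here
PUT /my_index/_settings
{
    "index" : {
        "refresh_interval" : null
    }
}

# Force a flush of the example index via the Flush API
POST /my_index/_flush
```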

roiwexler · May 20, 2019, 12:30pm

Hi, we know about refresh_interval, and setting it to -1 is our default configuration in tests. The scenario I described above already has this set.
this affect the

But coming back to my original question: do you know when ES triggers a flush to disk, and how can we control it?

Bernt_Rostad · May 20, 2019, 12:49pm

No, I'm sorry. I only know what the documentation says, that ES uses heuristics to trigger flushes and that the user can trigger this manually in order to reduce the recovery time when restarting nodes.
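
One documented threshold behind those heuristics is the index setting index.translog.flush_threshold_size: a flush is triggered once the transaction log reaches that size (512mb by default). The sketch below raises it and then reads the flush and translog statistics to see how often flushes actually happen. The index name my_index and the value 1gb are illustrative assumptions, and whether this helps depends on translog-triggered flushes really being the bottleneck.

```
# Assumed example value; the default threshold is 512mb
PUT /my_index/_settings
{
    "index" : {
        "translog" : {
            "flush_threshold_size" : "1gb"
        }
    }
}

# Compare flush counts and translog size between two samples taken while indexing
GET /my_index/_stats/flush,translog
```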

But are you sure flushing is the cause of the slowness you experience every few seconds? If you've turned off refreshing, it could still be the garbage collector kicking in. Perhaps you could check that?
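
One way to check this, as a rough sketch: the node stats API reports garbage collection counts and cumulative collection times per collector, so comparing two samples taken during indexing shows whether GC time grows around the pauses.

```
# Look at jvm.gc.collectors.*.collection_count and collection_time_in_millis
GET /_nodes/stats/jvm
```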

I regularly re-index hundreds of millions of documents and terabytes of data, and the Reindex API uses bulk indexing underneath, but I've never experienced the problem you mention. All I ever do is turn off index refreshing and shard replication; after that it's all smooth sailing.
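
Turning off both before a bulk load could look like the sketch below, again assuming the example index my_index. The replica count to restore afterwards depends on the cluster (1 is the default).

```
# Disable refresh and replicas for the duration of the bulk load
PUT /my_index/_settings
{
    "index" : {
        "refresh_interval" : "-1",
        "number_of_replicas" : 0
    }
}
```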

I hope you can figure it out or that someone else, who knows more about the flush mechanism, can help you.

Christian_Dahlqvist · May 20, 2019, 2:07pm

What type of storage/disk do you have?

