Initializing_shards - second db start up takes long time

marcin_es · August 31, 2015, 1:55pm

Hello all,
I am working on a solution that uses an embedded Elasticsearch server on a single local machine. The scenario is:
1) Create a cluster with one node. Import data: 3 million records in ~180 indexes and 911 shards. The data is available, search works and returns the expected data:
{
"cluster_name" : "cn1441023806894",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 911,
"active_shards" : 911,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
}

Now I shut down the server. This is my console output:
sie 31, 2015 2:51:36 PM org.elasticsearch.node.internal.InternalNode stop
INFO: [testbg] stopping ...
sie 31, 2015 2:51:50 PM org.elasticsearch.node.internal.InternalNode stop
INFO: [testbg] stopped
sie 31, 2015 2:51:50 PM org.elasticsearch.node.internal.InternalNode close
INFO: [testbg] closing ...
sie 31, 2015 2:51:50 PM org.elasticsearch.node.internal.InternalNode close
INFO: [testbg] closed

The database folder is around 2.4 GB.

Now I start the server again, and it takes around 10 minutes to reach status green. Example health during that startup:
{
"cluster_name" : "cn1441023806894",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 68,
"active_shards" : 68,
"relocating_shards" : 0,
"initializing_shards" : 25,
"unassigned_shards" : 818
}

After that process, the database folder is ~0.8 GB.

Then I shut down the database and open it again, and now it gets green in 10 seconds.

My configuration:
settings.put(SET_NODE_NAME, projectNameLC);
settings.put(SET_PATH_DATA, projectLocation + "\\" + CommonConstants.ANALYZER_DB_FOLDER);
settings.put(SET_CLUSTER_NAME, clusterName);
settings.put(SET_NODE_DATA, true);
settings.put(SET_NODE_LOCAL, true);
settings.put(SET_INDEX_REFRESH_INTERVAL, "-1");
settings.put(SET_INDEX_MERGE_ASYNC, true);
//the following settings are my attempt to speed up loading on the 2nd startup
settings.put("cluster.routing.allocation.disk.threshold_enabled", false);
settings.put("index.number_of_replicas", 0);
settings.put("cluster.routing.allocation.disk.include_relocations", false);
settings.put("cluster.routing.allocation.node_initial_primaries_recoveries", 25);
settings.put("cluster.routing.allocation.node_concurrent_recoveries", 8);
settings.put("indices.recovery.concurrent_streams", 6);
settings.put("indices.recovery.concurrent_small_file_streams", 4);

The questions:

1) What happens during the second startup? The db folder size shrinks from 2.4 GB to 800 MB.
2) If this process is necessary, can it be triggered manually, so I can show a nice "please wait" dialog?

The user experience on the second database opening is very bad and I need to change it.
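
For the "please wait" dialog, the shard counters in the health output above may be enough to estimate progress while the node recovers. The following is only a sketch: the class and method names are made up for illustration, and it assumes that active, initializing and unassigned shards together make up the total (with 0 replicas, as in this setup).

```java
// Hypothetical helper, not part of Elasticsearch: turns the three shard
// counters from the cluster health response into a rough progress value.
final class ShardRecoveryProgress {

    // Returns the percentage of shards that are already active.
    // Assumes total = active + initializing + unassigned.
    static double percentActive(int active, int initializing, int unassigned) {
        int total = active + initializing + unassigned;
        if (total == 0) {
            return 100.0; // nothing to recover
        }
        return 100.0 * active / total;
    }
}
```

With the numbers from the red health response above (68 active, 25 initializing, 818 unassigned, 911 in total), this gives roughly 7.5 percent.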

Cheers
Marcin

msimos · August 31, 2015, 6:25pm

What version of Elasticsearch are you using? Before shutting down you may want to try issuing a synced flush:

https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-synced-flush.html

It's only available in Elasticsearch 1.6.0 or later. This may speed up the startup after a shutdown.
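
As a rough sketch, a synced flush can be requested over the REST endpoint described on that page (POST /{index}/_flush/synced) using only the JDK. This assumes the embedded node has HTTP enabled and reachable; the host, port and index name are placeholders, and the class and method names are invented for this example.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper, not an Elasticsearch class.
final class SyncedFlush {

    // Builds the synced flush URL for one index.
    // Assumes the index name needs no URL escaping.
    static String syncedFlushUrl(String host, int port, String index) {
        return "http://" + host + ":" + port + "/" + index + "/_flush/synced";
    }

    // Sends an empty POST to the given URL and returns the HTTP status code,
    // so the caller can decide how to react to a non-success response.
    static int post(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setFixedLengthStreamingMode(0); // no request body
            conn.getOutputStream().close();
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

A call such as SyncedFlush.post(SyncedFlush.syncedFlushUrl("localhost", 9200, "myindex")) would be issued once indexing into that index has finished and before the node is stopped.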

marcin_es · September 1, 2015, 7:58am

Hi Mike,
I was on 1.4 and upgraded to 1.7. Now, after I finish the import into a particular index, I call the synced flush... and it did the trick!
I call:

client.admin().indices().flush(new FlushRequest(idxName));

Thanks for your help!
