Hello all,
I am working on a solution that uses embedded elasticsearch server - on one local machine. The scenario is:
1)create cluster with one node. Import data - 3 million records in ~180 indexes and 911 shards. Data is available, search works and returns expected data:
{
"cluster_name" : "cn1441023806894",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 911,
"active_shards" : 911,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
}
Now, I shutdown the server - this is my console output:
sie 31, 2015 2:51:36 PM org.elasticsearch.node.internal.InternalNode stop
INFO: [testbg] stopping ...
sie 31, 2015 2:51:50 PM org.elasticsearch.node.internal.InternalNode stop
INFO: [testbg] stopped
sie 31, 2015 2:51:50 PM org.elasticsearch.node.internal.InternalNode close
INFO: [testbg] closing ...
sie 31, 2015 2:51:50 PM org.elasticsearch.node.internal.InternalNode close
INFO: [testbg] closed
The database folder is around 2.4 GB.
Now i start the server again.... and it takes around 10 minutes to reach status green, example health:
{
"cluster_name" : "cn1441023806894",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 68,
"active_shards" : 68,
"relocating_shards" : 0,
"initializing_shards" : 25,
"unassigned_shards" : 818
}
After that process, the database folder is ~0.8 GB.
Then I shutdown the database, and open it again, and now it gets green in 10 seconds.
My configuration:
settings.put(SET_NODE_NAME, projectNameLC);
settings.put(SET_PATH_DATA, projectLocation + "\" + CommonConstants.ANALYZER_DB_FOLDER);
settings.put(SET_CLUSTER_NAME, clusterName);
settings.put(SET_NODE_DATA, true);
settings.put(SET_NODE_LOCAL, true);
settings.put(SET_INDEX_REFRESH_INTERVAL, "-1");
settings.put(SET_INDEX_MERGE_ASYNC, true);
//the following settings are my attempt to speed up loading on the 2nd startup
settings.put("cluster.routing.allocation.disk.threshold_enabled", false);
settings.put("index.number_of_replicas", 0);
settings.put("cluster.routing.allocation.disk.include_relocations", false);
settings.put("cluster.routing.allocation.node_initial_primaries_recoveries", 25);
settings.put("cluster.routing.allocation.node_concurrent_recoveries", 8);
settings.put("indices.recovery.concurrent_streams", 6);
settings.put("indices.recovery.concurrent_streams", 6);
settings.put("indices.recovery.concurrent_small_file_streams", 4);
The questions:
- What happens during the second start up? The db folder size reduces from 2.4gb into 800 megabytes.
2)If this process is necessary, can it be trigerred manually, so I can show nice "please wait" dialog?
The user experience on teh second database opening is very bad and I need to change it.
Cheers
Marcin