Hi
I'm setting up a 6.5.x cluster and need to index a few million docs into it.
I have disabled index refresh during initial indexing. All indices have 5 shards.
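For reference, a minimal sketch of how refresh can be switched off and back on per index through the update-settings API. The helper names `refresh_body` and `set_refresh` are made up for illustration, and the host and port are assumed to match the curl call further down.

```shell
# Build the settings body for a given refresh interval ("-1" disables refresh).
refresh_body() {
  printf '{"index":{"refresh_interval":"%s"}}' "$1"
}

# Apply a refresh interval to one index (hypothetical helper).
# Usage: set_refresh <index> <interval>
set_refresh() {
  curl -s -XPUT "$(hostname):9200/$1/_settings" \
    -H 'Content-Type: application/json' \
    -d "$(refresh_body "$2")"
}

# Typical sequence around a bulk load (not executed here):
#   set_refresh index7 -1    # before the initial indexing
#   set_refresh index7 1s    # afterwards, back to the 1s default
```

Only function definitions are shown, so nothing touches the cluster until `set_refresh` is called explicitly.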
Question: during initial indexing with refresh disabled, are these segments sized reasonably?
Or are there rules of thumb, for example:
- "single segment should not be greater than X"
- "or segment count per shard should not be greater than Y"
- "or are there some ratios.."
At the moment it looks like this. Bear in mind that I am currently doing the initial indexing into the index named index7. The other indices will stay pretty much the same size as they are now.
curl -sXGET `hostname`':9200/_cat/indices?v&s=health,status,index:desc'
health status index  pri rep docs.count docs.deleted store.size pri.store.size
green  open   index1   5   1      23907            0      2.9gb          1.4gb
green  open   index2   5   1        769            0    548.2kb          274kb
green  open   index3   5   1          1            0    394.3kb        197.1kb
green  open   index4   5   1       1259            0      1.4mb        760.3kb
green  open   index5   5   1      13372            0     13.5mb          6.7mb
green  open   index6   5   1     533808            0      105gb         52.5gb
green  open   index7   5   0    1953725         2374      3.3gb          3.3gb
(index7 is the one currently being indexed; the total would be 5gb per node, so with the replica it sums up to 10gb; refresh_interval = -1 at the moment)
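To put numbers on the segment question, the per-shard segment count and sizes can be pulled from the `_cat/segments` API. Below is a rough sketch, assuming the same host and port as above; the function names `summarize_segments` and `segments_for_index` are made up for illustration.

```shell
# Summarize `_cat/segments` output per shard copy.
# Expects header-less lines on stdin: index shard prirep size_in_bytes
summarize_segments() {
  awk '
    {
      key = $1 " " $2 " " $3          # one group per index/shard/copy
      count[key]++
      total[key] += $4
      if ($4 > max[key]) max[key] = $4
    }
    END {
      for (k in count)
        printf "%s segments=%d largest_mb=%.1f total_mb=%.1f\n",
               k, count[k], max[k] / 1048576, total[k] / 1048576
    }
  ' | sort
}

# Hypothetical wrapper: fetch the four columns for one index and summarize them.
# Usage: segments_for_index index7
segments_for_index() {
  curl -s "$(hostname):9200/_cat/segments/$1?h=index,shard,prirep,size&bytes=b" \
    | summarize_segments
}
```

Running `segments_for_index index7` would print one line per shard copy with its segment count, largest segment and total size, which can then be compared against `max_merged_segment` and `segments_per_tier`.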
The default merge policy is as follows for all the indices:
"merge": {
  "scheduler": {
    "max_thread_count": "1",
    "auto_throttle": "true",
    "max_merge_count": "6"
  },
  "policy": {
    "reclaim_deletes_weight": "2.0",
    "floor_segment": "2mb",
    "max_merge_at_once_explicit": "30",
    "max_merge_at_once": "10",
    "max_merged_segment": "5gb",
    "expunge_deletes_allowed": "10.0",
    "segments_per_tier": "10.0",
    "deletes_pct_allowed": "33.0"
  }
}
Should I change the merge policy for any of these indices?
What I mean is that some indices are going to be quite small compared to others, so the default policy probably won't suit all of them.
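If a per-index override turns out to be needed, my understanding is that the merge policy settings can be changed per index through the same `_settings` endpoint. Here is a sketch of what that would look like; the helper names `merge_body` and `set_merge_policy` are made up, and the values in the usage comment are just the current defaults from the listing above, not a recommendation.

```shell
# Build a settings body overriding two merge policy values for one index.
# Args: <segments_per_tier> <max_merged_segment>
merge_body() {
  printf '{"index":{"merge":{"policy":{"segments_per_tier":%s,"max_merged_segment":"%s"}}}}' "$1" "$2"
}

# Apply the override to a single index (hypothetical helper).
# Usage: set_merge_policy <index> <segments_per_tier> <max_merged_segment>
set_merge_policy() {
  curl -s -XPUT "$(hostname):9200/$1/_settings" \
    -H 'Content-Type: application/json' \
    -d "$(merge_body "$2" "$3")"
}

# Example call (not executed here), re-applying the current defaults to index2:
#   set_merge_policy index2 10 5gb
```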
Regards
Raul