I have a cluster with 3 master nodes and 2 data nodes, 1 TB each.
I built a production index and it is approaching the limit of 2B documents. How can I increase the number of shards to 10 on a running index?
ip         heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.21.17.8           81          99  52    1.92    1.52     1.42 dil       -      elk-data-vm3
10.21.17.5           26          66   0    0.00    0.01     0.00 ilm       -      elk-master-vm2
10.21.17.4           69          99  18    1.26    1.00     0.98 dil       -      elk-data-vm4
10.21.17.6           25          48   0    0.00    0.00     0.00 ilm       -      elk-master-vm0
10.21.17.7           35          64   1    0.00    0.02     0.00 ilm       *      elk-master-vm1
epoch      timestamp cluster         status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1593650988 00:49:48  aiopselkcluster red             5         2     71  36    0    0       19             0                  -                 78.9%
yellow open insight_tapm_prod_v1 dEEH71UwQe6enExvoKkpCw 1 2 1912941427 0 156gb 78gb
Elasticsearch version: 7.6.2
We insert new records every 5 minutes as we process data with ML models.
I need quick help increasing the number of shards on the existing index. What I tried: make the index read-only, then increase the shards.
I would like to know if I am making a mistake in the steps above.
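To put a number on how close the index is to the limit, here is a rough Python sketch. It assumes the "2B" limit is Lucene's per-shard document cap of 2,147,483,519 and uses the docs.count from the index line above. The daily ingest rate is a made-up placeholder, and deleted or nested documents (which also count against the cap) are ignored.

```python
# Lucene's hard cap on documents per shard (2^31 - 129).
LUCENE_MAX_DOCS_PER_SHARD = 2_147_483_519


def shard_headroom(docs_in_shard: int, docs_per_day: int) -> dict:
    """Estimate how much room a single shard has before hitting the cap."""
    remaining = LUCENE_MAX_DOCS_PER_SHARD - docs_in_shard
    return {
        "remaining_docs": remaining,
        "used_fraction": docs_in_shard / LUCENE_MAX_DOCS_PER_SHARD,
        "days_left": remaining / docs_per_day,
    }


# docs.count comes from the index line above (1 primary shard);
# docs_per_day is a hypothetical ingest rate, replace it with a measured one.
info = shard_headroom(1_912_941_427, docs_per_day=5_000_000)
print(info)
```

With a single primary shard, every document in the index counts against that one shard, which is why more primaries are needed.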
With this changing shard size you can. Just kidding.
You will have to split your index into a new one with more shards. First of all you will have to change the number_of_routing_shards setting to something like 25. After that set the index to read only and perform the split.
POST /index/_split/split-index
{
  "settings": {
    "index.number_of_shards": 20
  }
}
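The block-writes-then-split sequence can be sketched in Python as a list of REST calls. This only builds the requests and does not send them. The source index name is taken from the question, the target name `insight_tapm_prod_v2` is a hypothetical placeholder, and 10 shards is what the question asked for. The `number_of_routing_shards` step is left out because I am not sure that setting can be changed on an existing index.

```python
import json


def split_plan(source: str, target: str, shards: int) -> list:
    """Return the REST calls for a split as (method, path, body) tuples."""
    return [
        # 1. Block writes on the source index; the split API requires this.
        ("PUT", f"/{source}/_settings",
         {"settings": {"index.blocks.write": True}}),
        # 2. Split the source into a new index with more primary shards.
        ("POST", f"/{source}/_split/{target}",
         {"settings": {"index.number_of_shards": shards}}),
    ]


# Target index name is hypothetical; pick whatever fits your naming scheme.
plan = split_plan("insight_tapm_prod_v1", "insight_tapm_prod_v2", shards=10)
for method, path, body in plan:
    print(method, path)
    print(json.dumps(body, indent=2))
```

The printed requests can be pasted into the Kibana console or sent with any HTTP client.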
Yes, looks right to me. But be aware that you can't index anything into the index while it is read-only (as the name suggests). You will have a downtime of a day or more if you haven't set up countermeasures. Splitting the index will take time because your index is huge.
Oops, good point, I can't do this in real time. Thanks for pointing that out.
What is the recommended action?
Option 1: Create a new index with more shards and push the data there.
Option 2: Accept downtime to perform the split.
Create a new index with more shards and the right mapping.
Index new data into this new index.
_reindex the old index into the new one.
I am not 100% sure this works, so I would wait for @Christian_Dahlqvist, @dadoonet or another Elasticsearch Team member to verify these steps. You don't want to risk data loss.
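As a rough, unverified sketch, the three steps could look like this in Python. It only builds the REST calls and does not send them. The new index name, the replica count of 1, and the use of `op_type: create` with `conflicts: proceed` (so the reindex does not overwrite documents already written to the new index) are my assumptions. The mapping is omitted because it is not shown in this thread.

```python
import json


def reindex_plan(old_index: str, new_index: str, shards: int, replicas: int) -> list:
    """Return the REST calls as (method, path, body) tuples."""
    return [
        # Step 1: create the new index with more primary shards.
        # Add your "mappings" section to this body; it is not shown in the thread.
        ("PUT", f"/{new_index}",
         {"settings": {"index": {"number_of_shards": shards,
                                 "number_of_replicas": replicas}}}),
        # Step 2 happens in the ingest job: point new writes at new_index.
        # Step 3: copy the old documents over in the background.
        # op_type=create plus conflicts=proceed skips IDs that already exist
        # in the destination instead of overwriting them (an assumption about
        # what you want; drop both to let old documents overwrite).
        ("POST", "/_reindex?wait_for_completion=false",
         {"conflicts": "proceed",
          "source": {"index": old_index},
          "dest": {"index": new_index, "op_type": "create"}}),
    ]


# "insight_tapm_prod_v2" and replicas=1 are hypothetical choices.
steps = reindex_plan("insight_tapm_prod_v1", "insight_tapm_prod_v2",
                     shards=10, replicas=1)
for method, path, body in steps:
    print(method, path)
    print(json.dumps(body, indent=2))
```

With `wait_for_completion=false` the reindex runs as a background task, which matters for an index of this size.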