Choosing the number of shards for an index is always a problem.
We know how much data we get per month (about 150 GB), and since a lot of people post that a shard shouldn't be larger than 50 GB, a month of data would fit into 3 shards.
If we use 5 shards per index and 1 replica (= 10 shards in total), it would fit perfectly with our 10 data nodes.
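To make the arithmetic explicit, here is a small sketch. It assumes one index per month and that data spreads evenly over the primary shards; both are assumptions for illustration, not something measured.

```python
import math

# Figures from the post
monthly_data_gb = 150
max_shard_size_gb = 50
data_nodes = 10

# Minimum number of primary shards so that no shard exceeds the size limit
min_primaries = math.ceil(monthly_data_gb / max_shard_size_gb)


def shard_copies(primaries, replicas):
    """Total shard copies: every primary plus its replicas."""
    return primaries * (1 + replicas)


# The layout we are considering: 5 primaries, 1 replica
total_copies = shard_copies(primaries=5, replicas=1)
copies_per_node = total_copies / data_nodes

# Average primary size for a monthly index, assuming an even spread
avg_primary_size_gb = monthly_data_gb / 5

print(min_primaries, total_copies, copies_per_node, avg_primary_size_gb)
```

So with 5 primaries each shard would hold roughly 30 GB per month, which stays below the 50 GB limit.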
Now the idea:
We regularly check whether the current index has reached a criterion (e.g. 50 GB per shard in the index) and, if so, create a new index.
We would set a filtered alias for each user, and with this alias we can separate that user's searches across all the indices.
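A rough sketch of that check, kept free of any client library. The function names and the numbered index naming scheme (e.g. "posts-000001") are hypothetical. It assumes the primary store size in bytes has already been read from somewhere, for example the index stats API, and it uses the average shard size, which is only an approximation if shards are unevenly filled.

```python
GB = 1024 ** 3


def should_create_new_index(primary_store_bytes, primary_shards,
                            max_shard_bytes=50 * GB):
    """Return True once the average primary shard size reaches the limit."""
    avg_shard_bytes = primary_store_bytes / primary_shards
    return avg_shard_bytes >= max_shard_bytes


def next_index_name(current):
    """Increment the zero-padded counter at the end of a name like 'posts-000007'."""
    prefix, _, counter = current.rpartition("-")
    return f"{prefix}-{int(counter) + 1:0{len(counter)}d}"


# Example: a 5-shard index whose primaries hold 250 GB in total
if should_create_new_index(250 * GB, primary_shards=5):
    print("create", next_index_name("posts-000007"))
```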
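As an illustration, this is roughly the request body we would send to the _aliases endpoint to point one filtered alias at several indices. The helper name, the alias and index names, and the field "user_id" are made up for the example; the real field depends on our mapping.

```python
import json


def filtered_alias_actions(alias, indices, user_field, user_value):
    """Build an _aliases body that adds the same filtered alias to every index."""
    filter_clause = {"term": {user_field: user_value}}
    return {
        "actions": [
            {"add": {"index": index, "alias": alias, "filter": filter_clause}}
            for index in indices
        ]
    }


body = filtered_alias_actions(
    alias="user-42",                              # hypothetical alias name
    indices=["posts-000001", "posts-000002"],     # hypothetical index names
    user_field="user_id",                         # hypothetical field
    user_value=42,
)
print(json.dumps(body, indent=2))
```

Each time a new index is created, one more "add" action for the alias would be needed so that the user's searches keep covering all indices.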
The other question with this idea: is there a difference between setting 1 shard per index and setting e.g. 5 shards?
Or is there a disadvantage to creating many indices with one shard each, compared to one fifth as many indices with 5 shards each?
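To frame the comparison, a quick count of what each layout produces per month, assuming an index is closed as soon as every primary shard reaches 50 GB (an idealized assumption). It only counts indices and shards and says nothing about per-index or per-shard overhead, which is what the question is really about.

```python
from fractions import Fraction


def layout(shards_per_index, monthly_data_gb=150, max_shard_size_gb=50):
    """Return (indices per month, primary shards per month) for a layout."""
    index_capacity_gb = shards_per_index * max_shard_size_gb
    indices_per_month = Fraction(monthly_data_gb, index_capacity_gb)
    primaries_per_month = indices_per_month * shards_per_index
    return indices_per_month, primaries_per_month


one_shard = layout(1)    # many small indices
five_shards = layout(5)  # one fifth as many indices, 5 shards each

print("1 shard/index: ", one_shard)
print("5 shards/index:", five_shards)
```

Under these assumptions the number of primary shards is the same in both cases; only the number of indices differs by a factor of 5.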