I'm trying to transition from a one-index-per-user setup (which was killing me because of RAM usage) to a many-users-per-index setup. So routing is something I want to make sure I understand ... this article is very helpful, except it's explicitly warning not to trust it!!
I studied the routing concept in the above-mentioned link, but my need is quite different. I have three types of logs: Prepaid, Postpaid and CDMA. The traffic is fairly high (1200 per second from each type). I mostly search by type, which is why I need routing. But I can't allocate only one shard per type as the manual does. I want to allocate 5 shards to Postpaid and route all Postpaid traffic to those 5 shards, and likewise for the other types. The routing is based on the file path:
/home/ES/POSTPAID/SMSCDR_POSTPAID_160518010000_10.80.41.70_RS1.log
/home/ES/CDMA/SMSCDR_DEL_ATTEMPT_160518000000_10.80.41.68_RS3_1.log
/home/ES/PREPAID/SMSCDR_PREPAID_160518220000_10.80.41.88_RS10.log
In short: allocate 5 shards to each type.
Note: I can't maintain three separate indices for these three types. My indices are organized by operator.
I don't believe you have any choice other than building one index per type: prepaid, postpaid and cdma. A routing value maps to exactly one shard, so routing alone cannot pin a type to a group of 5 shards.
On the other hand, you can send all documents for the same operator to the same shard using routing.
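To make the suggestion concrete, here is a minimal Python sketch of how the type could be derived from the log path and turned into a per-type index name. The function names, the `operator-type` naming scheme and the operator value `opx` are hypothetical, chosen only for illustration; the one Elasticsearch-specific piece is the `number_of_shards` index setting, which you would pass when creating each index.

```python
from pathlib import PurePosixPath

# The three log types named in the question.
LOG_TYPES = {"PREPAID", "POSTPAID", "CDMA"}

# Body for the create-index request: 5 primary shards per type index.
INDEX_SETTINGS = {"settings": {"number_of_shards": 5}}


def log_type_from_path(path: str) -> str:
    """Return the log type, read from the directory that holds the log file."""
    log_type = PurePosixPath(path).parent.name.upper()
    if log_type not in LOG_TYPES:
        raise ValueError(f"unknown log type in path: {path}")
    return log_type


def index_name(operator: str, path: str) -> str:
    """Build a hypothetical '<operator>-<type>' index name (lowercase)."""
    return f"{operator.lower()}-{log_type_from_path(path).lower()}"


if __name__ == "__main__":
    path = "/home/ES/POSTPAID/SMSCDR_POSTPAID_160518010000_10.80.41.70_RS1.log"
    # 'opx' is a placeholder operator name.
    print(index_name("opx", path), INDEX_SETTINGS)
```

With this layout a search on one type only touches that type's 5 shards. If you also want all documents of one operator on a single shard inside a shared index, you would pass the operator as the `routing` value on both index and search requests.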