We have an index that is growing by around 500 TB per week.
Currently, the index is 2 TB in size with 3 replicas, and indexing a single 750 MB document takes around 20-30 minutes. A large backlog of files waiting to be uploaded has piled up, and we are unable to catch up.
We have a 10-node cluster (Windows Azure VMs) with 4 data nodes, 3 master nodes, and 3 client nodes. Each data node has 56 GB of RAM and 8 cores.
What we really want to find out is whether daily, weekly, or monthly indexes would be a better option than a single huge index.
If we use smaller indexes, will maintaining them become an issue over the long term? If so, what sort of challenges can we expect?
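
To make the question concrete, here is a minimal sketch of what we mean by daily, weekly, and monthly indexes: each document would be routed to an index whose name is derived from its date. The `docs` prefix, the `index_name` helper, and the naming patterns are hypothetical and only illustrate the idea; they are not what we run today.

```python
from datetime import date


def index_name(prefix: str, day: date, granularity: str) -> str:
    """Return a time-based index name for the given date.

    The prefix and the name patterns are illustrative assumptions,
    not a convention required by the search engine.
    """
    if granularity == "daily":
        return f"{prefix}-{day:%Y.%m.%d}"
    if granularity == "weekly":
        # Use the ISO year and week so weeks that span New Year stay consistent.
        iso_year, iso_week, _ = day.isocalendar()
        return f"{prefix}-{iso_year}.w{iso_week:02d}"
    if granularity == "monthly":
        return f"{prefix}-{day:%Y.%m}"
    raise ValueError(f"unknown granularity: {granularity!r}")


if __name__ == "__main__":
    today = date.today()
    for granularity in ("daily", "weekly", "monthly"):
        print(granularity, index_name("docs", today, granularity))
```

With daily names we would end up with roughly 365 indexes per year, with weekly about 52, and with monthly 12, which is the maintenance trade-off we are asking about.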