I want to index 30 days of data, and each day brings several GB of data.
I have 30 days of data and 10 shards. I want to route it so that the first 3 days of data go to the first shard, the next 3 days go to the next shard, and so on.
I am routing on the date field, but I found that data for days 1, 10, and 17 goes to the first shard, data for days 2, 11, ... goes to the next shard, and so on.
My application receives data continuously (TBs of it). I have to index every file, and the GUI has to display the matching data.
After 90 days I will delete the first day's data and store day 91's data in its place. For this I am using 11 machines: 1 master and 10 slaves. The idea is to store the first 10 days of data in one shard, days 10-20 in another shard, and so on, so that when a user asks for data between two dates I can send the query to the specific shard using routing. Indexing and searching are easy with this approach, but I am unable to route the first 10 days of data into one shard.
I also need faster queries, because over 90 days the data may grow to 300-400 TB.
So I want to divide the data by date, so that when a query is fired I can search in only 30 TB.
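To make the grouping I am after concrete, here is a minimal sketch in Python of the date-to-bucket mapping. The names `bucket_for`, `buckets_for_range`, `RETENTION_DAYS`, and `DAYS_PER_BUCKET` are hypothetical and only for illustration; the 90-day window and 10 days per bucket are the numbers from above. The sketch only computes a bucket key to use as the routing value. Whether each key then lands on its own shard depends on how the indexing system maps routing values to shards (if it hashes them, different keys may still share a shard), which is exactly the part I cannot get to work.

```python
from datetime import date, timedelta

RETENTION_DAYS = 90    # assumption: data is kept for 90 days, then overwritten
DAYS_PER_BUCKET = 10   # assumption: 10 consecutive days share one bucket


def bucket_for(day: date, start: date) -> int:
    """Return the bucket key for `day`, counting days from `start`.

    Day 91 wraps around to the slot of day 1, matching the rolling
    90-day retention described above.
    """
    offset = (day - start).days % RETENTION_DAYS
    return offset // DAYS_PER_BUCKET


def buckets_for_range(first: date, last: date, start: date) -> list[int]:
    """Return the sorted bucket keys a query for [first, last] must touch."""
    keys = set()
    current = first
    while current <= last:
        keys.add(bucket_for(current, start))
        current += timedelta(days=1)
    return sorted(keys)


if __name__ == "__main__":
    start = date(2024, 1, 1)  # hypothetical first day of indexing
    print(bucket_for(start, start))
    print(buckets_for_range(start + timedelta(days=5),
                            start + timedelta(days=25), start))
```

At index time the value of `bucket_for(...)` would be passed as the routing value, and at query time the list from `buckets_for_range(...)` would be passed, so that only the buckets covering the requested dates are searched.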