I have millions of metrics for ~100 sources (the number of sources will grow in the future).
A query is always for a single source.
I'm already splitting the indices per day.
I'm trying to decide between multiple indices (one per source) and a single index with the source as a term in the document (in which case all queries will use a term filter on the source id).
Is there a performance difference between the two options, at index time or at query time?
The cluster is write-heavy: thousands of index requests (which translate to tens of bulk requests) per second.
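
To make the two options concrete, here is a rough sketch of how a query for one source on one day would be targeted in each layout. The index naming pattern (`metrics-...`) and the field name `source_id` are made up for illustration, and the `bool`/`filter` form assumes an Elasticsearch version that supports it (older versions expressed the same thing with a filtered query). It only builds the index name and request body; it does not talk to a cluster.

```python
from datetime import date


def per_source_target(source_id, day):
    """Option 1: one index per source per day.

    The index name itself selects the source, so the query body
    needs no source filter. (Naming pattern is hypothetical.)
    """
    index = f"metrics-{source_id}-{day:%Y.%m.%d}"
    body = {"query": {"match_all": {}}}
    return index, body


def single_index_target(source_id, day):
    """Option 2: one shared index per day.

    Every query carries a term filter on the source id.
    (`source_id` is a hypothetical field name.)
    """
    index = f"metrics-{day:%Y.%m.%d}"
    body = {
        "query": {
            "bool": {
                "filter": [{"term": {"source_id": source_id}}]
            }
        }
    }
    return index, body


if __name__ == "__main__":
    day = date(2015, 3, 1)
    print(per_source_target("src42", day))
    print(single_index_target("src42", day))
```

With ~100 sources and daily indices, option 1 creates about 100 indices per day, while option 2 creates one per day and relies on the filter.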
Thank you @dadoonet.
All docs are of the same type and have the same fields, so I guess the answer is a single index.
Is this answer based on better maintainability, or are there performance considerations as well?