We've been testing Elasticsearch with our application and are really
enjoying its impressive performance and rock-solid stability. I have a design
question about the most efficient way to index (1) a large set
of highly active data and (2) an even larger set of archived
data. Basically, we have tens of millions of documents: about 80% of them
are in an archived state, and the remaining 10-20% account for 95% of reads and updates.
My question is this: would it be more efficient to store the archived
documents in a separate type like "/index/mydata_arch", or to just use a
filtered query to cache the results and flag the archived documents as we
We are working on setting up benchmarks to test this ourselves in a
real-world environment, but I wanted to ask the experts here too and see if
you have any input.
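
To make the two options concrete, here is a rough sketch of the requests
being compared. It only builds the search paths and JSON bodies and does not
talk to a cluster. The active type name "mydata" and the boolean field
"archived" are hypothetical names chosen for illustration, and the
"filtered" query is the syntax of older Elasticsearch releases (it was
removed in 5.0, where a "bool" query with a "filter" clause plays the same
role).

```python
import json

INDEX = "index"               # index name from the example path in the question
ACTIVE_TYPE = "mydata"        # hypothetical name for the active type
ARCHIVE_TYPE = "mydata_arch"  # archive type from the example path


def separate_type_search(query, archived=False):
    """Option 1: archived documents live in their own type."""
    doc_type = ARCHIVE_TYPE if archived else ACTIVE_TYPE
    path = "/{}/{}/_search".format(INDEX, doc_type)
    return path, {"query": query}


def flagged_search(query, archived=False):
    """Option 2: one type, narrowed by a term filter on a flag field."""
    path = "/{}/{}/_search".format(INDEX, ACTIVE_TYPE)
    body = {
        "query": {
            "filtered": {
                "query": query,
                # "archived" is a hypothetical boolean field on each document
                "filter": {"term": {"archived": archived}},
            }
        }
    }
    return path, body


if __name__ == "__main__":
    match_all = {"match_all": {}}
    for path, body in (separate_type_search(match_all), flagged_search(match_all)):
        print("POST", path)
        print(json.dumps(body, indent=2))
```

In option 1 the active/archive split is expressed in the URL, while in
option 2 every search carries the same filter on the flag field.
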
Thanks so much for your help!