Large index design question

Sky_Stebnicki1 · April 26, 2012, 5:19pm

Sorry for my late reply; for some reason I was not notified when you
posted. Documents can move from active to archived at any time, so we will
know where a document belongs when it is indexed or updated. In our
database we split the records into two separate tables for performance
reasons.
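
For illustration only: if the data were split across two indices, moving a document from active to archived could be sketched as one bulk request that writes the document to the archive index and deletes it from the active one. The index names `mydata_active` and `mydata_archive` and the helper `move_to_archive_bulk` are hypothetical, and the action lines assume a typeless bulk format (older releases also expect a `_type` in each action). A bulk request is not transactional, so this is a sketch rather than a safe migration routine.

```python
import json

# Hypothetical index names, used only for this sketch.
ACTIVE_INDEX = "mydata_active"
ARCHIVE_INDEX = "mydata_archive"


def move_to_archive_bulk(doc_id: str, source: dict) -> str:
    """Build a newline-delimited body for POST /_bulk that moves one document.

    The document is indexed into the archive index first and then deleted
    from the active index. The two actions are not atomic.
    """
    lines = [
        {"index": {"_index": ARCHIVE_INDEX, "_id": doc_id}},
        source,
        {"delete": {"_index": ACTIVE_INDEX, "_id": doc_id}},
    ]
    # The bulk API expects one JSON object per line and a trailing newline.
    return "\n".join(json.dumps(line) for line in lines) + "\n"


if __name__ == "__main__":
    print(move_to_archive_bulk("42", {"title": "old invoice"}), end="")
```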

So your suggestion is to create a completely separate index or type? Sorry,
I'm fairly new to the Elasticsearch terminology...

Thanks,

Sky

On Tuesday, April 24, 2012 8:40:37 AM UTC-7, Berkay Mollamustafaoglu wrote:

I'd consider using separate indices and aliases. Keeping the active index
smaller would help with the performance. Will you know before you index
which docs are archived data and which are active?
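
A minimal sketch of the separate-indices-plus-alias idea, assuming two hypothetical indices (`mydata_active`, `mydata_archive`) and a hypothetical alias (`mydata_all`) that spans both. It only builds the JSON body for the `_aliases` endpoint and picks a write target; sending the request is left out.

```python
import json
from typing import Dict, List

# Hypothetical index and alias names, chosen only for illustration.
ACTIVE_INDEX = "mydata_active"
ARCHIVE_INDEX = "mydata_archive"
ALL_ALIAS = "mydata_all"


def target_index(archived: bool) -> str:
    """Pick the index a document should be written to."""
    return ARCHIVE_INDEX if archived else ACTIVE_INDEX


def alias_actions(alias: str, indices: List[str]) -> Dict:
    """Build a body for POST /_aliases that adds `alias` to each index."""
    return {"actions": [{"add": {"index": name, "alias": alias}} for name in indices]}


if __name__ == "__main__":
    body = alias_actions(ALL_ALIAS, [ACTIVE_INDEX, ARCHIVE_INDEX])
    print(json.dumps(body, indent=2))
```

With this layout, everyday searches would go to the small active index directly, while searches that need everything would go through the alias.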

Regards,
Berkay Mollamustafaoglu
mberkay on yahoo, google and skype

On Tue, Apr 24, 2012 at 10:21 AM, Sky Stebnicki sky.stebnicki@gmail.com wrote:

Hi all,

We've been testing elasticsearch with our application and are really
enjoying impressive performance and rock-solid stability. I have a design
question about what would be the most efficient way to index (1) a large set
of highly active data and (2) an even larger set of archived
data. Basically we have tens of millions of documents, but about 80% of them
are in an archived state and the other 10-20% are read/updated 95% of the time.

My question is this: would it be more efficient to store the archived
documents in a separate type like "/index/mydata_arch", or to just flag the
archived documents as we index them and use a filtered query to cache the
results?
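
For comparison, the flag-and-filter option could look roughly like the sketch below. It assumes a hypothetical boolean field named `archived` set at index time, and it only builds the search body. The `filtered` form matches the query type named in the question; later Elasticsearch releases dropped it in favor of a `bool` query with a `filter` clause, so the helper can produce either shape.

```python
import json
from typing import Dict


def active_only_query(text: str, legacy_filtered: bool = True) -> Dict:
    """Build a search body that matches `text` but skips archived documents.

    `archived` is a hypothetical boolean field on each document.
    """
    text_query = {"query_string": {"query": text}}
    not_archived = {"term": {"archived": False}}
    if legacy_filtered:
        # Older form: a "filtered" query wrapping the query and the filter.
        return {"query": {"filtered": {"query": text_query, "filter": not_archived}}}
    # Newer form: the filter moves into a bool query.
    return {"query": {"bool": {"must": [text_query], "filter": [not_archived]}}}


if __name__ == "__main__":
    print(json.dumps(active_only_query("invoice"), indent=2))
```

Either way, every document still lives in one index, so the index itself stays as large as the full data set even when the filter result is cached.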

We are working on setting up benchmarks to test this ourselves in a
real-world environment, but I wanted to ask the experts here too and see if
you had any input.

Thanks so much for your help!

Sky
