Rolling indexes or _ttl


(Ashish Nigam-3) #1

Hi,
I am working on a multi-tenant application. I have to create separate
indexes for each tenant, and I need to keep data in the index for 3 months.

I think there are two ways to implement this requirement.

  1. Use _ttl to automatically delete data from the index. This would mean
    creating just one index per tenant, with 4 shards and 4 replicas.
  2. Create an index for each month. This would mean creating at least 4
    indexes per tenant to cover the 3-month requirement. I would then
    maintain one index alias that lets me search across all 4 indexes. With
    this strategy, I would insert data only into the current month's index.
    At the end of the month, I would create a new index for the next month
    and delete the oldest one. If I implement rolling indexes, I am thinking
    of creating each index with just one shard and one replica.

Which strategy is better in this case, _ttl or rolling indexes?
If rolling indexes are preferable, it would be good to know the
disadvantages of the _ttl strategy.
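
For comparison, the shard counts implied by the two options can be tallied with a quick sketch. It assumes that "4 replicas" and "one replica" refer to the number of replica copies per primary shard (the number_of_replicas setting); the function name is made up for illustration.

```python
def shard_copies(indexes, primaries, replicas):
    """Total shard copies per tenant: every primary plus its replica copies."""
    return indexes * primaries * (1 + replicas)


# Option 1: one index, 4 primary shards, 4 replicas of each (assumed meaning).
ttl_option = shard_copies(indexes=1, primaries=4, replicas=4)

# Option 2: 4 monthly indexes, each with 1 primary shard and 1 replica.
rolling_option = shard_copies(indexes=4, primaries=1, replicas=1)

print(ttl_option, rolling_option)  # 20 8
```

Under that reading, option 1 keeps 20 shard copies per tenant and option 2 keeps 8.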

Thanks
Ashish

--


(David Pilato) #2

I prefer option 2.

Dropping an index just removes the index directory, which is quick and frees the disk space.
Using TTL is like deleting documents one by one. Each delete effectively writes a new, empty version of the document, and that still takes some space.

When ES optimizes the index, the deleted documents are removed and the space is released. You can also call the optimize API yourself.

So, IMHO, option 2 is the more efficient way to do it.
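
As a rough illustration of the month-end step in option 2, here is a minimal sketch that only plans the rollover: which index to create, which to drop, and the body for the `_aliases` endpoint (add/remove actions). The `<tenant>-YYYY.MM` naming scheme, the `<tenant>-search` alias name, and the function names are assumptions made for the example, not something from this thread. It does not talk to a cluster.

```python
from datetime import date


def month_index(tenant, year, month):
    """Hypothetical naming scheme: <tenant>-YYYY.MM."""
    return f"{tenant}-{year:04d}.{month:02d}"


def shift_month(year, month, delta):
    """Move (year, month) by delta months, handling year boundaries."""
    total = year * 12 + (month - 1) + delta
    return total // 12, total % 12 + 1


def rollover_plan(tenant, today, keep=4):
    """Plan the month-end rollover for one tenant.

    keep is the number of monthly indexes behind the alias
    (current month plus the 3 previous ones by default).
    """
    next_year, next_month = shift_month(today.year, today.month, 1)
    old_year, old_month = shift_month(today.year, today.month, -(keep - 1))
    new_index = month_index(tenant, next_year, next_month)
    old_index = month_index(tenant, old_year, old_month)
    alias = f"{tenant}-search"  # hypothetical alias name
    return {
        "create": new_index,   # create this index first
        "delete": old_index,   # drop this index after the alias update
        "alias_actions": {     # request body for the _aliases endpoint
            "actions": [
                {"add": {"index": new_index, "alias": alias}},
                {"remove": {"index": old_index, "alias": alias}},
            ]
        },
    }


plan = rollover_plan("acme", date(2012, 10, 6))
print(plan["create"], plan["delete"])  # acme-2012.11 acme-2012.07
```

The intended order would be: create the new index, send the alias actions, then delete the old index, so searches through the alias never point at a missing index.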

HTH

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 6 Oct 2012 at 23:17, Ashish Nigam ashnigamtech@gmail.com wrote:


--

