Rapidly create, query, and delete or purge hundreds of small indices


My use case requires me to index small volumes of varied data (fewer than 500 documents, each with about 20 fields) as separate indices, query them, and then immediately destroy them or delete their data.

The following steps occur back-to-back:
Step 1 (one-time setup, or every time, is OK): Create an index/mapping with a dynamic set of 10-30 fields.
Step 2: Index a few hundred documents or fewer.
Step 3: Run some ad hoc queries and aggregations after an on-demand refresh.
Step 4 (optional): Destroy the index for later recreation, or delete its data.
All of this happens within a minute or two.
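The four steps above might look roughly like this in Kibana Dev Tools request syntax (the index name, field names, and shard settings here are illustrative assumptions, not something from the post itself):

```
# Step 1: create a small index; one shard and no replicas
# keeps per-index overhead low for tiny datasets
PUT transient-idx-0001
{
  "settings": { "number_of_shards": 1, "number_of_replicas": 0 },
  "mappings": {
    "properties": {
      "field_a": { "type": "keyword" },
      "field_b": { "type": "double" }
    }
  }
}

# Step 2: bulk-index the documents in a single request
POST transient-idx-0001/_bulk
{ "index": {} }
{ "field_a": "x", "field_b": 1.5 }
{ "index": {} }
{ "field_a": "y", "field_b": 2.5 }

# Step 3: refresh on demand so the documents are searchable, then query
POST transient-idx-0001/_refresh
GET transient-idx-0001/_search
{ "aggs": { "by_a": { "terms": { "field": "field_a" } } } }

# Step 4 (optional): destroy the index
DELETE transient-idx-0001
```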

My question: I need to run these four steps for hundreds of indices/mappings (created dynamically) over the course of a day, with at least 10 of them running in parallel at any time. What factors should I consider regarding sharding, resource allocation, and index creation?

Any input you can provide is truly appreciated.

Every time you create or delete an index, or simply alter its mappings, the cluster state needs to be updated and propagated across the cluster. To ensure consistency, this is generally done in a single thread, which may very well become the bottleneck. This is an unusual way to work with Elasticsearch, so I would recommend testing/benchmarking it.
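If cluster-state updates do turn out to be the bottleneck, one option worth benchmarking (an assumption on my part, not something the reply above prescribes) is to keep each index and its mapping in place between runs and only remove the documents, since document operations do not change the cluster state. A sketch, reusing a hypothetical index name:

```
# Remove all documents but keep the index and its mapping,
# so no cluster state update is triggered between runs
POST transient-idx-0001/_delete_by_query
{ "query": { "match_all": {} } }
```

The trade-off is that delete-by-query is more expensive per document than simply dropping the index, so which approach wins depends on document counts and how often the mappings actually change.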

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.