How should I choose the number of shards for a new cluster?

I'll soon start a new cluster on the latest 6.x release. Here are the key points:

  • one index
  • will use two "types" because I need the join capability
  • will have ~22 million parent and ~240 million child documents
  • parent documents have more fields and will be "bigger" in general
  • the overall size of the index will be ~300GB initially
  • number of nodes isn't yet decided
  • I would classify the growth as "slow but linear"
  • writes happen multiple times per second, and updates to existing documents will likely be about as frequent as insertions of completely new documents
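
For context, this is roughly the index body I have in mind, as a minimal sketch. The type name `_doc`, the join field name `relation`, the relation names `parent`/`child`, and the shard count passed in are all placeholders, not decisions; the shard count is exactly the part I'm unsure about.

```python
import json


def build_index_body(number_of_shards, number_of_replicas=1):
    """Build a create-index request body for a 6.x index with a join field.

    All names here (type `_doc`, field `relation`, relations
    `parent`/`child`) are hypothetical placeholders.
    """
    return {
        "settings": {
            "number_of_shards": number_of_shards,
            "number_of_replicas": number_of_replicas,
        },
        "mappings": {
            "_doc": {
                "properties": {
                    # The join field links each child document to its parent.
                    "relation": {
                        "type": "join",
                        "relations": {"parent": "child"},
                    }
                }
            }
        },
    }


if __name__ == "__main__":
    # 10 is only an example value, not a recommendation.
    print(json.dumps(build_index_body(number_of_shards=10), indent=2))
```

Since children have to live on the same shard as their parent, the shard count also determines how parent/child families are spread out.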

I understand I will need to run benchmarks anyway, but up front I'd like to get a feeling for what an appropriate shard count and shard size could be. The default of 5 shards doesn't sound good to me, since it would mean a single shard size of ~60GB.

Any suggestions on where to start? Are 50 shards à 6GB too many? I understand more shards can speed up indexing but will slow down searching, since a search has to wait on the results of all 50 shards.
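
To put rough numbers on the options, here is a back-of-the-envelope sketch. It assumes documents spread evenly across primary shards, which may not hold exactly because children are routed to their parent's shard. The function name and the candidate shard counts are only for illustration.

```python
TOTAL_INDEX_GB = 300        # initial index size
PARENT_DOCS = 22_000_000    # ~22 million parents
CHILD_DOCS = 240_000_000    # ~240 million children


def shard_layout(primary_shards,
                 total_gb=TOTAL_INDEX_GB,
                 total_docs=PARENT_DOCS + CHILD_DOCS):
    """Estimate per-shard size and doc count, assuming an even spread."""
    return {
        "primary_shards": primary_shards,
        "gb_per_shard": total_gb / primary_shards,
        "docs_per_shard": total_docs // primary_shards,
    }


if __name__ == "__main__":
    # Candidate counts are arbitrary starting points for benchmarking.
    for n in (5, 10, 20, 50):
        layout = shard_layout(n)
        print(f"{n:>3} shards: {layout['gb_per_shard']:.1f} GB and "
              f"{layout['docs_per_shard']:,} docs per shard")
```

This only covers the initial ~300GB; with slow but linear growth the per-shard size will creep up over time for any fixed shard count.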

If someone has experience with a similarly sized setup and would like to share their insights, that would be great.

Thanks,

Markus
