Capacity Planning with ElasticSearch


(geekox86) #1

Greetings all,

A few questions to help us with capacity planning:

  1. What is the memory footprint of an empty ElasticSearch index consisting of a single shard and zero replicas?
  2. Is there a difference in footprint between 1 empty index with 1000 shards and 1000 empty indexes with 1 shard each (both with 0 replicas)?
  3. Is it practical to create an ElasticSearch index for each user if you have millions of users, each with thousands to a few million small JSON documents?
  4. Does ElasticSearch require an index shard to be fully loaded into memory (not paged) during querying?
  5. Does using multiple types in a single index perform differently at query time than using a separate index per type?

Kind regards

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/72a5f036-c039-43ba-a2ab-c1b84e3c47ab%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(geekox86) #2

Can anyone experienced with ES answer this?



(Mark Walkom) #3

1 and 2 - It'd probably be easiest to try this yourself :)
3 - not really, you should look into routing.
4 - only the index metadata is stored in memory. However, doing aggregations
will pull the applicable data into memory.
5 - not sure.
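
To illustrate the routing suggestion in point 3, here is a minimal sketch. The idea is that all of one user's documents carry the same routing value, so they land on a single shard of a shared index and a search for that user only has to touch that shard. The `routing` request parameter is a real Elasticsearch feature; everything else is an assumption made for illustration: the function names, the `user_id` field, and the use of CRC32, which only stands in for whatever hash Elasticsearch applies internally.

```python
import zlib


def shard_for(routing_key: str, num_primary_shards: int) -> int:
    """Pick a shard for a routing key.

    Illustrative only: CRC32 stands in for Elasticsearch's internal hash.
    """
    if num_primary_shards < 1:
        raise ValueError("num_primary_shards must be at least 1")
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


def routed_search(index: str, user_id: str) -> dict:
    """Describe a search that is routed to, and filtered by, one user."""
    return {
        "method": "GET",
        "path": f"/{index}/_search",
        "params": {"routing": user_id},
        # "user_id" is a hypothetical field name. Several users can share
        # a shard, so the filter is still needed alongside the routing value.
        "body": {"query": {"term": {"user_id": user_id}}},
    }
```

The same routing value has to be supplied when indexing a document and when searching for it, otherwise the search may look at the wrong shard.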

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com



(Tim Uckun) #4

To follow up on this...

As a general rule, is it better to have one horse-sized index or a hundred
duck-sized indices? I am thinking about those types of searches where you
might frequently search a subset of the data. For example keeping a
separate index for every customer because normally the app restricts itself
to only dealing with one customer at a time. Perhaps doing a compound
split based on customer and year if your searches rarely go outside of the
current year.

Thanks.
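
For readers weighing the customer-and-year split described in this post, a small sketch of what the naming side could look like. The `customer-<id>-<year>` pattern and both function names are hypothetical, not an Elasticsearch convention. The sketch relies on two things Elasticsearch does support: index names must be lowercase, and a search can target several indices by listing them comma-separated in the URL.

```python
from datetime import date
from typing import List


def index_name(customer_id: str, day: date) -> str:
    """Build a per-customer, per-year index name (hypothetical pattern)."""
    # Index names must be lowercase, so normalise the customer id.
    return f"customer-{customer_id.lower()}-{day.year}"


def search_path(customer_id: str, first_year: int, last_year: int) -> str:
    """Build a search path covering one customer's indices for a year range."""
    names: List[str] = [
        index_name(customer_id, date(year, 1, 1))
        for year in range(first_year, last_year + 1)
    ]
    return "/" + ",".join(names) + "/_search"
```

A search limited to the current year hits one small index, while the occasional multi-year search simply lists more indices in the path.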



(Mark Walkom) #5

It depends on your data set, your queries, and your cluster specs. Having tens
to hundreds of thousands (or millions) of indexes will have a performance
impact that only grows with the number, so the lower you can keep it
through planning, the better. On the other hand, the bigger your indexes,
the longer they take to query, and the less agility you have to
manipulate them ;)
Which is why the answer to a lot of this sort of thing is: it depends.
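
As a rough way to see how quickly the numbers grow, the total shard count a cluster has to manage can be estimated with simple arithmetic. This is a back-of-the-envelope sketch, not a sizing formula: it only counts shards and says nothing about the memory each one needs, which is best measured on your own cluster. The function name is made up for illustration.

```python
def total_shards(num_indices: int, primaries_per_index: int, replicas: int) -> int:
    """Count every shard copy in the cluster.

    Each replica is a full extra copy of every primary shard.
    """
    return num_indices * primaries_per_index * (1 + replicas)


# One index per user, for a million users, one shard each, no replicas.
per_user = total_shards(1_000_000, 1, 0)

# One shared index with 1000 shards and no replicas.
shared = total_shards(1, 1000, 0)

# Adding a single replica doubles the number of shard copies.
per_user_with_replica = total_shards(1_000_000, 1, 1)
```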

As an example of planning, it might be better to think ahead if you are
aiming for such large sizes and give your app the ability to talk to
multiple clusters, which will allow you to move customers into high/low
performance/capacity clusters.
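
One way the multi-cluster idea could look in application code is a lookup from customer to cluster, kept outside Elasticsearch. Everything in this sketch is hypothetical: the tier names, the URLs, the customer ids, and the function name. In practice the mapping would more likely live in a database or config service than in a module-level dict.

```python
# Hypothetical cluster endpoints, one per performance tier.
CLUSTERS = {
    "high": "http://es-high.example.internal:9200",
    "low": "http://es-low.example.internal:9200",
}

# Hypothetical assignment of customers to tiers.
CUSTOMER_TIER = {
    "acme": "high",
    "smallco": "low",
}


def cluster_url_for(customer_id: str, default_tier: str = "low") -> str:
    """Return the base URL of the cluster that holds this customer's data."""
    tier = CUSTOMER_TIER.get(customer_id, default_tier)
    return CLUSTERS[tier]
```

Moving a customer between clusters then means copying their data and changing one entry in the mapping, without touching query code.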

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com


