Storage vs index size, memory footprint etc.

Hello,

we are evaluating ES for production use and are trying to prepare a
capacity estimate. While doing that I realized that some terms and
internal mechanics are not clear to me, and I wasn't able to find a
relevant blog post or SO question that would clarify them.

  • Storage size vs index size

    • the difference between those is clear; what is not clear is where to find
      them, e.g. in the head or HQ plugin. There I see only primary and total size,
      which is clear: Total = Primary size * (1 + number of replicas). The reason I am
      asking is that, if I understand correctly, the index should ideally be kept in
      memory to ensure optimal performance, while the stored data is fine when
      offloaded to disk. For our production use we would need 40TB of
      "Primary size", as the HQ plugin reports it. If we had to keep that in memory
      using 68GB servers, we would end up with 40TB/68GB (about 600) machines in the
      cluster, which would be horrible, so there is certainly a flaw in my
      understanding. So the elementary question is: where can I find the index size
      and the storage size via REST, plugins, etc.?
  • OLAP-style queries (slicing, dicing, aggregations, etc.)

    • we are required to run OLAP/analytics-style queries on this data.
      Correct me if I am wrong, but I expect that all fields we are going
      to query or aggregate on have to be indexed. Or is it enough that they are stored?

I hope the questions make sense.
Jakub

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/82563a28-2598-4d93-9820-895ae5a591fa%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

You don't need to keep the entire index in memory; that's not how ES works.
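As for where to read the sizes over REST, one option is the index stats API (`GET /_stats/store`), which reports on-disk store size per index for primaries and for primaries plus replicas. The following is a minimal sketch, not a definitive tool: the `http://localhost:9200` address, the `events` index name, and the numbers in `sample` are assumptions made up for illustration, and `fetch_store_stats` is only defined, not called.

```python
import json
from urllib.request import urlopen


def fetch_store_stats(base_url="http://localhost:9200"):
    """Fetch store statistics for all indices.

    base_url is an assumption; point it at one of your own nodes.
    """
    with urlopen(base_url + "/_stats/store") as resp:
        return json.load(resp)


def summarize_store_sizes(stats):
    """Return {index_name: (primary_bytes, total_bytes)} from a stats response."""
    sizes = {}
    for name, index_stats in stats.get("indices", {}).items():
        primary = index_stats["primaries"]["store"]["size_in_bytes"]
        total = index_stats["total"]["store"]["size_in_bytes"]
        sizes[name] = (primary, total)
    return sizes


# Hand-written sample in the shape of a stats response; the index name and
# the byte counts are invented for illustration only.
sample = {
    "indices": {
        "events": {
            "primaries": {"store": {"size_in_bytes": 1000}},
            "total": {"store": {"size_in_bytes": 2000}},
        }
    }
}

for name, (primary, total) in summarize_store_sizes(sample).items():
    print(name, primary, total)
```

Against a live cluster you would pass the result of `fetch_store_stats()` to `summarize_store_sizes` instead of `sample`. Note that these figures are disk usage, not a memory requirement.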

Regarding the second point: indexing a document's fields is what allows you
to search on them; storing a field means you can also return its value
when the document is found by a search.
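To make the index/store distinction concrete, here is a small sketch of a mapping expressed as a Python dict, with two helpers that list which fields are searchable and which are stored. The field names `country` and `note` are hypothetical, and the sketch uses the current field type names and boolean `index` values; the releases from the time of this thread spelled these differently (for example `"type": "string"` with `"index": "not_analyzed"`), so treat it as an illustration rather than a mapping to paste in.

```python
import json

# Hypothetical mapping: "country" can be searched but is not stored separately,
# "note" is stored for retrieval but cannot be searched.
mapping = {
    "properties": {
        "country": {"type": "keyword", "index": True, "store": False},
        "note": {"type": "text", "index": False, "store": True},
    }
}


def searchable_fields(mapping):
    """Fields with indexing enabled (ES indexes fields by default)."""
    return sorted(
        name
        for name, cfg in mapping["properties"].items()
        if cfg.get("index", True)
    )


def stored_fields(mapping):
    """Fields explicitly marked as stored (ES does not store fields by default)."""
    return sorted(
        name
        for name, cfg in mapping["properties"].items()
        if cfg.get("store", False)
    )


print(json.dumps(mapping, indent=2))
print("searchable:", searchable_fields(mapping))
print("stored:", stored_fields(mapping))
```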

On 27 April 2015 at 23:23, Jakub Stransky stransky.ja@gmail.com wrote:

[...]