I read somewhere that the maximum recommended shard size is about 50 GB.
What are the worst consequences when a shard is too large?
For some indices I have shards larger than 100 GB on a JVM with a fixed
29 GB heap. I suspect this is one factor that caused our OutOfMemory problem.
However, I don't know whether a large shard size results in memory issues. I
can only think of longer seek times.
Because you have to load data from the shard when you get a query, the
larger the shard, the more data you load, which can lead to OOM errors or
slower response times.
Smaller shards also make recovery and reallocation easier.
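To put the 50 GB rule of thumb from the question into numbers, here is a minimal sketch of the arithmetic. It assumes data spreads evenly across primary shards and treats 50 GB as a guideline rather than a hard limit; the function name `primary_shards_needed` is made up for illustration and is not part of any Elasticsearch API.

```python
import math


def primary_shards_needed(index_size_gb: float, max_shard_gb: float = 50.0) -> int:
    """Estimate how many primary shards keep each shard at or under max_shard_gb.

    Assumes documents are distributed evenly across shards, which is only
    approximately true in practice.
    """
    if max_shard_gb <= 0:
        raise ValueError("max_shard_gb must be positive")
    if index_size_gb <= 0:
        # An empty index still needs at least one primary shard.
        return 1
    return max(1, math.ceil(index_size_gb / max_shard_gb))


if __name__ == "__main__":
    # A 120 GB index split under the 50 GB guideline.
    print(primary_shards_needed(120))
```

Under these assumptions, an index currently held in a single shard of a bit over 100 GB would need at least three primaries to stay under the guideline.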
Thanks Mark,
I understand the benefit for recovery and allocation. But could you explain
a bit more about how much of a shard's data needs to be loaded into the heap
at once when running a query? I thought it is the segments that need to be
loaded and examined. If segment size is limited and the segments in a shard
are loaded one at a time (or do not all need to be loaded at once), then it
seems possible that a large shard does not necessarily increase the
memory footprint.
Regards,
Jack