Downsides of over-allocating shards?


(Otis Gospodnetić) #1

Hi,

Shard over-allocation was nicely covered in some recent threads.
Assuming one uses routing and filtering to limit queries to a specific
shard, can one over-allocate shards to the point of causing problems?

For example:

Say that you determine that having 20 shards in some cluster would
work well right now, but you want to plan for growth over the next 12
months, so you opt to go with 100 shards.
But what if you are super certain about growth or simply want to plan
for more than the next 12 months. Why not just choose, say, 1000
shards?

Is there anything that can go wrong? Any negative side effects one
should be aware of?

Thanks,
Otis
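The routing idea in the question can be sketched without a cluster: documents and queries that share a routing value land on a single shard, chosen as hash(routing) modulo the shard count. This is a minimal illustrative sketch; the `crc32` hash is a stand-in, not Elasticsearch's actual internal routing hash.

```python
import zlib

def shard_for(routing_value: str, num_shards: int) -> int:
    # Map a routing value onto one of num_shards shards.
    # zlib.crc32 is a stand-in; Elasticsearch uses its own hash internally.
    return zlib.crc32(routing_value.encode("utf-8")) % num_shards

# Over-allocate to 100 shards: a routed query for user "u42" still only
# needs to touch the one shard its routing value maps to.
num_shards = 100
print(shard_for("u42", num_shards))  # always the same shard for "u42"
```

The point of over-allocating is that this mapping stays cheap per query: no matter how many shards exist, a routed search hits exactly one of them.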


(Shay Banon) #2

Only the overhead a shard has, which is a full Lucene index (memory, file descriptors). It really depends on the HW you have and what you do with it.

On Wednesday, January 25, 2012 at 9:12 AM, Otis Gospodnetic wrote:

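Shay's point is that the per-shard overhead scales linearly with shard count, since each shard is a full Lucene index with its own segments, open files, and heap-resident structures. A back-of-the-envelope sketch, where every number is an illustrative assumption rather than a measurement:

```python
def cluster_overhead(primaries, replicas=1, segments_per_shard=30,
                     files_per_segment=10, heap_mb_per_shard=50):
    # Each shard is a full Lucene index, so file descriptors and heap
    # usage grow linearly with the total shard count.
    total_shards = primaries * (1 + replicas)
    open_files = total_shards * segments_per_shard * files_per_segment
    heap_mb = total_shards * heap_mb_per_shard
    return open_files, heap_mb

for n in (20, 100, 1000):
    files, heap = cluster_overhead(n)
    print(f"{n:5d} primaries -> ~{files:7d} open files, ~{heap:6d} MB heap")
```

Going from 20 to 1000 primaries multiplies every line item by 50, which is why "it depends on the HW": the math is trivial, but the budget it has to fit into is yours.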


(phobos182) #3

We are in the process of reversing a decision we made along these lines. The duplication of the TermInfoIndex had a sizable memory cost; reducing the number of shards eased memory pressure on the nodes, so more memory was available for caching.

IMHO there is a sizable memory cost in loading the index data into memory, so we try to use as few shards as possible while still having search results return in an acceptable time.


(phobos182) #4

One issue we recently ran into, Shay; I wanted to get your feedback. In our development environment we went 100% the other way: we decreased the number of indexes and shards, and increased our tiered merge policy's max segment size from 5 GB to 20 GB to reduce the number of segments in the system. This resulted in larger segments and de-duplicated a lot of the TermInfoIndex entries.

After doing this our memory looked great! We did not have to touch the TermIndex Divisor or anything like that. As an unintended side effect, though, search/query time went through the roof. Looking at some stack traces, a LOT of time was being spent in the searches themselves. Example:

org.apache.lucene.index.SegmentTermPositions.lazySkip(SegmentTermPositions.java:169)

It looks like a single-threaded operation (1 CPU). So my question is this: do we increase the number of shards to increase CPU concurrency, thus decreasing search time and making it more performant? For example:

2 servers, 12 CPUs each.

1 index, 24 shards evenly balanced:
A search operation will optimistically (if allocated fairly) use all 24 CPUs in the system. The cost in RAM increases quite a bit as TermInfos are duplicated: more segments, more stuff to load into memory.

1 index, 2 shards evenly balanced:
A search operation will use 2 CPUs in the system. RAM usage is reduced quite a bit, segments decrease, etc.

Is this correct?

Thanks!
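The tradeoff in the two layouts above can be simulated without a cluster: a search fans out one task per shard and merges the partial hits, so more shards means more cores busy per query, at the memory cost already described. A minimal sketch with in-memory lists standing in for shards (no Elasticsearch involved):

```python
from concurrent.futures import ThreadPoolExecutor

def search_shard(shard_docs, term):
    # Stand-in for a per-shard Lucene search: scan one shard's documents.
    return [d for d in shard_docs if term in d]

def search_index(shards, term, workers):
    # Fan the query out, one task per shard, then merge the partial hits.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(search_shard, shards, [term] * len(shards))
    return [hit for partial in partials for hit in partial]

docs = [f"doc {i} about {'shards' if i % 3 else 'memory'}" for i in range(24)]
many = [docs[i::24] for i in range(24)]  # 24 small shards: up to 24 cores busy
few = [docs[i::2] for i in range(2)]     # 2 large shards: at most 2 cores busy

# Same results either way; only the concurrency (and the RAM cost) differs.
print(sorted(search_index(many, "memory", 24)) ==
      sorted(search_index(few, "memory", 2)))
```

Both layouts return identical hits; the shard count only changes how many workers can make progress on one query at a time.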


(Otis Gospodnetić) #5

Hi,

On Feb 1, 1:27 pm, phobos182 phobos...@gmail.com wrote:

One issue we recently ran into, Shay; I wanted to get your feedback. In our
development environment we went 100% the other way: we decreased the number
of indexes and shards, and increased our tiered merge policy's max segment
size from 5 GB to 20 GB to reduce the number of segments in the system. This
resulted in larger segments and de-duplicated a lot of the TermInfoIndex
entries.

After doing this our memory looked great! We did not have to touch the
TermIndex Divisor or anything like that. As an unintended side effect,
though, search/query time went through the roof. Looking at some stack
traces, a LOT of time was being spent in the searches themselves. Example:

org.apache.lucene.index.SegmentTermPositions.lazySkip(SegmentTermPositions.java:169)

It looks like a single-threaded operation (1 CPU). So my question is this:
do we increase the number of shards to increase CPU concurrency, thus
decreasing search time and making it more performant? For example:

Yes. :)
It's faster for a 12-core box to search 12 smaller indices in parallel
than 1 large index. The total time for the former will be about 1/12
of the latter.
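That 1/12 figure follows from an idealized model in which the shards are equal-sized and every core is free. A sketch of the arithmetic, where the scan rate and corpus size are assumed numbers and real merge/coordination overhead is ignored:

```python
def search_time_s(total_docs, docs_per_sec, num_shards, cores):
    # Each shard is scanned by one thread; shards beyond the core count
    # queue up in extra "waves". Idealized: no merge or coordination cost.
    per_shard_s = (total_docs / num_shards) / docs_per_sec
    waves = -(-num_shards // cores)  # ceiling division
    return waves * per_shard_s

total_docs, rate = 12_000_000, 1_000_000  # assumed corpus and per-core rate
print(search_time_s(total_docs, rate, 1, 12))   # one big index: 12.0 s
print(search_time_s(total_docs, rate, 12, 12))  # 12 shards: 1.0 s, ~1/12
```

The model also shows there is no free lunch past the core count: 24 shards on 12 cores run in two waves and land back at the same total time, while still paying the per-shard memory overhead.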

2 servers, 12 CPUs each.

1 index, 24 shards evenly balanced:
A search operation will optimistically (if allocated fairly) use all 24
CPUs in the system. The cost in RAM increases quite a bit as TermInfos
are duplicated: more segments, more stuff to load into memory.

1 index, 2 shards evenly balanced:
A search operation will use 2 CPUs in the system. RAM usage is reduced
quite a bit, segments decrease, etc.

Is this correct?

Yes, I think so.

Otis

