Clarify difference between opening/closing an index and a shard becoming active/inactive?

Matt1 · November 29, 2011, 9:15am

I see in the logs that shards will become inactive "index-wise", and I was
wondering if anybody could explain in a little detail what that means? In
particular:

Is the shard still available for searching? I'm assuming that's what
the log message implies.

How long does it take for a shard to become active again?

Because (I'm assuming) it's still available for searching, does that mean
it's still in memory?

I can see that there's timer-based management, where by default a shard is
marked inactive after 30m, and this is configurable. Are there any other
more hands-on (programmatic) management tools? I combed the docs pretty
thoroughly but didn't see any.

What constitutes activity from this perspective? I'm assuming it's
calculated from the last time a document was indexed to that shard, but
I'm not sure.

What are the performance tradeoffs of making the inactivity timer
shorter or longer?

I tried to list specific questions, but they all point to a general lack of
knowledge about this part of ES, so I would love any kind of more general
explanation of the entire mechanism.

Cheers!
Matt

kimchy · November 29, 2011, 4:44pm

Are you asking about closing an index, and at what stage the shards exist
for it? If so, here is the answer; if not, please explain the background
better.

When you close an index, its shards take no resources on the cluster except
for disk space. Their data remains on the nodes, but they are not available
to search or index, and they use no other resources (memory, file handles, ...).

When you open an index, the first thing that happens is that the primary
shards are allocated to the nodes holding the latest version of their data.
A Lucene index is opened (which can take time), and the transaction log is
replayed. Then the replicas (if you have any) are allocated and synced
against the primaries (reusing existing nodes with the same data).

So, what is the cost of opening an index? The main cost is opening the
shard index files (Lucene-wise) and applying the transaction log. You
can't really change the time it takes to open the Lucene index (unless you
index less data / fewer fields), but you can flush the index before
closing it, so there won't be a need to replay the transaction log.
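
A minimal sketch of the flush-then-close sequence described above, in Python using only the standard library. The host `http://localhost:9200` and the index name `logs-2011-11` are assumptions for illustration; `_flush`, `_close` and `_open` are the index-level REST endpoints for the operations mentioned here. The helpers only build the requests, so nothing is sent until you pass them to `urllib.request.urlopen`.

```python
from urllib.request import Request


def close_index_requests(base_url, index):
    """Build the requests for a flush followed by a close, in order.

    Flushing first commits indexed data to the Lucene index, so there is
    little or no transaction log left to replay when the index is reopened.
    """
    base = base_url.rstrip("/")
    return [
        Request(f"{base}/{index}/_flush", method="POST"),
        Request(f"{base}/{index}/_close", method="POST"),
    ]


def open_index_request(base_url, index):
    """Build the request that reopens a closed index."""
    return Request(f"{base_url.rstrip('/')}/{index}/_open", method="POST")


if __name__ == "__main__":
    # Hypothetical host and index name; adjust for your cluster.
    for req in close_index_requests("http://localhost:9200", "logs-2011-11"):
        print(req.get_method(), req.full_url)
```

Keeping the flush and the close in one ordered list makes it harder to close an index without flushing it first, which is the step that saves the replay time on reopen.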

The "inactive" part that you see in the log is simply indexing buffer
management among shards within the same node, nothing more. A smaller
indexing buffer size is allocated for inactive shards (shards that have not
been indexed to for a long time).
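
To make the buffer management concrete, here is a toy model in Python. It is not Elasticsearch's actual code: the function name, the fixed size given to inactive shards, and the even split among active shards are all assumptions for illustration. Only the 30 minute default comes from this thread.

```python
def allocate_index_buffers(total_bytes, last_indexed, now,
                           inactive_after=30 * 60,
                           inactive_bytes=512 * 1024):
    """Split a node's indexing buffer among its shards (toy model).

    last_indexed maps shard name -> time (in seconds) of its last index
    operation. Shards idle for longer than inactive_after get a small
    fixed buffer (an assumed size); the remainder is divided evenly
    among the active shards.
    """
    inactive = {s for s, t in last_indexed.items() if now - t > inactive_after}
    active = [s for s in last_indexed if s not in inactive]

    buffers = {s: inactive_bytes for s in inactive}
    if active:
        remaining = max(total_bytes - inactive_bytes * len(inactive), 0)
        share = remaining // len(active)
        for s in active:
            buffers[s] = share
    return buffers
```

In this model a shard counts as active again as soon as its last-indexed time is refreshed, and nothing here touches search availability, which matches the point above that the flag concerns only the indexing buffer.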

-shay.banon


Matt1 · December 1, 2011, 5:35am

Got it. I think that mostly answers my question. Will keep browsing the
code. Cheers!
