Replicated shard doesn't rebuild


(JCD) #1

Hello everyone,

I'm new to Elasticsearch.
I have been experimenting with Logstash, Elasticsearch (with the head plugin)
and Kibana, using two nodes and one index split into 5 shards that is
replicated across the 2 nodes.

I'm running fairly brutal tests with the 0.19.9 .deb package on Debian.

I'm trying to verify that my replication works correctly.

First I created my first node.
I loaded data into it, and I can see folders 0 to 4 created under
/var/lib/elasticsearch/logstashcluster/nodes/0/indices/stash

Then I created my second node and had it join index "stash". I can see that
it also creates and replicates my shards in
/var/lib/elasticsearch/logstashcluster/nodes/0/indices/stash

If I brutally delete a shard with "rm -rf
/var/lib/elasticsearch/logstashcluster/nodes/0/indices/stash/1", for
example, my cluster status remains green, and this shard is not
replicated again until I restart the elasticsearch service.
Is this normal? What can I do to check the replication status or force a
"rebuild"?

My test may seem extreme, but isn't this what would happen with a
corrupted hard drive or RAID volume in real life?
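One way to check the replication status is the cluster health API with per-shard detail. The following is only a minimal sketch: the node address http://localhost:9200 is an assumption, and the helper names are made up for illustration. As described above, the status can stay green for a while after files are deleted underneath a running node, so this reports what the cluster currently believes, not what is on disk.

```python
import json
import urllib.request


def fetch_health(base_url="http://localhost:9200"):
    """Fetch cluster health with per-shard detail.

    The base URL is an assumption; point it at any node of the cluster.
    """
    url = base_url + "/_cluster/health?level=shards"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def shards_not_green(health, index):
    """Return [(shard_id, status), ...] for shards of `index` that are not green."""
    shards = health.get("indices", {}).get(index, {}).get("shards", {})
    return sorted(
        (int(shard_id), info.get("status"))
        for shard_id, info in shards.items()
        if info.get("status") != "green"
    )


# Example use against a live node (not executed here):
#   health = fetch_health()
#   print("cluster status:", health.get("status"))
#   for shard_id, status in shards_not_green(health, "stash"):
#       print("shard", shard_id, "is", status)
```

Running this in a loop while deleting a shard directory should show when the cluster notices the problem and when the shard returns to green.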

Thank you,

JCD.

--


(JCD) #2

It seems I didn't wait long enough...

After a while, I got this error in my logs:

"[2012-09-28 10:11:41,774][WARN ][index.engine.robin ] [logstash2]
[stash][1] failed to read latest segment infos on flush
java.io.FileNotFoundException:
/var/lib/elasticsearch/logstashcluster/nodes/0/indices/stash/1/index/segments_80
(No such file or directory)
"
....

Then
[2012-09-28 10:13:07,130][WARN ][index.engine.robin ] [logstash2]
[stash][1] failed to read latest segment infos on flush
org.apache.lucene.index.IndexNotFoundException: no segments* file found in
org.elasticsearch.index.store.Store$StoreDirectory@373ee92
lockFactory=org.apache.lucene.store.NativeFSLockFactory@7ee361ad: files:
[_checksums-1348818101728]
....

Then my shard folder "1" was replicated again.
Is there a timeout? What should I monitor?
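One thing that could be monitored is the log itself, watching for per-shard warnings like the ones above. This is a rough sketch only: it assumes each log entry starts on a single line in the actual log file (the line breaks above come from mail wrapping), and the function name is made up for illustration.

```python
import re

# Matches entries shaped like:
# [2012-09-28 10:11:41,774][WARN ][index.engine.robin ] [logstash2] [stash][1] message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]"
    r"\[(?P<level>\w+)\s*\]"
    r"\[(?P<component>[^\]]+?)\s*\]"
    r"\s+\[(?P<node>[^\]]+)\]"
    r"\s+\[(?P<index>[^\]]+)\]\[(?P<shard>\d+)\]"
    r"\s+(?P<msg>.*)$"
)


def shard_warnings(lines):
    """Return (timestamp, node, index, shard, message) for each per-shard WARN line."""
    found = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match and match.group("level") == "WARN":
            found.append(
                (
                    match.group("ts"),
                    match.group("node"),
                    match.group("index"),
                    int(match.group("shard")),
                    match.group("msg"),
                )
            )
    return found


# Example use (the log path is an assumption, adjust to your install):
#   with open("/var/log/elasticsearch/logstashcluster.log") as handle:
#       for entry in shard_warnings(handle):
#           print(entry)
```

Lines that do not start a log entry, such as stack-trace lines, are simply skipped.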

Thank you,
Best regards,
JCD

(In reply to JCD's original post of Friday, September 28, 2012 10:17:11 AM UTC+2.)

--


(JCD) #3

Hello,

I can't find out how it was rebuilt automatically, so I'm not fully
confident in it. I know I should work with at least 2 replicas (to get
something like RAID 6 protection).
What action triggers shards to be rebuilt or rechecked?
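For the 2-replica setup mentioned above, the replica count of an existing index can be changed through the index settings API. A minimal sketch follows, assuming a node at http://localhost:9200 (an assumption) and using a made-up helper name. Note that a copy of a shard is never allocated on a node that already holds another copy of it, so 2 replicas need at least three nodes; on two nodes the extra replicas stay unassigned and the cluster reports yellow.

```python
import json
import urllib.request


def replica_settings_request(index, replicas, base_url="http://localhost:9200"):
    """Build (but do not send) a PUT request that sets number_of_replicas.

    The base URL is an assumption; point it at any node of the cluster.
    """
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url="{}/{}/_settings".format(base_url, index),
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Example use against a live node (not executed here):
#   request = replica_settings_request("stash", 2)
#   with urllib.request.urlopen(request, timeout=10) as resp:
#       print(resp.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect the URL and body before touching a real cluster.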

--

