Hundreds of indices missing


(Devon Crouse) #1

After restarting the cluster due to some performance issues, we suddenly
had far fewer indices than before. They aren't closed or
recovering/unallocated, they just don't exist according to the API. The
data files still exist on our data nodes and I believe the data is still
there (we had 2 replicas per shard); it's just as if the cluster state was
corrupted somehow. Is there some way to recover or recreate this
state/metadata for the missing indices? We tried turning on verbose
recovery/gateway logging, but still saw no mention of them.
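To see which indices are affected, it can help to compare the index directories on disk against the names the cluster still reports. The following is a minimal sketch, not something from the original setup: `missing_indices` is a hypothetical helper, it assumes index directories sit directly under a directory named `indices` somewhere below the data root, and it expects a file containing one known index name per line. How that list is produced is left open; on 0.90 the keys returned by `GET /_aliases` are one possible source, which is an assumption to verify against your version.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list index directories found on disk that are absent
# from a list of index names known to the cluster.
#   $1: data root (e.g. /esn/data)
#   $2: file with one known index name per line
missing_indices() {
  data_root=$1
  known_file=$2
  # Assumed layout: <data root>/.../indices/<index name>/<shard>/...
  # -prune stops find from descending into the shard directories.
  find "$data_root" -type d -path '*/indices/*' -prune -print \
    | sed 's|.*/||' \
    | sort -u \
    | grep -vxF -f "$known_file" || true   # grep exits 1 when nothing is missing
}
```

Run on a data node as `missing_indices /esn/data known_indices.txt`; each line of output is an index name that exists on disk but not in the supplied list.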

OS: CentOS 6.3 2.6.32-279.5.2.el6.x86_64 #1 SMP
ES: 0.90.1
Java: Oracle 1.7.0_17-b02
Topology:

  • 3 dedicated masters w/ Zookeeper discovery
  • 2 dedicated clients
  • 6 dedicated data nodes, each with 2 ES instances (to utilize available
    memory and keep heaps < 32G)

Thanks.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Devon Crouse) #2

We're having some success recovering the indices with the script below. A
few indices have had unrecoverable shards, but most are fine:

#!/usr/bin/env bash

[ "$#" -eq 1 ] || { echo "Index required" >&2; exit 1; }
export idx=$1

# Rename index directories
find /esn/data -type d -name "${idx}" -exec mv {} {}_bak \; &> /dev/null

# Recreate empty index
curl -XPUT "http://localhost:9200/${idx}" &> /dev/null

# Allow shards to initialize
sleep 10

# Close the empty index
curl -XPOST "http://localhost:9200/${idx}/_close" &> /dev/null

# Remove empty index data files
find /esn/data -type d -name "${idx}" -exec rm -rf {} \; &> /dev/null

# Restore index data (strip only the trailing _bak from each path)
find /esn/data -type d -name "${idx}_bak" | awk '{ original=$0; sub(/_bak$/, ""); system("mv " original " " $0) }'

# Reopen index
curl -XPOST "http://localhost:9200/${idx}/_open" &> /dev/null
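The fixed `sleep 10` in the script could instead wait on the cluster health API, which accepts `wait_for_status` and `timeout` parameters. This is only a sketch under stated assumptions: `health_url` and `wait_for_index` are hypothetical helper names, and whether `yellow` and `30s` are the right threshold and timeout for a freshly created empty index is a judgment call, not something confirmed in this thread.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the cluster health URL for a single index.
#   $1: host:port  $2: index name  $3: status to wait for  $4: timeout (e.g. 30s)
health_url() {
  printf 'http://%s/_cluster/health/%s?wait_for_status=%s&timeout=%s\n' \
    "$1" "$2" "$3" "$4"
}

# Hypothetical helper: block until the index reports at least yellow,
# or until the request times out (assumed values: yellow, 30s).
wait_for_index() {
  curl -s -XGET "$(health_url localhost:9200 "$1" yellow 30s)" > /dev/null
}
```

In the script, `wait_for_index "${idx}"` would take the place of `sleep 10`. The request returns when the timeout expires even if the status was not reached, so the close step still runs either way.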



(Devon Crouse) #3

To clarify, the script in my previous post is run on all data nodes concurrently.


