Lost index metadata and overwriting pre-existing index files


(Danny Berger) #1

Hi - I recently experienced some surprising elasticsearch behavior and I'd
appreciate some verification on the "whys" behind what we saw. During a
cluster restart we lost some index metadata, so those indices were not
recognized and loaded from the data nodes (the raw index files still existed
on disk). Before we noticed and had a chance to recover them, new incoming
data caused the cluster to create new indices under the same names,
completely overwriting the original raw index data on disk (and losing a lot
of data). If that's unclear, or for further details, I've posted the scenario
and straightforward steps to reproduce:
https://github.com/dpb587/elasticsearch-lost-index.

These are my core questions...

  1. Is it true that index metadata (shard settings, mappings, etc.) will only
    ever be stored on master-capable nodes? Previously, my understanding of the
    master was that it was primarily responsible for managing cluster state and
    coordinating cluster balancing, not persisting index metadata. (I'm not
    arguing that it doesn't make sense, just that I didn't realize
    "cluster state" included the index metadata.)

  2. Is there documentation on elasticsearch.org which more precisely defines
    the responsibilities of master and data nodes? The only vague references
    I've come across are
    http://www.elasticsearch.org/guide/en/elasticsearch/reference/1.x/modules-node.html,
    the elasticsearch default configuration file, and various non-authoritative
    blog posts and Stack Overflow answers, none of which prompted me to realize
    data nodes would not hold their own metadata.

  3. Is it true that elasticsearch (Lucene?) will overwrite existing data
    files without error or warning if the cluster is not aware of the index? If
    so, is there a way to disable that behavior to avoid accidental data loss
    due to misconfiguration (aside from the broad action.auto_create_index
    setting)? If not, is there anything else which would explain the behavior
    we saw?

Thank you for your time!

Danny



(Randall Williams) #2

Just experienced something similar. We had a VM instance fail along with an issue with our storage. On restart, Elasticsearch created new indexes but would not load the old ones. At this point I'm not sure how to recover, since there doesn't appear to be a way to reload the previously indexed data.


(Harlin) #3

The solution to this is to have backup master nodes. Running a cluster with only one master-eligible node is asking for trouble.
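
As a rough sizing sketch for the backup-master suggestion: with zen discovery in the 1.x line, the usual guidance is to set discovery.zen.minimum_master_nodes to a majority of the master-eligible nodes. The helper name below is hypothetical and the node counts are only examples; none of this comes from the posts above.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Return the smallest majority of the master-eligible nodes.

    Hypothetical helper: it only does the quorum arithmetic and does not
    talk to a cluster.
    """
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


if __name__ == "__main__":
    # Example cluster sizes (assumed, for illustration only).
    for nodes in (1, 2, 3, 5):
        quorum = minimum_master_nodes(nodes)
        print(f"{nodes} master-eligible node(s): minimum_master_nodes = {quorum}")
```

With three master-eligible nodes the majority is two, so one master can fail while the remaining two still hold a copy of the cluster state (including index metadata) and can elect a new master.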

