Decoding the on-disk representation of an ES index


(Ajith Ramanathan) #1

Hey guys. I have two instances of what should be the same index (one per
datacenter, for redundancy), and I send the same updates to both DCs.
However, the two indices appear to be different (for example, the same query
returns a different set of results and a different number of hits in each DC).
This might have something to do with deviations in the mappings (I've been
hacking on them quite a bit). In particular, although I sent identical updates
to both DCs, the mappings were slightly different (for example, only one DC
had a particular nested object in its mapping). We allow dynamic mapping.
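
To make the mapping deviation concrete, here is a minimal sketch of how the
two mappings could be compared once each DC's mapping has been fetched as
JSON. It assumes the "properties" section of each mapping is already loaded
into a Python dict; the field names (title, comments, author) and the helper
names are hypothetical, not taken from my actual index.

```python
def flatten_mapping(properties, prefix=""):
    """Flatten a mapping 'properties' dict into {dotted.path: type}."""
    flat = {}
    for name, spec in properties.items():
        path = prefix + name
        # Object fields may omit "type"; treat them as "object" here.
        flat[path] = spec.get("type", "object")
        if "properties" in spec:
            flat.update(flatten_mapping(spec["properties"], path + "."))
    return flat


def diff_mappings(props_a, props_b):
    """Return (paths only in A, paths only in B, paths whose type differs)."""
    flat_a = flatten_mapping(props_a)
    flat_b = flatten_mapping(props_b)
    only_a = sorted(set(flat_a) - set(flat_b))
    only_b = sorted(set(flat_b) - set(flat_a))
    changed = sorted(
        path for path in set(flat_a) & set(flat_b)
        if flat_a[path] != flat_b[path]
    )
    return only_a, only_b, changed


# Hypothetical mappings: DC1 declares "comments" as nested, DC2 got a
# plain object for the same field via dynamic mapping.
dc1 = {
    "title": {"type": "string"},
    "comments": {"type": "nested",
                 "properties": {"author": {"type": "string"}}},
}
dc2 = {
    "title": {"type": "string"},
    "comments": {"properties": {"author": {"type": "string"}}},
}

only_dc1, only_dc2, changed = diff_mappings(dc1, dc2)
print(only_dc1, only_dc2, changed)
```

A field that is nested in one DC and a plain object in the other would show
up in the third list, which is the kind of deviation I suspect here.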

I'm at a loss as to how to debug this. I'd like to be able to
diff the two indices and obtain a sampling of the differences. I'd also
like to examine the transaction log entries pertaining to a given document to
check if something funky happened. It would also be helpful if I could dump
portions of the index directly. Is there a command line tool that would
let me dump portions of the index in a decoded form? I'm looking for
things like posting lists, dictionaries, Bloom filter contents, etc.
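
For the diffing part, this is roughly what I have in mind, as a sketch only.
It assumes the documents from each DC have already been exported somehow into
dicts keyed by document ID (the export step is not shown); the function name,
parameters, and sample documents are all hypothetical.

```python
import random


def sample_differences(docs_a, docs_b, sample_size=10, seed=0):
    """Compare two {doc_id: source} dicts and sample the differing IDs."""
    ids_a, ids_b = set(docs_a), set(docs_b)
    only_a = ids_a - ids_b            # present in A, missing from B
    only_b = ids_b - ids_a            # present in B, missing from A
    mismatched = {                    # present in both, different content
        doc_id for doc_id in ids_a & ids_b
        if docs_a[doc_id] != docs_b[doc_id]
    }
    differing = sorted(only_a | only_b | mismatched)
    rng = random.Random(seed)         # fixed seed so runs are repeatable
    picked = rng.sample(differing, min(sample_size, len(differing)))
    return {
        "only_in_a": len(only_a),
        "only_in_b": len(only_b),
        "mismatched": len(mismatched),
        "sample": sorted(picked),
    }


# Hypothetical exports from the two DCs.
dc1_docs = {"1": {"title": "a"}, "2": {"title": "b"}, "3": {"title": "c"}}
dc2_docs = {"1": {"title": "a"}, "2": {"title": "B"}, "4": {"title": "d"}}

report = sample_differences(dc1_docs, dc2_docs)
print(report)
```

This only compares stored sources, so it would not catch documents that are
identical in both DCs but indexed differently because of the mappings.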

Or perhaps there is a nicer way to figure out what happened to the two
indices.

Cheers,
Ajith


