Can a heavy query cause index corruption?


(Jaime López) #1

Hi

I've been running some aggresive tests (on purposely reduced hardware specs) and faced the following issue. My cluster is:

  • ES 2.2.0 on CentOS 6.7
  • 4x nodes with 4GB RAM (2GB for the JVM), 8 cores.
  • Extended the bulk thread queue size from 50 to 1000
  • 20x clients running on the REST api, sending bulk indexing requests with 1000 records each

Then, I've run a few heavy aggregates. The idea is to force ES to retrieve the data from disk (avoiding FS cache whatsoever) to measure some sort of "worst case scenario"

One of the nodes crashed due to OutOfMemory exception during one of the aggregates.** This caused corruption on 100% shards of that node**! Now the cluster is relocating shards but it's painfully slow.

I find hard to believe that a query - as heavy as it is - can cause full corruption of a node. This would be a no-go for production in the project I'm working at.

Is there a way to ensure that the queries memory does not take down the full node? I've been wondering if lowering the three available circuit breakers so they don't sum more than 100% will help (https://www.elastic.co/guide/en/elasticsearch/guide/current/_limiting_memory_usage.html) :

But I find this really annoying - lowering these limits "just in case" doesn't seem reasonable.

Any other ideas / tips?

The full trace can be found here: http://pastie.org/pastes/10722735/text?key=8jrwowaumantpoqjvctq


(Mark Walkom) #2

2GB of heap isn't very much at all :frowning:


(Jaime López) #3

Yes, it is not, on purpose. But it gives an idea of what will happen on production when I will have 16GB of heaps and the query is 8x more complex (I've queried 1 month of data and I expect to be able to do the same over 1 year)


(Michael McCandless) #4

OutOfMemoryError should never result in corruption.

Can you share the full stack traces of both the original OOME you hit and the resulting corruption exceptions?


(Jaime López) #5

The stack trace is on post #1 (pastie)


(Michael McCandless) #6

The stack trace is on post #1 (pastie)

Woops, sorry, I missed that.

OK I looked at the exceptions, and this is not actually index corruption.

Rather, ES has decided that this shard is in an unknown state ("failed engine"), no longer in sync with its peers, and therefore must hard-close it and recopy the shard to sync up again, which is what you see happening.

I think you do need to use the circuit breakers to guard against this.


(system) #7