Bad CPU: when is field cache invalidated?


(Crwe) #1

We are experiencing severe gaps in performance, where the ES machine
stops responding for half a minute while all CPUs are fully consumed.

We narrowed the problem down to field caches. There seems to be no
problem with free RAM, but despite that, the field cache sometimes
gets invalidated (bigdesk suddenly reports 0B field cache). This leads
to field cache loading -- several GB -- which takes a lot of time and
is the cause of the GC freeze.

Question: under what conditions is the field cache invalidated?

Is there any way to invalidate only a part of it (one segment?)
instead of this mega-freeze + sudden loading from zero to several GB?
Is there a way to pre-load the field cache as soon as possible, and
not wait for user queries? Would setting the "soft" cache type help?

Thank you.

--


(Shay Banon) #2

As you index data, the field data needs to be reloaded for the "new data" being indexed, and sometimes (due to merging of old and new data) it also needs to be loaded for part of the "old" data. When you restart a node, it obviously gets "invalidated". The way to tackle that is to use warmers, which will be available in the upcoming 0.20 version, probably out next week.
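
To make the warmer suggestion concrete, here is a minimal Python sketch that only builds the HTTP request for registering one. The `/{index}/_warmer/{name}` endpoint shape is how the 0.20-era warmer API is usually documented and should be checked against the docs for the version in use (warmers were removed in much later releases). The index name `items`, the warmer name `warm_sort`, and the field `created_at` are hypothetical placeholders. The idea is that a warmer is a stored search; if that search sorts or facets on a field, the field data for new segments gets loaded before they serve user queries.

```python
import json


def build_warmer_request(base_url, index, warmer_name, sort_field):
    """Return (method, url, body) for registering an index warmer.

    Nothing is sent over the network; the caller decides how to issue it.
    """
    url = f"{base_url.rstrip('/')}/{index}/_warmer/{warmer_name}"
    # The warmer body is an ordinary search request. Sorting on a field
    # forces that field's data to be loaded when the warmer runs.
    body = json.dumps({
        "query": {"match_all": {}},
        "sort": [sort_field],
    })
    return "PUT", url, body


# Hypothetical names, for illustration only.
method, url, body = build_warmer_request(
    "http://localhost:9200", "items", "warm_sort", "created_at"
)
print(method, url)
print(body)
```

The resulting method, URL, and body can be sent with any HTTP client, for example `curl -XPUT <url> -d '<body>'`.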

On Oct 2, 2012, at 4:17 AM, Crwe tester.testerus@gmail.com wrote:

We are experiencing bad gaps in performance, where the ES machine
stops responding for half a minute, consuming all CPUs completely.

We narrowed the problem down to field caches. There seems to be no
problem with free RAM, but despite that, the field cache sometimes
gets invalidated (bigdesk suddenly reports 0B field cache). This leads
to field cache loading -- several GB -- which takes a lot of time and
is the cause of the GC freeze.

Question: under what conditions is the field cache invalidated?

Is there any way to invalidate only a part of it (one segment?)
instead of this mega-freeze + sudden loading from zero to several GB?
Is there a way to pre-load the field cache as soon as possible, and
not wait for user queries? Would setting "soft" cache type help?

Thank you.

--

--


(Crwe) #3

On Oct 2, 8:31 pm, Shay Banon kim...@gmail.com wrote:

As you index data, the field data needs to be reloaded for the "new data" being indexed, and sometimes (due to merging of old and new data) it also needs to be loaded for part of the "old" data.

What we are experiencing is a sharp drop to 0 bytes (field cache as
reported by bigdesk), not gradual resizing as parts of "new data" or
"old data" get modified. Perhaps the cause could be some mega-segment
merge?

When you restart a node, it obviously gets "invalidated". The way to tackle that is to use warmers, which will be available in the upcoming 0.20 version, probably out next week.

That sounds good. Looking forward to 0.20 then.
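
The "mega-segment merge" guess can be illustrated with a toy model. This is not Elasticsearch code; it only assumes what is true of Lucene-style field caches in general, namely that cached field data is held per segment, so an entry disappears when its segment is merged away and the merged segment starts with nothing loaded. Segment names and sizes are made up. Whether this is what caused the drop reported by bigdesk is not confirmed anywhere in this thread.

```python
class PerSegmentFieldCache:
    """Toy model: field data is cached separately for each segment."""

    def __init__(self):
        self._loaded = {}  # segment name -> bytes of field data loaded

    def load(self, segment, size_bytes):
        self._loaded[segment] = size_bytes

    def merge(self, sources, merged_name):
        # Source segments go away, and their cache entries go with them.
        for name in sources:
            self._loaded.pop(name, None)
        # The merged segment has no field data until a query
        # (or a warmer) loads it.
        return merged_name

    def total_bytes(self):
        return sum(self._loaded.values())


cache = PerSegmentFieldCache()
for name, size in [("seg_a", 400), ("seg_b", 300), ("seg_c", 50)]:
    cache.load(name, size)

before = cache.total_bytes()
cache.merge(["seg_b", "seg_c"], "seg_d")   # small merge: partial drop
after_small_merge = cache.total_bytes()
cache.merge(["seg_a", "seg_d"], "seg_e")   # merge of everything: drop to zero
after_full_merge = cache.total_bytes()

print(before, after_small_merge, after_full_merge)  # 750 400 0
```

In this model a small merge gives the gradual behaviour described in the reply above, while a merge that swallows every large segment leaves the cache empty until the new segment is loaded.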



(Crwe) #4

I should add that we are continuously indexing new items, sometimes
overwriting old items with the same id in the process. So this is a
different scenario from the "append-only log indexing" that I often see
mentioned here.

On Oct 3, 6:44 pm, Crwe tester.teste...@gmail.com wrote:

On Oct 2, 8:31 pm, Shay Banon kim...@gmail.com wrote:

As you index data, the field data needs to be reloaded for the "new data" being indexed, and sometimes (due to merging of old and new data) it also needs to be loaded for part of the "old" data.

What we are experiencing is a sharp drop to 0 bytes (field cache as
reported by bigdesk), not gradual resizing as parts of "new data" or
"old data" get modified. Perhaps the cause could be some mega-segment
merge?

When you restart a node, it obviously gets "invalidated". The way to tackle that is to use warmers, which will be available in the upcoming 0.20 version, probably out next week.

That sounds good. Looking forward to 0.20 then.


