Performance issue when using percolators


(Brett Anderson) #1

I'm using the percolator feature in ES to filter documents as they arrive
in real time. The filter can receive requests at any moment to modify the set
of percolators as keywords change, but it also deletes the index holding the
percolators every hour. After deletion, the index is re-created and the
percolators are rebuilt from scratch. This ensures that the percolators never
stay out of sync with the master list of keywords for very long.
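
For anyone wanting to reproduce this, the hourly cycle can be written down as an ordered list of REST calls. This is only a rough sketch: the index name `docs`, the type `doc`, the `message` field, and the keyword names are hypothetical, and the paths assume the pre-1.0 percolator layout (queries registered under `/_percolator/<index>/<name>`), which may not match the actual setup. Nothing is sent over the network; the functions just build the requests so the order is explicit.

```python
import json

def rebuild_plan(index, keywords):
    """Return the hourly rebuild as ordered (method, path, body) tuples."""
    plan = [
        ("DELETE", f"/{index}", None),  # drop the index holding the percolators
        ("PUT", f"/{index}", None),     # re-create it empty
    ]
    # Re-register one percolator query per keyword (hypothetical term queries).
    for name, keyword in sorted(keywords.items()):
        query = {"query": {"term": {"message": keyword}}}
        plan.append(("PUT", f"/_percolator/{index}/{name}", json.dumps(query)))
    return plan

def percolate_request(index, doc_type, doc):
    """Build the request that percolates one incoming document."""
    return ("GET", f"/{index}/{doc_type}/_percolate", json.dumps({"doc": doc}))

if __name__ == "__main__":
    for method, path, body in rebuild_plan("docs", {"kw1": "elasticsearch"}):
        print(method, path, body or "")
```

Writing the steps out this way also makes it easy to turn each tuple into a curl command when sharing a reproduction.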

I've found that the percolators can process around 1.5K documents
per second. However, after I delete and re-create the index as described
above, throughput drops to about 500 documents per second, roughly a third
of the original rate. If I restart ES and then run the exact same deletion
and re-creation, the rate goes back to 1.5K (full speed).
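
To make the before/after comparison repeatable, a small timing helper along these lines could be used. It is an illustrative sketch, not the code used for the numbers above: `percolate_one` is a hypothetical callable standing in for whatever sends one document to the percolator.

```python
import time

def docs_per_second(percolate_one, docs):
    """Time percolate_one over docs and return the observed rate."""
    start = time.perf_counter()
    count = 0
    for doc in docs:
        percolate_one(doc)
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")

def relative_rate(after, before):
    """Fraction of the original throughput that remains."""
    return after / before

# The two rates quoted in this post: 500 vs 1.5K documents per second.
print(f"{relative_rate(500, 1500):.0%}")  # prints 33%
```

Running `docs_per_second` once on a freshly restarted node and once after a delete and re-create, with the same documents, would give directly comparable figures.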

Is there any reason why deleting and re-creating an index, and by extension
its percolators, would cause a dip in performance? Again, restarting ES and
then deleting and re-creating solves the problem. If needed, I can try to
create a gist to demonstrate this, but it obviously wouldn't be
straightforward, so I thought I'd ask first in case there's a simple answer.

Thanks,
LJ.


(Shay Banon) #2

That's strange, it should not really matter. Can you list the order of
operations that you do explicitly? (Sample curl commands for deleting the
index, and so on.)

On Wed, May 9, 2012 at 10:55 AM, Laser Jesus brett.anderson.ftw@gmail.com wrote:



(Andrew[.:at:.]DataFeedFile.com) #3

How many nodes and shards do you have for this index?
Have you checked that ES is not rebalancing shards shortly after you delete
and re-create the index?
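
One way to check is to look at the cluster health response right after the index is re-created. The sketch below only inspects an already-parsed response; it assumes the `relocating_shards`, `initializing_shards`, and `unassigned_shards` counters reported by the cluster health API, and the sample dictionary is made up for illustration.

```python
def shards_settled(health):
    """True when no shards are relocating, initializing, or unassigned."""
    moving = ("relocating_shards", "initializing_shards", "unassigned_shards")
    return all(health.get(key, 0) == 0 for key in moving)

# Made-up response: two shards are still initializing after the re-create.
sample = {
    "status": "yellow",
    "relocating_shards": 0,
    "initializing_shards": 2,
    "unassigned_shards": 0,
}
print(shards_settled(sample))  # prints False
```

If this stays False while the throughput is being measured, shard movement would be a plausible contributor to the slowdown.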

Andrew

On May 9, 5:48 am, Shay Banon kim...@gmail.com wrote:


