Slow Percolator Indexing

James_Bathgate_2 · March 3, 2014, 9:47pm

I'm having issues with indexing percolator queries taking a long time to
insert. I created a sample test case:

https://gist.github.com/julesbravo/9335275

results

Deleting Old Index:
{"acknowledged":true}

Creating Index:
{"acknowledged":true}

Create Percolator Mapping:
{"acknowledged":true}

Create Query Mapping:

[truncated]

test.sh

echo "Deleting Old Index: "
curl -XDELETE "http://localhost:9200/merchandising"

echo "
";

echo "Creating Index: " 
curl -XPUT "http://localhost:9200/merchandising" -d '{
	"settings": {
		"index": {

[truncated]

As you can see it's taking 1.5s to insert a single percolator.
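
A minimal sketch of how a single registration could be timed in isolation. The index name and the `.percolator` type are taken from the responses in this thread; the `ES_URL` variable, the function names, and the query body are assumptions for illustration, since the gist is truncated here.

```shell
# Base URL of the node under test (assumption: local node on the default port).
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the URL for a percolator query with the given id.
percolator_url() {
  printf '%s/merchandising/.percolator/%s' "$ES_URL" "$1"
}

# Register one percolator query and report the wall-clock time.
# The match query below is a placeholder, not the query from the gist.
time_percolator_insert() {
  time curl -s -XPUT "$(percolator_url "$1")" -d '{
    "query": { "match": { "title": "example" } }
  }'
}
```

Calling `time_percolator_insert 1` a few times in a row separates a one-off warm-up cost from a consistently slow insert.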

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/bb7d4a88-e194-4806-9583-d2dcc197fcfc%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Binh_Ly_2 · March 3, 2014, 10:00pm

Hmmmm not sure, I just tried this on ES 1.0.1 and Oracle Java 1.7_u51 and
here are my results:

Insert Percolator:
{"_index":"merchandising","_type":".percolator","_id":"1","_version":1,"created":true}
real 0m0.103s
user 0m0.015s
sys 0m0.000s

Is it possible that your index was just busy?

James_Bathgate_2 · March 3, 2014, 10:14pm

I don't think so. This is on my local dev with nothing else running on the
index.

James_Bathgate_2 · March 3, 2014, 11:25pm

What are the likely bottlenecks when indexing percolators? This is on a
Vagrant virtual machine with 2 GB of RAM, 1 GB of it for ES. I'm wondering
whether this would still be a problem on an EC2 instance.

mvg · March 3, 2014, 11:40pm

On top of just indexing a document, the top-level 'query' field gets
parsed into an internal Lucene query. However, I don't see why this should
take a long time.

Some questions:
Are you running low on JVM memory?
How long does it take to index a regular document?
Can you run the hot threads API while indexing a percolator document?

Martijn
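
One way to capture hot threads while a slow request is in flight might look like the sketch below. It assumes the nodes hot threads endpoint (`/_nodes/hot_threads`) with its `threads` and `type` parameters; the `ES_URL` variable and the function names are made up for illustration.

```shell
# Base URL of the node under test (assumption: local node on the default port).
ES_URL="${ES_URL:-http://localhost:9200}"

# URL for the nodes hot threads API; $1 = number of threads to report (default 3).
hot_threads_url() {
  printf '%s/_nodes/hot_threads?threads=%s&type=cpu' "$ES_URL" "${1:-3}"
}

# Run the command given as arguments in the background, sample hot threads
# while it is still running, then wait for the command to finish.
sample_hot_threads_during() {
  "$@" >/dev/null 2>&1 &
  pid=$!
  curl -s "$(hot_threads_url 5)"
  wait "$pid"
}
```

For example, `sample_hot_threads_during curl -s -XPUT "$ES_URL/merchandising/.percolator/1" -d @query.json`, where `query.json` is a hypothetical file holding the percolator document.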

James_Bathgate_2 · March 4, 2014, 12:33am

Martijn,

Not running low at all.
A regular document takes ~75ms.
gist:9337810 · GitHub

It looks like it's definitely CPU bound.

mvg · March 4, 2014, 9:05am

I see that a lot of time is spent just measuring how much memory the
query takes, not parsing the query. I think this slowness might be
JVM version dependent. What JVM version are you using?

James_Bathgate_2 · March 4, 2014, 3:52pm

Martijn,

I'm using Oracle Java 7u45

mvg · March 4, 2014, 9:45pm

OK, I thought maybe an old JVM version was causing this, but this one is
pretty recent.

I took a closer look at indexing percolator queries, and there is indeed a
substantial difference in execution time compared to indexing a regular
document. When I disabled the size calculation (in the code) for percolator
queries, the execution time for indexing a regular document and a percolator
document was more or less the same.

I opened an issue for this:

The size estimation for percolator queries takes longer than it should · Issue #5339 · elastic/elasticsearch · GitHub

Opened 09:45PM - 04 Mar 14 UTC by martijnvg, closed 02:13PM - 29 Dec 14 UTC

The percolator keeps track of the total amount of memory being spent on all the queries that are in memory; this statistic is then exposed in the node stats, indices stats and cluster stats APIs. Each time a percolator query is registered or deleted, its size in memory is calculated and added to or subtracted from the total size in bytes that all percolator queries take in memory. The percolator uses Lucene's `RamUsageEstimator`, which takes a substantial chunk of the total time spent on registering a percolator query.

I think the total amount of memory spent on queries is a valuable statistic, but it shouldn't add substantial overhead to registering a percolator query. Therefore I think we should compute the memory spent on percolator queries on the fly in the stats API, but only when the percolator stats are specifically requested. The downside would be that the `memory spent on percolator queries` statistic may take a while to estimate.

Based on: https://groups.google.com/forum/#!topic/elasticsearch/6bgUvea98Uw
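
For reference, the memory statistic the issue refers to can be read from the indices stats API. The sketch below assumes the `percolate` metric on `/merchandising/_stats` and a `memory_size_in_bytes` field in its response, as in the 1.x releases; check both against the version in use. The `ES_URL` variable and the function names are hypothetical.

```shell
# Base URL of the node under test (assumption: local node on the default port).
ES_URL="${ES_URL:-http://localhost:9200}"

# Fetch only the percolate section of the index stats for the index in this thread.
percolate_stats() {
  curl -s "$ES_URL/merchandising/_stats/percolate"
}

# Read a stats response on stdin and print the first memory_size_in_bytes value.
# Uses plain text matching, so it tolerates both compact and pretty-printed JSON.
percolate_memory_bytes() {
  grep -o '"memory_size_in_bytes" *: *[0-9-]*' | head -n 1 | sed 's/.*: *//'
}
```

Running `percolate_stats | percolate_memory_bytes` before and after registering a query shows how much the estimate moves per query.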

James_Bathgate_2 · March 5, 2014, 12:02am

Thanks.
