Slow Percolator Indexing

Percolator queries are taking a long time to index. I created a sample test
case:

As you can see, it's taking 1.5s to insert a single percolator.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/bb7d4a88-e194-4806-9583-d2dcc197fcfc%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

On Monday, March 3, 2014 2:00:53 PM UTC-8, Binh Ly wrote:

Hmm, not sure. I just tried this on ES 1.0.1 and Oracle Java 1.7_u51, and
here are my results:

Insert Percolator:
{"_index":"merchandising","_type":".percolator","_id":"1","_version":1,"created":true}
real 0m0.103s
user 0m0.015s
sys 0m0.000s

Is it possible that your index was just busy?
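
For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of a timed percolator insert. It is not the original test case, which is not preserved in this thread. The index name, type and id come from the response above; the node address and the query body are assumptions made up for illustration.

```python
import json
import time
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: a local node on the default port


def build_percolator_request(index, query_id, query):
    """Build the PUT request that registers `query` under the .percolator type."""
    url = "%s/%s/.percolator/%s" % (ES_URL, index, query_id)
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def timed_send(request):
    """Send the request and return (elapsed_seconds, parsed_json_response)."""
    start = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return time.perf_counter() - start, payload


# Hypothetical query: the field name and value are placeholders.
request = build_percolator_request(
    "merchandising", "1", {"term": {"category": "shoes"}}
)

# Needs a running node, so it is left commented out here:
# elapsed, result = timed_send(request)
# print("%.3fs %s" % (elapsed, result))
```

Sending the same request several times and comparing the timings helps separate a one-off slow insert from a consistent one.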

On Monday, March 3, 2014 2:14:34 PM UTC-8, James Bathgate wrote:

I don't think so. This is on my local dev machine with nothing else running
on the index.

On 4 March 2014 00:25, James Bathgate wrote:

What are the likely bottlenecks when indexing percolators? This is on a
Vagrant virtual machine with 2 GB of RAM, 1 GB of which is given to ES. I'm
wondering whether this would still be a problem on an EC2 instance.

On Monday, March 3, 2014 3:40:05 PM UTC-8, Martijn v Groningen wrote:

On top of indexing the document, the top-level 'query' field is parsed into
an internal Lucene query. However, I don't see why this should take a long
time.

Some questions:
Are you running low on JVM memory?
How long does it take to index a regular document?
Can you run the hot threads API while indexing a percolator document?

Martijn
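
The first and third questions above can be checked over HTTP. Below is a minimal sketch, assuming a local node on the default port and the `_nodes/stats/jvm` and `_nodes/hot_threads` endpoints as they exist in ES 1.x; the helper names are made up for illustration.

```python
import urllib.parse
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: a local node on the default port


def jvm_stats_url(base=ES_URL):
    """URL for node stats restricted to the JVM section (heap usage, GC)."""
    return base + "/_nodes/stats/jvm"


def hot_threads_url(base=ES_URL, threads=3, interval="500ms"):
    """URL for the hot threads API; the response is plain text, not JSON."""
    params = urllib.parse.urlencode({"threads": threads, "interval": interval})
    return base + "/_nodes/hot_threads?" + params


def fetch_text(url):
    """GET a URL and return the body as text (needs a running node)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


# Needs a running node, so it is left commented out here:
# print(fetch_text(jvm_stats_url()))
# print(fetch_text(hot_threads_url()))
```

To catch the slow insert, call `fetch_text(hot_threads_url())` from a second terminal or thread while the percolator document is being indexed.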

--
Kind regards,

Martijn van Groningen

On 4 March 2014 01:33, James Bathgate wrote:

Martijn,

  1. Not running low at all.
  2. A regular document takes ~75ms.
  3. gist:9337810 · GitHub

It looks like it's definitely CPU-bound.

On Tuesday, March 4, 2014 1:05:13 AM UTC-8, Martijn v Groningen wrote:

I see that a lot of time is spent just measuring how much memory the query
takes, not parsing the query. I think this slowness might be JVM-version
dependent. What JVM version are you using?

On 4 March 2014 16:52, James Bathgate wrote:

Martijn,

I'm using Oracle Java 7u45.

On Tuesday, March 4, 2014 1:45:29 PM UTC-8, Martijn v Groningen wrote:

OK, I thought maybe an old JVM version was causing this, but this one is
pretty recent.

I took a closer look at indexing percolator queries, and there is indeed a
substantial difference in execution time compared to indexing a regular
document. When I disabled the size calculation (in the code) for percolator
queries, the execution time for indexing a regular document and a percolator
document was more or less the same.
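
As a rough analogy for why a size calculation can cost more than parsing, here is a sketch in Python of a recursive size estimate that walks every object reachable from a parsed query. It is not the Elasticsearch code, and the query structure is made up; it only illustrates that such an estimate scales with the number of reachable objects rather than with the length of the query text.

```python
import sys


def deep_sizeof(obj, seen=None):
    """Estimate memory held by obj by walking every object it references."""
    if seen is None:
        seen = set()
    if id(obj) in seen:  # count shared objects only once
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += deep_sizeof(key, seen) + deep_sizeof(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += deep_sizeof(item, seen)
    return size


# Hypothetical parsed query: a small wrapper around a large structure.
shared_terms = ["term-%d" % i for i in range(10000)]
parsed_query = {"bool": {"must": [{"terms": shared_terms}]}}

shallow = sys.getsizeof(parsed_query)  # looks only at the outer dict
deep = deep_sizeof(parsed_query)       # visits all reachable objects
```

The outer dictionary is tiny, but the deep estimate has to touch every reachable object, so a short query that references large internal structures can be expensive to measure.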

I opened an issue for this:
The size estimation for percolator queries takes longer than it should · Issue #5339 · elastic/elasticsearch · GitHub

Thanks.
