Performance of Indexed-Shape Queries Vs Geoshape Queries


(iparipsa) #1

Hi,

We ran tests comparing the performance of indexed-shape queries with that
of custom geoshape queries, and found that Elasticsearch performed roughly
the same in both cases. We expected indexed-shape queries to be faster. Our
understanding is that Elasticsearch has to convert a custom geoshape to a
quadtree on the fly, whereas an indexed shape already has it pre-generated.
Could anyone explain why there is no difference in performance between
these two query types?

Experiment Design

We indexed suburb boundary geometries into one doctype, and geocoded points
of interest (POIs) into another. We picked the 20 suburbs whose geometries
have the most vertices, and ran the following two queries for each suburb
geometry.

Geoshape Query

GET /spike_index/doc_type_pois/_search
{
  "query": {
    "geo_shape": {
      "field_geocode": {
        "shape": {
          "type": "polygon",
          "coordinates": [ ]
        }
      }
    }
  }
}

Indexed-Shape Query

GET /spike_index/doc_type_pois/_search
{
  "query": {
    "geo_shape": {
      "field_geocode": {
        "indexed_shape": {
          "id": "",
          "type": "doc_type_suburb_quadtree",
          "index": "spike_index",
          "path": "field_geometry"
        }
      }
    }
  }
}

The test was carried out using Siege from a box located within the same VPC
as the Elasticsearch instances. Please find the results below.

Indexed-Shape Query Results

Transactions: 749559 hits
Availability: 100.00 %
Elapsed time: 602.80 secs
Data transferred: 10342.97 MB
Response time: 0.01 secs
Transaction rate: 1243.46 trans/sec
Throughput: 17.16 MB/sec
Concurrency: 14.92
Successful transactions: 749559
Failed transactions: 0
Longest transaction: 5.01
Shortest transaction: 0.00

Geoshape Query Results

Transactions: 723894 hits
Availability: 100.00 %
Elapsed time: 599.16 secs
Data transferred: 9988.83 MB
Response time: 0.01 secs
Transaction rate: 1208.18 trans/sec
Throughput: 16.67 MB/sec
Concurrency: 14.92
Successful transactions: 723894
Failed transactions: 0
Longest transaction: 1.02
Shortest transaction: 0.00
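
To put a number on "roughly the same", the gap between the two runs can be
worked out from the Siege figures above. The snippet below is only a small
sketch in Python; the variable and function names are made up for
illustration, and the numbers are copied from the two reports. It suggests
the indexed-shape run was roughly 3% ahead on transaction rate.

```python
# Figures copied from the two Siege reports above.
indexed = {"transactions": 749559, "elapsed_secs": 602.80, "rate": 1243.46}
inline = {"transactions": 723894, "elapsed_secs": 599.16, "rate": 1208.18}


def relative_gap(a: float, b: float) -> float:
    """Return how much larger a is than b, as a percentage of b."""
    return (a - b) / b * 100.0


# Gap between the reported transaction rates (trans/sec).
rate_gap = relative_gap(indexed["rate"], inline["rate"])

# Sanity check: recompute each rate as transactions / elapsed time.
indexed_rate = indexed["transactions"] / indexed["elapsed_secs"]
inline_rate = inline["transactions"] / inline["elapsed_secs"]

print(f"indexed-shape: {indexed_rate:.2f} trans/sec")
print(f"inline shape:  {inline_rate:.2f} trans/sec")
print(f"relative gap:  {rate_gap:.2f} %")
```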

If anyone could shed some light on why the results of these queries are the
same that would be very helpful.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/bfebad47-fd6d-45fe-8bca-97eb14199dad%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Alexander Reelsen) #2

Hey,

The main difference is basically the network overhead. What happens behind
the curtains is that, if you reference an indexed shape in the request, a
GET request for that shape is executed, and the fetched shape is then used
in place of an inline one.

Makes sense?

--Alex
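
A rough illustration of the behaviour described above, not Elasticsearch's
actual code: the indexed_shape reference is swapped for the fetched shape,
after which the query looks the same as one with an inline shape. The
function names, the fake_fetch stand-in for the GET request, the "suburb-1"
id and the sample polygon are all hypothetical, and the sketch assumes
"path" names a top-level field of the fetched document.

```python
from typing import Any, Callable, Dict

FetchSource = Callable[[str, str, str], Dict[str, Any]]


def resolve_indexed_shape(query: Dict[str, Any], fetch_source: FetchSource) -> Dict[str, Any]:
    """Return a new geo_shape query with indexed_shape references inlined."""
    resolved: Dict[str, Any] = {"query": {"geo_shape": {}}}
    for field, spec in query["query"]["geo_shape"].items():
        if "indexed_shape" in spec:
            ref = spec["indexed_shape"]
            # Stand-in for the GET request that fetches the stored document.
            source = fetch_source(ref["index"], ref["type"], ref["id"])
            # Assumption: "path" is a top-level field of that document.
            spec = {"shape": source[ref["path"]]}
        resolved["query"]["geo_shape"][field] = spec
    return resolved


def fake_fetch(index: str, doc_type: str, doc_id: str) -> Dict[str, Any]:
    """Hypothetical replacement for the network call; returns a made-up polygon."""
    return {
        "field_geometry": {
            "type": "polygon",
            "coordinates": [[[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]]],
        }
    }


indexed_query = {
    "query": {
        "geo_shape": {
            "field_geocode": {
                "indexed_shape": {
                    "id": "suburb-1",  # hypothetical document id
                    "type": "doc_type_suburb_quadtree",
                    "index": "spike_index",
                    "path": "field_geometry",
                }
            }
        }
    }
}

inline_query = resolve_indexed_shape(indexed_query, fake_fetch)
print(inline_query)
```

Under this reading, the only extra work in the indexed-shape case is the
fetch itself; everything after the substitution is the same as for an inline
shape.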

On Tue, Apr 15, 2014 at 6:50 AM, iparipsa@thoughtworks.com wrote:

[...]


(iparipsa) #3

Hi Alex,

Thanks for your response.

Does this mean that the shape that I query by does not need to be indexed
by Elasticsearch on the fly? Or does this mean that the indexing of the
shape is so quick it does not affect the query latency?

Thank you,
Ilya.

On 21 April 2014 22:46, Alexander Reelsen alr@spinscale.de wrote:

[...]

