Geo_shape API changes on trunk (issue 2720)

I just updated my trunk snapshot to something post beta1 (I had been
working with a pre-beta1 snapshot build for a month) and noticed that my
geo_shape related tests are breaking because support for 'within'
(previously known as 'contains') has been removed. I get the following
error: QueryParsingException[[test] Unsupported shape operation [within].
Only [intersects] operation is supported].

The above ticket suggests that there are some changes happening related to
syncing up with Lucene. Not a big deal for me at this point and I can
probably fix this but since I am trying to build a product using this
feature, I am sort of curious what else is coming. So, are there any other
geo_shape related changes coming that I should be aware of?

Also, the documentation for geo_shape is wrong for both the last release
and the current snapshot. My recent change renaming 'contains' to 'within'
missed some pages, and that change is now out of date as well. There should
probably be a warning in there about the coming API change that only allows
intersects.

Jilles

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Hi Jilles,

Yes, indeed we removed the "within" relation from geo_shape queries. This
feature was experimental and we realized that it didn't work as well as we
hoped it would (it was very inaccurate in many simple cases). So we "fell
back" to supporting only the relations that Lucene currently supports
out of the box, which is basically "intersects" ("disjoint" can also be
supported by wrapping "intersects" in a bool must_not clause). We
also exposed the spatial strategy so it can be configured in the mappings
(previously we only used our own version of the
TermQueryPrefixTreeStrategy), so people will be able to play around with it
and choose the strategy that works best for them. To make debugging easier,
and since the only strategies that we support are prefix based (they all
use the same indexing logic), we also allow customizing the strategy at
query time (so you won't need to reindex the documents to test the
different strategies).
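
The "disjoint" workaround described above could look roughly like the
request body below, built here as a Python dict. This is only a sketch: the
field name "location", the envelope coordinates, and the exact layout (a
filtered query wrapping a bool filter) are assumptions for illustration and
should be checked against the documentation for the build you are running.

```python
import json


def intersects_filter(field, shape):
    """Build a geo_shape filter that asks for the 'intersects' relation."""
    return {"geo_shape": {field: {"shape": shape, "relation": "intersects"}}}


def disjoint_filter(field, shape):
    """Emulate 'disjoint' by negating 'intersects' inside a bool filter."""
    return {"bool": {"must_not": [intersects_filter(field, shape)]}}


# Hypothetical query shape: an envelope given as two [lon, lat] corners.
query_shape = {"type": "envelope", "coordinates": [[13.0, 53.0], [14.0, 52.0]]}

# "location" is a hypothetical geo_shape field name.
request_body = {
    "query": {
        "filtered": {
            "query": {"match_all": {}},
            "filter": disjoint_filter("location", query_shape),
        }
    }
}

print(json.dumps(request_body, indent=2))
```

The same helper can be reused for a plain "intersects" query by passing
intersects_filter(...) as the filter instead.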

Our default strategy is now "recursive", which has its own dedicated filter
(the default used to be the "term" strategy, which internally used term
filters... it just seemed too inaccurate).
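
To make the mapping-level and query-time strategy settings a bit more
concrete, here is a rough sketch in Python. The strategy names "recursive"
and "term" come from the message above; the key name "strategy", its
placement in the mapping and in the filter, and the type and field names
("place", "location") are assumptions, not confirmed API.

```python
import json

# Hypothetical mapping for a type "place" with a geo_shape field "location".
mapping = {
    "place": {
        "properties": {
            "location": {
                "type": "geo_shape",
                "strategy": "recursive",  # assumed key name; the new default
            }
        }
    }
}


def shape_filter(field, shape, strategy=None):
    """Build an 'intersects' geo_shape filter, optionally overriding the strategy."""
    clause = {"shape": shape, "relation": "intersects"}
    if strategy is not None:
        clause["strategy"] = strategy  # assumed query-time override key
    return {"geo_shape": {field: clause}}


point = {"type": "point", "coordinates": [13.4, 52.5]}

# Build the same filter once per strategy so results can be compared
# against a single index without reindexing.
for name in ("recursive", "term"):
    print(json.dumps(shape_filter("location", point, strategy=name)))
```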

As for the future, we're investigating options to enhance the term
filter to do a better job... if we find that it's good enough, we might
bring back support for the "within" relation, but for now we removed it.

We're currently working on changing the documentation to fit the latest
update.

As for other planned geo-related work, we plan to extend the geo_distance
filter and introduce a new geo_circle mapping type. This will enable indexing
documents that are associated with a circle shape, and you'll be able to
run queries such as: "find all documents which contain point [lat,lon]".

Other than that, we're monitoring the work on Lucene and plan to always
expose as much of Lucene's support as possible (at least the parts that make
sense).

On Monday, March 4, 2013 11:51:28 AM UTC+1, Jilles van Gurp wrote:

https://github.com/elasticsearch/elasticsearch/issues/2720

[...]

OK, thanks for the update. What you say makes sense. Anyway, my tests work
fine using intersects now, so the change wasn't that big for me.

The main issue with intersects is that it also returns things that are only
partially contained. One of the things that I actually would like to have
is a combination of the geo_circle and geo_shape behavior: I'd like to
query within a polygon for arbitrary shapes and then rank them based on
their distance to a specific coordinate inside the polygon. I'm currently
playing with OpenStreetMap data, which has lots of different
shapes for POIs, roads, administrative areas, buildings, forests, etc. I've
been wondering about strategies for indexing and searching through this.
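
The "rank by distance to a coordinate" part could be done client-side on
the hits that come back from the polygon query. A minimal sketch in Python,
assuming each hit carries a representative point (for example a centroid)
under the hypothetical keys "lat" and "lon":

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def rank_by_distance(hits, lat, lon):
    """Sort hits by distance from (lat, lon) to each hit's representative point."""
    return sorted(hits, key=lambda h: haversine_km(lat, lon, h["lat"], h["lon"]))


# Hypothetical hits returned by a polygon query.
hits = [
    {"name": "far", "lat": 52.60, "lon": 13.40},
    {"name": "near", "lat": 52.51, "lon": 13.40},
]
for hit in rank_by_distance(hits, 52.50, 13.40):
    print(hit["name"])
```

Using a single representative point is a simplification; for roads or large
areas the distance to the nearest edge may be a better ranking key.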

Regarding the mapping type, it would actually be helpful to have just one
type instead of several types for different use cases. It seems to me that
if you can do the above for polygons, doing it for circles is just a
variation of the same problem (simply convert the circle to a polygon).
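
Converting a circle to a polygon can be sketched as follows: walk around
the centre in equal bearing steps and compute each vertex with the standard
great-circle destination formula. The function name, the default of 32
segments, and the GeoJSON-like output layout are illustrative assumptions;
the sketch does not handle circles that cross the date line or a pole.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def circle_to_polygon(lat, lon, radius_m, segments=32):
    """Approximate a circle by a closed ring of [lon, lat] vertices."""
    lat1, lon1 = math.radians(lat), math.radians(lon)
    d = radius_m / EARTH_RADIUS_M  # angular distance in radians
    ring = []
    for i in range(segments):
        bearing = 2 * math.pi * i / segments
        lat2 = math.asin(math.sin(lat1) * math.cos(d)
                         + math.cos(lat1) * math.sin(d) * math.cos(bearing))
        lon2 = lon1 + math.atan2(
            math.sin(bearing) * math.sin(d) * math.cos(lat1),
            math.cos(d) - math.sin(lat1) * math.sin(lat2))
        ring.append([math.degrees(lon2), math.degrees(lat2)])
    ring.append(list(ring[0]))  # repeat the first vertex to close the ring
    return {"type": "polygon", "coordinates": [ring]}


# A 1 km circle around a hypothetical centre, as an octagon.
print(circle_to_polygon(52.5, 13.4, 1000, segments=8))
```

More segments give a closer approximation at the cost of a larger shape to
index or query with.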

Jilles

On Monday, March 4, 2013 1:12:26 PM UTC+1, uboness wrote:

[...]

Hi Jilles,
If you are relying strongly on Elasticsearch geo_shape queries being
accurate, please be aware that even "intersects" can return false
positives. The error rate is much smaller than it was for "within", but it
still happens: https://github.com/elasticsearch/elasticsearch/issues/2361

In my application, I end up post-filtering the candidates returned by
Elasticsearch against the query. It's not efficient at all (it would be more
efficient if done server-side in Elasticsearch using a WKB representation),
but it is necessary for our purposes.
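
As an illustration of this kind of client-side post-filtering (not Jeff's
actual code), here is a small Python sketch. It assumes the candidates are
points with hypothetical "lon" and "lat" keys, treats coordinates as planar,
and uses a plain ray-casting test; real shapes would call for a geometry
library.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the ring of [lon, lat] vertices?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > lat) != (y2 > lat):
            # Longitude at which this edge crosses that horizontal line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def post_filter(candidates, ring):
    """Keep only the candidates whose point really lies inside the query ring."""
    return [c for c in candidates if point_in_polygon(c["lon"], c["lat"], ring)]


# Hypothetical query polygon and candidates returned by the search.
square = [[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]]
candidates = [
    {"id": 1, "lon": 5, "lat": 5},   # true hit
    {"id": 2, "lon": 15, "lat": 5},  # false positive to be dropped
]
print([c["id"] for c in post_filter(candidates, square)])
```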

Jeff

On Monday, March 4, 2013 5:22:45 AM UTC-8, Jilles van Gurp wrote:

[...]

Thanks, good to know. In my case, I don't need 100% accuracy and can deal
with some error margins for things that are just outside or inside the
polygon.

I'm more worried about the intersects behavior for small polygons (e.g. a
building) not intersecting with polygons that fully contain them (e.g. the
neighborhood). See yesterday's "search point in polygon" thread for details
on that. Basically, intersects only works if the boundaries cross each
other. This seems very wrong to me.
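
To spell out the distinction, here is a small Python sketch using
axis-aligned boxes as stand-ins for the two polygons. It shows the usual
geometric meaning of "intersects", where full containment counts, next to a
stricter test that excludes containment. The Box type, function names, and
coordinates are made up for illustration; this says nothing about how
Elasticsearch or Lucene implement the relation.

```python
from collections import namedtuple

Box = namedtuple("Box", "min_lon min_lat max_lon max_lat")


def intersects(a, b):
    """True if the boxes share at least one point (containment included)."""
    return (a.min_lon <= b.max_lon and b.min_lon <= a.max_lon
            and a.min_lat <= b.max_lat and b.min_lat <= a.max_lat)


def contains(outer, inner):
    """True if inner lies entirely inside outer."""
    return (outer.min_lon <= inner.min_lon and inner.max_lon <= outer.max_lon
            and outer.min_lat <= inner.min_lat and inner.max_lat <= outer.max_lat)


def partial_overlap(a, b):
    """True if the boxes intersect but neither contains the other."""
    return intersects(a, b) and not contains(a, b) and not contains(b, a)


neighborhood = Box(13.0, 52.0, 14.0, 53.0)
building = Box(13.4, 52.5, 13.401, 52.501)

print(intersects(neighborhood, building))       # containment still intersects
print(partial_overlap(neighborhood, building))  # no partial overlap here
```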

Jilles

On Tuesday, March 5, 2013 1:07:44 AM UTC+1, Jeffrey Gerard wrote:

[...]