Internal implementation details when using geo hash

Todd_Nine · November 14, 2014, 4:29pm

Hey All,
I have a question about the internal implementation of geohashes and
distance filters. Here is my current understanding; I'm struggling to
figure out how these are applied to our queries internally in ES.

Bool queries are very efficient. Internally they
perform bitmap union, intersection, and subtraction for very fast candidate
aggregation per term.
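
To make that concrete, here is a minimal sketch of what I mean by bitmap logic. It is only an illustration of the idea, not ES or Lucene code: Python integers stand in for bitsets, and the terms and document ids are made up.

```python
# Sketch only: one bitset per term, where bit i is set when document i
# contains that term. Python ints stand in for the real bitmap structures.

def to_bitset(doc_ids):
    """Build a bitset (as an int) from an iterable of document ids."""
    bits = 0
    for doc_id in doc_ids:
        bits |= 1 << doc_id
    return bits


def to_doc_ids(bits):
    """List the document ids whose bit is set, in ascending order."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]


# Hypothetical per-term postings for six documents (ids 0..5).
postings = {
    "pizza": to_bitset([0, 2, 3, 5]),
    "open": to_bitset([2, 3, 4]),
    "closed_sunday": to_bitset([3]),
}

# must + must: intersection of the two bitsets.
both = postings["pizza"] & postings["open"]
# should + should: union of the two bitsets.
either = postings["pizza"] | postings["open"]
# must_not: subtraction, i.e. AND with the complement.
both_not_closed = both & ~postings["closed_sunday"]
```

Each clause costs one pass over the bitsets, no matter how the clauses are nested, which is why I expect the bool part of a query to be cheap.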

Geo distance filters are then run on the candidates produced by the
bitmap logic. Each document must be evaluated individually in memory.
Obviously, for large document sets coming out of the bitmap evaluation,
this is inefficient.

What happens when someone gives our application only a geo distance query?
To make this more efficient, I would like to use geohashing. ES seems to
have geohashing built in, but it's documented as a filter. For instance, I
envision the following workflow internally in ES:

1. User searches for all matches within 2k of their current location.
2. Use a geohash to create a hash that will encapsulate all points within
   2k of their current location.
3. Use the bool query with this geohash to narrow the candidate result set.
4. Apply the distance filter to these candidates to get more accurate
   results.
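
The workflow above can be sketched roughly as follows. This is my own illustration in plain Python, not how ES implements it. It reads "2k" as 2 km, handles neither the poles nor the antimeridian, and names such as `covering_prefixes`, `search`, and `DOCS` are hypothetical. The coarse stage picks the finest geohash precision whose cells are at least as large as the search radius, then hashes the centre and the eight edge and corner points of the circle's bounding box; those cells cover the circle. The exact stage is a plain haversine check.

```python
from math import asin, cos, degrees, radians, sin, sqrt

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"
EARTH_RADIUS_KM = 6371.0088


def geohash_encode(lat, lon, precision=12):
    """Standard geohash: interleave longitude and latitude bisection bits."""
    lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
    chars, value, bits, use_lon = [], 0, 0, True
    while len(chars) < precision:
        rng, coord = (lon_rng, lon) if use_lon else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = value * 2 + 1
            rng[0] = mid
        else:
            value *= 2
            rng[1] = mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # five bits make one base32 character
            chars.append(BASE32[value])
            value, bits = 0, 0
    return "".join(chars)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres."""
    p1, p2 = radians(lat1), radians(lat2)
    a = (sin((p2 - p1) / 2) ** 2
         + cos(p1) * cos(p2) * sin(radians(lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def covering_prefixes(lat, lon, radius_km, max_precision=12):
    """Geohash cells that together cover the circle's bounding box."""
    dlat = degrees(radius_km / EARTH_RADIUS_KM)
    # Widen the longitude span using the latitude farthest from the equator.
    dlon = dlat / cos(radians(min(abs(lat) + dlat, 89.0)))
    precision = 1
    for p in range(1, max_precision + 1):
        lon_bits, lat_bits = (5 * p + 1) // 2, (5 * p) // 2
        if 360.0 / 2 ** lon_bits >= dlon and 180.0 / 2 ** lat_bits >= dlat:
            precision = p  # finest precision whose cell is still >= radius
    return {geohash_encode(lat + dy, lon + dx, precision)
            for dy in (-dlat, 0.0, dlat) for dx in (-dlon, 0.0, dlon)}


def search(docs, lat, lon, radius_km):
    """Coarse geohash prefix match first, exact distance check second."""
    prefixes = tuple(covering_prefixes(lat, lon, radius_km))
    # A real index would do a term lookup here; this sketch scans a list.
    candidates = [d for d in docs if d["geohash"].startswith(prefixes)]
    return [d["id"] for d in candidates
            if haversine_km(lat, lon, d["lat"], d["lon"]) <= radius_km]


def make_doc(doc_id, lat, lon):
    return {"id": doc_id, "lat": lat, "lon": lon,
            "geohash": geohash_encode(lat, lon)}


# Made-up documents: one close to the query point, two far away.
DOCS = [make_doc("near", 52.53, 13.41),
        make_doc("far", 52.60, 13.405),
        make_doc("paris", 48.8566, 2.3522)]

hits = search(DOCS, 52.52, 13.405, 2.0)
```

The point of the two stages is that the prefix match is a string comparison that could be served by the term index, so only documents in the few covering cells would need the per-document distance calculation.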

However, according to the documentation on searching with geohashes, it's
still a filter. Internally, does it use geohashing and the fast bitmaps
(since it's a string match) and then filter, or is it all filters, with the
hash evaluated in memory for every document?

http://www.elasticsearch.org/guide/en/elasticsearch/reference/1.4/query-dsl-geohash-cell-filter.html
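
For reference, this is roughly the request I would expect to send, built as a Python dict so it can be serialized to JSON. Treat it as a sketch under assumptions: the field name `location` is hypothetical, the coordinates are made up, "2k" is read as 2 km, and the geo_point mapping would need `geohash_prefix` enabled for the `geohash_cell` filter to work. It only shows the shape of the request and says nothing about how ES evaluates it internally, which is exactly what I'm asking about.

```python
import json

# Hypothetical query point and geo_point field name.
CENTER = {"lat": 52.52, "lon": 13.405}
FIELD = "location"

query = {
    "query": {
        "filtered": {
            "query": {"match_all": {}},
            "filter": {
                "bool": {
                    "must": [
                        # Coarse stage: the cell containing the point plus
                        # its neighbours, at geohash length 5.
                        {"geohash_cell": {
                            FIELD: CENTER,
                            "precision": 5,
                            "neighbors": True,
                        }},
                        # Exact stage: true distance from the point.
                        {"geo_distance": {
                            "distance": "2km",
                            FIELD: CENTER,
                        }},
                    ]
                }
            },
        }
    }
}

body = json.dumps(query, indent=2)
```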

Thanks,
Todd

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/5ee6fb94-e0a2-4e39-8537-23c8c8f74fe0%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Rod_Simpson · November 19, 2014, 7:15pm

+1 any insights here would be most appreciated.


Shawn_Feldman · November 19, 2014, 8:09pm

would also like to know...


George_Reyes · November 21, 2014, 3:34pm

+1
Yeah, having geo hashing done internally would be awesome. Kinda confused
since it does only seem to be documented as a filter so any answer would be
helpful.

Thanks!

