Elasticsearch for geo-aware Wikipedia / OSM place search?


#1

I would like to implement a search backend that returns location-aware search results for all Wikipedia / OSM place names and recognises place names in all languages.

Thus, if you search for "Vienna" or "Wien" from Europe, it returns Vienna, Austria, but if you run the same search from within the US, near one of the many cities called "Vienna", that smaller city might rank above the European one.

So far, I believe it needs to do the following:

  1. Be geo-aware: for each search result it should return its distance from a query point.

  2. Handle the dozens of alternative names of places in a smart way. For example, "isafjo" should match "Ísafjörður" in an autocomplete.

  3. Cache/index the data so that search results can be returned in near real time, allowing an autocomplete experience on the client side.

Would that be a possible / recommended / straightforward application of Elasticsearch? The database would be almost entirely read-only, with write operations maybe only once a month.
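
To make requirements 1 and 2 concrete, here is a minimal in-memory Python sketch of the behaviour I have in mind. It is not Elasticsearch code. The `PLACES` sample, the `fold`, `haversine_km` and `suggest` helpers, and the approximate coordinates are all made up for illustration, and ranking purely by distance is a simplification (a real ranking would probably also weigh population or importance).

```python
import math
import unicodedata

# Approximate coordinates, for illustration only (hypothetical sample data).
PLACES = [
    {"name": "Vienna", "country": "AT", "alt_names": ["Wien", "Vienna"],
     "lat": 48.2082, "lon": 16.3738},
    {"name": "Vienna", "country": "US", "alt_names": ["Vienna"],
     "lat": 38.9012, "lon": -77.2653},
    {"name": "Ísafjörður", "country": "IS", "alt_names": ["Ísafjörður"],
     "lat": 66.0749, "lon": -23.1350},
]


def fold(text):
    """Lowercase the text and strip accents so 'Ísafjörður' becomes 'isafjordur'."""
    # 'ð' and 'þ' have no Unicode decomposition, so they are mapped by hand
    # (an incomplete, illustrative mapping).
    text = text.lower().replace("ð", "d").replace("þ", "th")
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def suggest(prefix, lat, lon, limit=5):
    """Return (name, country, distance_km) for places with a name starting with prefix."""
    needle = fold(prefix)
    hits = []
    for place in PLACES:
        if any(fold(name).startswith(needle) for name in place["alt_names"]):
            dist = haversine_km(lat, lon, place["lat"], place["lon"])
            hits.append((place["name"], place["country"], round(dist, 1)))
    # Nearest match first; this is the simplification mentioned above.
    return sorted(hits, key=lambda hit: hit[2])[:limit]
```

With this, `suggest("isafjo", 48.2, 16.4)` finds Ísafjörður, and `suggest("vienna", 38.9, -77.0)` lists the US city before the Austrian one because it is closer to the query point.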


(Mark Walkom) #2

It's definitely possible, but it would involve a bit of legwork with all those requirements.
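
As a rough idea of what that legwork could look like, below is a sketch of an index definition and a query body, written as plain Python dicts so they can be inspected without a cluster. It assumes a recent Elasticsearch version with typeless mappings (older releases use a different mapping syntax). The field names `names` and `location`, the analyzer names, the gram sizes and the `500km` decay scale are illustrative assumptions, not tuned values.

```python
import json

# Index body: accent-folded edge n-grams for autocomplete, plus a geo_point.
INDEX_BODY = {
    "settings": {
        "analysis": {
            "filter": {
                "autocomplete_edge": {"type": "edge_ngram", "min_gram": 2, "max_gram": 15}
            },
            "analyzer": {
                # Index time: fold accents, then emit prefixes of each token.
                "name_index": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "asciifolding", "autocomplete_edge"],
                },
                # Search time: fold accents only, so the typed prefix is matched as-is.
                "name_search": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "asciifolding"],
                },
            },
        }
    },
    "mappings": {
        "properties": {
            # One multi-valued field holding every alternative name of a place.
            "names": {"type": "text", "analyzer": "name_index",
                      "search_analyzer": "name_search"},
            "location": {"type": "geo_point"},
        }
    },
}


def build_query(text, lat, lon, size=10):
    """Build a search body that boosts nearby places and reports their distance."""
    origin = {"lat": lat, "lon": lon}
    return {
        "size": size,
        "query": {
            "function_score": {
                "query": {"match": {"names": {"query": text, "operator": "and"}}},
                # Gaussian decay: the score shrinks as distance from origin grows.
                "functions": [{"gauss": {"location": {"origin": origin, "scale": "500km"}}}],
                "boost_mode": "multiply",
            }
        },
        # Secondary geo sort; its value (km) is returned in each hit's "sort" array.
        "sort": [
            {"_score": {"order": "desc"}},
            {"_geo_distance": {"location": origin, "order": "asc", "unit": "km"}},
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_query("isafjo", 48.2, 16.4), indent=2))
```

The two analyzers are split on purpose: prefixes are generated only at index time, so a partial input such as "isafjo" is compared against stored prefixes rather than being broken into n-grams itself.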


(Mark Harwood) #3

Check out Pelias from the guys at Mapzen.com.
They use geonames data and Elasticsearch, and it is all open source (bankrolled by Samsung, see https://mapzen.com/about).


#4

Thanks. I've checked it out, and although their demo didn't work for any search term I tried and it has UTF-8 bugs, at least it is an open source project.

The reference I found is geonames.org's REST API; it's super reliable and has great fuzzy matching for names. I've found an old slide mentioning that they use Lucene, so I guess implementing it in Elasticsearch today would be the right choice.

