I am using Elasticsearch to search for events. Each event has an Address, City, State, and Zipcode. I was using a field in the event mapping called formatted_address, which held the whole address (street, city, state, and zipcode) combined in one field, and I was just doing a match against that field. But I soon found out that searching for:
319 Rebel Ridge, Hemphill, Texas
and searching for
319 Rebel Ridge, Hemphill, TX
returned different results, depending on how the user typed the address when creating the event.
So I started using Google Places to autosuggest the address, both when creating an event and when searching for one, to make the input more consistent. But I am still running into the problem that one address can be typed in different ways.
What is the best way to create a mapping for addresses that can deal with different forms of the same address, especially state abbreviations?
I was reading through the documentation and saw that cross_fields queries are best suited to this scenario.
Instead of normalizing the address into a single field, put each part (address, city, state, zipcode) in its own field and match against those fields. This lets you customize the query per field. For example, since Texas can also be written as TX, you can use a synonym file or list that tells Elasticsearch that Texas and TX are the same. See this document for how to use synonyms: https://www.elastic.co/guide/en/elasticsearch/guide/current/using-synonyms.html
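As a rough sketch of what that could look like, the Python below only builds the request bodies as plain dicts, so it runs without a cluster. It assumes a recent Elasticsearch version (7 or later, with typeless mappings and the `text` type). The analyzer name `address_analyzer`, the filter name `state_synonyms`, the lowercase field names, and the two-entry synonym list are all made up for illustration; they are not from your mapping. The same analyzer is applied to every address field because, as far as I know, `cross_fields` only treats fields as one combined field when they share an analyzer.

```python
import json

# Hypothetical field names; adjust to match the real event mapping.
ADDRESS_FIELDS = ["address", "city", "state", "zipcode"]


def build_index_body():
    """Index settings and mappings with a synonym-aware analyzer."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Illustrative list only; a real one would cover every state.
                    "state_synonyms": {
                        "type": "synonym",
                        "synonyms": ["tx, texas", "ca, california"],
                    }
                },
                "analyzer": {
                    "address_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        # Lowercase first so the synonym rules can be lowercase.
                        "filter": ["lowercase", "state_synonyms"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                field: {"type": "text", "analyzer": "address_analyzer"}
                for field in ADDRESS_FIELDS
            }
        },
    }


def build_address_query(user_input):
    """cross_fields query: each term may match in any of the address fields."""
    return {
        "query": {
            "multi_match": {
                "query": user_input,
                "type": "cross_fields",
                "fields": ADDRESS_FIELDS,
                "operator": "and",
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_index_body(), indent=2))
    print(json.dumps(build_address_query("319 Rebel Ridge, Hemphill, TX"), indent=2))
```

The first body would be sent when creating the index and the second as the search body. With the synonym filter in place, "TX" and "Texas" should analyze to equivalent terms, so both of your example searches should find the same event.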