Filter/sort by location when migrating from 1.7 to 2.4


(Tomas Stefano) #1

Hello all,

I am trying to upgrade an Elasticsearch app from 1.7.x to 2.4, but I cannot reproduce the same search results on the new version.

The first thing I noticed is that sorting by location does not work properly. I wonder whether this is related to the query itself, whether the sort node should be written differently, whether it is a bug, or whether I have misinterpreted the docs.

Structure

This is the structure of the Elasticsearch index:

The file elastic_search_mapping.json:

{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "firm_name_analyzer": { "tokenizer": "whitespace", "filter": "lowercase" }
        }
      }
    }
  },
  "mappings": {
    "firms": {
      "properties": {
        "registered_name": { "type": "string", "analyzer": "firm_name_analyzer" },
        "postcode_searchable": { "type": "boolean" },

        "advisers": {
          "type": "nested",
          "properties": {
            "location": { "type": "geo_point" },
            "range_location": { "type": "geo_shape" },
            "range": { "type": "integer" },
            "name": { "type": "string", "index": "not_analyzed" }
          }
        }
      }
    }
  }
}

Run with:

curl -XPOST http://127.0.0.1:9200/rad_development -d @elastic_search_mapping.json

Old Elasticsearch 1.7 query

This works great with Elasticsearch 1.7. The sort by location also worked fine
regardless of the filter/query terms. Here is the query:

curl -XPOST 'http://localhost:9200/rad_development/firms/_search?from=0' -d '
{
   "sort":[
      {
         "_geo_distance":{
            "advisers.location":[
               -0.1647512,
               51.548809
            ],
            "order":"asc",
            "unit":"miles"
         }
      },
      "registered_name"
   ],
   "query":{
      "filtered":{
         "filter":{
            "bool":{
               "must":[

               ]
            }
         },
         "query":{
            "bool":{
               "must":[
                  {
                     "match":{
                        "postcode_searchable":true
                     }
                  },
                  {
                     "nested":{
                        "path":"advisers",
                        "filter":{
                           "bool":{
                              "must":{
                                 "geo_shape":{
                                    "range_location":{
                                       "relation":"intersects",
                                       "shape":{
                                          "type":"point",
                                          "coordinates":[
                                             -0.1647512,
                                             51.548809
                                          ]
                                       }
                                    }
                                 }
                              },
                              "should":{
                                 "geo_distance":{
                                    "distance":"750miles",
                                    "location":[
                                       -0.1647512,
                                       51.548809
                                    ]
                                 }
                              }
                           }
                        }
                     }
                  }
               ]
            }
         }
      }
   }
}'

This returns the data filtered and sorted by location, as expected.

Trying to return the same result as the old Elasticsearch version

This is my attempt to migrate the old query above to an Elasticsearch 2.4 compatible one. This query returns results, but not exactly the ones expected. It appears that either the sort by distance is not working or the filter is not being applied properly. Anyway, this is my attempt:

curl -XPOST 'http://localhost:9200/rad_development/firms/_search?from=0' -d '
{
  "sort":[
    {
       "_geo_distance":{
         "advisers.location":[
           -0.1647512,
           51.548809
         ],
         "order":"asc",
         "unit":"mi"
      }
    }
  ],
  "query":{
     "bool":{
       "must":[
         {
           "match":{
             "postcode_searchable":true
           }
         },
         {
           "nested":{
             "path":"advisers",
               "filter":{
                 "bool":{
                   "must":{
                      "geo_shape":{
                         "advisers.range_location":{
                            "relation":"intersects",
                            "shape":{
                               "type":"point",
                               "coordinates":[
                                  -0.1647512,
                                  51.548809
                               ]
                            }
                         }
                      }
                   },
                   "should":{
                     "geo_distance":{
                       "distance":"750miles",
                       "advisers.location":[
                          -0.1647512,
                          51.548809
                       ]
                     }
                   }
                 }
               }
           }
         }
       ]
     }
  }
}'

Does anyone have thoughts or pointers on why the sort by location does not work at all on the new version? I followed the sort-by-location docs, but perhaps I am missing something...

Many thanks!


(Ali Beyad) #2

Hello,

Without knowing what your data is (even a small sample), it's really hard to say. Also, what results were you getting in 1.7, and how did they differ in 2.4?

The default geo_shape parameters in 1.x were not very good. The mapping defaulted to using the geo_hash prefix tree with a lot of slop, which ended up producing a lot of false positives. Are you sure the results you were getting in 1.7 aren't false positives?
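
To rule out precision differences between versions, the geo_shape field can be mapped with explicit prefix-tree settings instead of relying on the defaults. A minimal sketch, assuming the `firms` mapping from the first post; the `quadtree` and `10m` values are illustrative assumptions rather than recommendations, and a finer precision generally costs more index space:

```json
{
  "mappings": {
    "firms": {
      "properties": {
        "advisers": {
          "type": "nested",
          "properties": {
            "range_location": {
              "type": "geo_shape",
              "tree": "quadtree",
              "precision": "10m"
            }
          }
        }
      }
    }
  }
}
```

These settings are part of the field mapping, so trying them would most likely mean creating a fresh index and reindexing the data rather than updating the existing one.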


(Tomas Stefano) #3

Hi, thank you for the reply.

The results from 1.7 come back sorted by location (on the small data sample, the nearest ones first, and so on), while the results from 2.4 come back in an apparently random order (some far away, some near ...).

But even if 1.7 was returning false positives (which I do not think is the case), the 2.4 query results are not sorted at all.

I must confess that the explain output did not help me understand what is wrong. Perhaps that is due to my limited knowledge of Elasticsearch.
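
One thing that may be worth trying while narrowing this down: the Elasticsearch sort documentation describes a `nested_path` option for sorting on fields inside nested objects, and notes that nested sorting is also supported for geo distance sorts. Below is a sketch of the sort clause with the path named explicitly. It assumes that `advisers` is the right path and that the nearest adviser (`"mode": "min"`) should decide a firm's position; whether this accounts for the difference from 1.7 is not confirmed anywhere in this thread:

```json
{
  "sort": [
    {
      "_geo_distance": {
        "advisers.location": [-0.1647512, 51.548809],
        "order": "asc",
        "unit": "mi",
        "mode": "min",
        "nested_path": "advisers"
      }
    }
  ]
}
```

The `query` part of the request would stay as in the 2.4 attempt from the first post; only the `sort` element changes.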


(Ali Beyad) #4

Do you have a small sample data set with the improperly sorted results that you can provide? Otherwise it's really hard to know how ES sorted differently from what you expected.
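
A minimal reproduction could be as small as two or three documents shaped like the mapping in the first post. The document below is purely hypothetical (invented names and coordinates, not data from this thread), and representing `range_location` as a `circle` shape is an assumption about how the application stores an adviser's range:

```json
{
  "registered_name": "Example Firm Ltd",
  "postcode_searchable": true,
  "advisers": [
    {
      "name": "Example Adviser",
      "location": [-0.1278, 51.5074],
      "range": 25,
      "range_location": {
        "type": "circle",
        "coordinates": [-0.1278, 51.5074],
        "radius": "25mi"
      }
    }
  ]
}
```

Posting a few such documents together with the order each version returns them in would make it possible to compare the 1.7 and 2.4 behaviour directly.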

