Query DSL and scoring are not working


(pungent) #1

My cluster id is cabdbc

I have been trying to write a DSL query but cannot get it to work as expected. I am trying to find the documents in my Elasticsearch 2.3 index that match the following criteria, sorted from highest to lowest score:

  1. rents.start = "2016-09"
    AND
  2. rents.end = "2016-10"
    AND
  3. rents.abs is between 200000 and 350000
    AND
  4. los.start = "2016-09"
    AND
  5. los.max >= 4
    AND
  6. bedroom_count >= 2 and <= 20 <--- for each bedroom beyond 2, I want to cut (penalize) the relevance score by 8 points. E.g. if a document has bedroom_count=5, the score is reduced by (5-2)*8 = 24
    AND
  7. parking contains the word "street"
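
To make the penalty in item 6 concrete, here is a small sketch of the arithmetic in Python. The helper name `bedroom_penalty` and its parameter names are made up for illustration; only the numbers (2 bedrooms with no penalty, 8 points per extra bedroom) come from the list above. It shows the intended calculation only, not how Elasticsearch would apply it.

```python
def bedroom_penalty(bedroom_count, free_rooms=2, points_per_extra_room=8):
    """Points to subtract from the relevance score (hypothetical helper)."""
    # Only bedrooms beyond the first `free_rooms` are penalized.
    extra_rooms = max(0, bedroom_count - free_rooms)
    return extra_rooms * points_per_extra_room


# The example from item 6: bedroom_count=5 gives (5 - 2) * 8 points off.
print(bedroom_penalty(5))
```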

I believe #1 to #6 should be part of the filters, while #6 and #7 should be part of the query.

Document mapping is:

{
  "properties": {
    "bedroom_count": {"type": "integer"},
    "parking": {"type": "string"},
    "rents": {
      "type": "nested",
      "dynamic": "strict",
      "properties": {
        "start": { "type": "date",   "format": "yyyy-MM"  },
        "end":   { "type": "date",   "format": "yyyy-MM"  },
        "abs":   { "type": "integer"                    }
      }
    },
    "los": {
      "type": "nested",
      "dynamic": "strict",
      "properties": {
        "start":  { "type": "date",   "format": "yyyy-MM" },
        "max":    { "type": "integer"                     },
        "min":    { "type": "integer"                     }
      }
    }
  }
}

A sample doc:

{
  "bedroom_count": 4,
  "parking": "on the street",
  "rents": [
    {
      "abs": 250000,
      "start": "2016-09",
      "end": "2016-09"
    },
    {
      "abs": 250000,
      "start": "2016-09",
      "end": "2016-10"
    },
    {
      "abs": 250000,
      "start": "2016-09",
      "end": "2016-11"
    }
  ],
  "los": [
    {
      "min": 1,
      "max": 12,
      "start": "2016-09"
    },
    {
      "min": 1,
      "max": 12,
      "start": "2016-10"
    },
    {
      "min": 1,
      "max": 12,
      "start": "2016-11"
    }
  ]
}

And this is what I tried (please note that filtered is deprecated in 2.x; see https://www.elastic.co/blog/better-query-execution-coming-elasticsearch-2-0 ):

"query": {
  "bool": {
    "filter": {
      "nested": {
        "path": "rents",
        "score_mode": "none",
        "query": {
          "bool": {
            "must": [
              {
                "range": {
                  "rents.start": {"gte": "2016-09", "lte": "2016-09", } 
                }
              },
              {
                "range": {
                  "rents.end": {"gte": "2016-10", "lte": "2016-10", } 
                }
              },
              {
                "range": {
                  "rents.abs": {"gte": 200000, "lte": 350000}
                }
              },
              {
                "range": {
                  "los.start": {"gte": "2016-09", "lte": "2016-09"}
                }
              },
              {
                "range": {
                  "los.max": {"gte": 4}
                }
              }
            ]
          }
        }
      },
      "bool": {
        "range": {
          "bedroom_count": {
            "gte": 2, "lte": 20, "boost": 8
          }
        }
      }
    },
    "must": [
      {
        "match": { "parking":"street"}
      },
      {
        "range": {
          "bedroom_count": {
            "gte": 2, "lte": 20, "boost": 8
          }
        }
      }
    ]
  }
}

As you might expect, I am doing something wrong here, so it is not working at all. Any idea how to modify the above query to get it to work?

I get this error:

{
   "error": {
      "root_cause": [
         {
            "type": "parse_exception",
            "reason": "Failed to derive xcontent"
         }
      ],
      "type": "search_phase_execution_exception",
      "reason": "all shards failed",
      "phase": "query_fetch",
      "grouped": true,
      "failed_shards": [
         {
            "shard": 0,
            "index": "listings_v1",
            "node": "TopkjPF_S9ueyzpGbF6epg",
            "reason": {
               "type": "parse_exception",
               "reason": "Failed to derive xcontent"
            }
         }
      ]
   },
   "status": 400
}
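
Before looking at the query logic, it may be worth ruling out plain JSON syntax problems, since the query above has trailing commas after `"lte": "2016-09",` and `"lte": "2016-10",`. This is only a guess at one contributing factor, not a confirmed diagnosis of the error. A minimal sketch of such a check, using Python's standard `json` module (the helper name `json_error` is hypothetical):

```python
import json


def json_error(text):
    """Return a short description of the first JSON syntax error, or None."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return "line {}, column {}: {}".format(exc.lineno, exc.colno, exc.msg)
    return None


# One clause copied from the query above, including its trailing comma.
snippet = '{"range": {"rents.start": {"gte": "2016-09", "lte": "2016-09", }}}'
print(json_error(snippet))
```

Python's parser is strict about trailing commas, so a body that passes this check has at least no trailing-comma problem.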

(Mark Harwood) #2

Your nested clause is too high in the query tree.
Once you declare a nested part of a query, all of its subclauses must pertain to the same nested object.
In your example you have a single nested clause (with path rents) but mix in criteria on both the nested rents and los objects; no single nested object can satisfy all of those clauses.
It needs to be (pseudocode):

bool
  must
    nested
      bool
        must
          rent criteria 1
          rent criteria 2
    nested
      bool
        must
          los criteria 1
          los criteria 2
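
As a rough illustration of that shape, the sketch below builds a request body as a Python dict, using the field names and values from the original question. It is untested against a real cluster. The helper names `date_equals` and `nested_filter` are invented here, and putting the nested clauses under `filter` rather than `must` is an assumption based on the poster wanting those criteria to act as filters. The per-bedroom score penalty is not handled.

```python
import json


def date_equals(field, month):
    # Same gte/lte pair as in the original query, matching one yyyy-MM value.
    return {"range": {field: {"gte": month, "lte": month}}}


def nested_filter(path, clauses):
    # One nested clause per path; every sub-clause must refer to that path.
    return {"nested": {"path": path, "query": {"bool": {"must": clauses}}}}


body = {
    "query": {
        "bool": {
            # Scored part: parking must mention "street".
            "must": [{"match": {"parking": "street"}}],
            # Non-scoring part: one nested clause each for rents and los.
            "filter": [
                nested_filter("rents", [
                    date_equals("rents.start", "2016-09"),
                    date_equals("rents.end", "2016-10"),
                    {"range": {"rents.abs": {"gte": 200000, "lte": 350000}}},
                ]),
                nested_filter("los", [
                    date_equals("los.start", "2016-09"),
                    {"range": {"los.max": {"gte": 4}}},
                ]),
                {"range": {"bedroom_count": {"gte": 2, "lte": 20}}},
            ],
        }
    }
}

# Serialize to the JSON that would be sent as the search request body.
print(json.dumps(body, indent=2))
```

Because each nested clause names only fields under its own path, the rents criteria are checked against a single rents object and the los criteria against a single los object.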

(pungent) #3

Thanks @Mark_Harwood. I got it working after implementing your suggestion!

