Boolean must/should where field is optional in some documents


(Nick Dunn) #1

I can't quite get my head around a specific search requirement I have, so I
hope someone can help.

My documents have a field "published" with a string value "yes" or "no"
(not boolean true/false, indexed data out of my control at present!). This
defines whether the document has been published or not, obviously. I have a
fairly large boolean query of a series of "must" sub-queries, of which one
is on this published field, ensuring that only those "yes" are returned.

However I've just realised that this field is optional in the document.
If it doesn't have a yes/no value, then the assumption is that the document
is published ("yes") and should be returned. So in essence I want to add a
must clause for this field only when it exists and has a value. Is this
even possible? Should I perhaps use a filter here?

Any ideas gratefully received.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Ivan Brusic) #2

You can work around your problem by using the exists and/or missing filter.

If the default for the published field is "yes", then you can use an Or
filter with one clause being a term filter for "yes" and the other a
missing filter.

http://www.elasticsearch.org/guide/reference/query-dsl/missing-filter/
http://www.elasticsearch.org/guide/reference/query-dsl/exists-filter/
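
As a rough sketch of the suggestion above, applied to the "published" field from the first post: the Python snippet below only builds the request body as a dict and prints it as JSON. The `filtered` query and the `or`, `term` and `missing` filters are the pre-2.x DSL that also appears later in this thread. The `match_all` query is a placeholder assumption standing in for the existing "must" sub-queries, and `build_published_query` is a hypothetical helper name.

```python
import json


def build_published_query(inner_query=None):
    """Build a pre-2.x 'filtered' query: published == "yes" OR field missing."""
    if inner_query is None:
        # Placeholder assumption: stands in for the existing bool/must query.
        inner_query = {"match_all": {}}
    return {
        "query": {
            "filtered": {
                "query": inner_query,
                "filter": {
                    "or": {
                        "filters": [
                            # Documents explicitly marked as published.
                            {"term": {"published": "yes"}},
                            # Documents with no value for the field at all.
                            {"missing": {"field": "published"}},
                        ]
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_published_query(), indent=2))
```

Keeping the check in the filter part means it does not affect scoring of the main query.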

--
Ivan

On Tue, May 28, 2013 at 2:31 AM, Nick Dunn nick@nick-dunn.co.uk wrote:



(Javier) #3

Hi, I found this old posting, but it is relevant again with the new Boolean queries in ES 2.x. I have a similar situation where I need to search for documents in which a specific field may be missing, but if it is present, it needs to have a specific value. My old query using the (now deprecated) OR filter looks like this:

{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "(title:mykeyword body:mykeyword)"
        }
      },
      "filter": {
        "or": {
          "filters": [
            {
              "missing": {
                "field": "bookId"
              }
            },
            {
              "term": {
                "bookId": 2
              }
            }
          ]
        }
      }
    }
  }
}

How can I replicate this same behavior with a boolean query that does not have OR clauses?

Thanks a lot!

Javier C.


(Javier) #4

Update: I tried the following, but it does not work ;(

{
  "bool" : {
    "must" : {
      "query_string" : {
        "query": "(title:mykeyword body:mykeyword)"
      }
    },
    "filter" : {
      "bool" : {
        "must" : {
          "term" : {
            "bookId" : 2
          }
        },
        "must_not" : {
          "exists" : {
            "field" : "bookId"
          }
        }
      }
    }
  }
}

(Camilo Sierra) #5

You can replace the or filter with should clauses inside a bool filter. Hope it helps!

{
  "query": {
    "bool": {
      "must": {
        "query_string": {
          "query": "(title:mykeyword body:mykeyword)"
        }
      },
      "filter": {
        "bool": {
          "should": [
            {
              "missing": {
                "field": "bookId"
              }
            },
            {
              "term": {
                "bookId": 2
              }
            }
          ]
        }
      }
    }
  }
}

(Javier) #6

Thanks @Camilo_Sierra, but the "missing" query is also deprecated in ES 2.2 ("Deprecated in 2.2.0. Use exists query inside a must_not clause instead."). So, based on your example, I tried something like this:

{
  "query": {
    "bool": {
      "must": {
        "query_string": {
          "query": "(title:mykeyword body:mykeyword)"
        }
      },
      "filter": {
        "bool": {
          "should": [
            {
              "must_not": {
                "exists": {
                  "field": "bookId"
                }
              }
            },
            {
              "term": {
                "bookId": 2
              }
            }
          ]
        }
      }
    }
  }
}

and it gave me this error:

QueryParsingException[No query registered for [must_not]]

Thanks again!


(Camilo Sierra) #7

You need to wrap the must_not in its own bool query, since must_not is only valid as a clause inside bool ;)

{
  "query": {
    "bool": {
      "must": {
        "query_string": {
          "query": "(title:mykeyword body:mykeyword)"
        }
      },
      "filter": {
        "bool": {
          "should": [
            {
              "bool": {
                "must_not": {
                  "exists": {
                    "field": "bookId"
                  }
                }
              }
            },
            {
              "term": {
                "bookId": 2
              }
            }
          ]
        }
      }
    }
  }
}
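
To see why this shape behaves like the old `or` filter, here is a small Python sketch. `optional_term_filter` is a hypothetical helper that builds the same filter for any field, and `matches` is a toy in-memory evaluator written only for this illustration: it is not Elasticsearch code and handles only `bool`, `term` and `exists`. It assumes that a bool with nothing but `should` clauses needs at least one of them to match.

```python
def optional_term_filter(field, value):
    """Filter matching docs where `field` is absent OR equals `value`."""
    return {
        "bool": {
            "should": [
                {"bool": {"must_not": {"exists": {"field": field}}}},
                {"term": {field: value}},
            ]
        }
    }


def as_list(clauses):
    """Bool clauses may be a single object or a list; normalise to a list."""
    if clauses is None:
        return []
    return clauses if isinstance(clauses, list) else [clauses]


def matches(query, doc):
    """Toy evaluator for bool/term/exists against a plain dict document."""
    kind, body = next(iter(query.items()))
    if kind == "term":
        field, value = next(iter(body.items()))
        return doc.get(field) == value
    if kind == "exists":
        return doc.get(body["field"]) is not None
    if kind == "bool":
        must = as_list(body.get("must")) + as_list(body.get("filter"))
        should = as_list(body.get("should"))
        must_not = as_list(body.get("must_not"))
        if not all(matches(q, doc) for q in must):
            return False
        if any(matches(q, doc) for q in must_not):
            return False
        # Assumption: with no must/filter clauses, one should clause must match.
        if should and not must:
            return any(matches(q, doc) for q in should)
        return True
    raise ValueError("unsupported query type: " + kind)


docs = [{"title": "a"}, {"title": "b", "bookId": 2}, {"title": "c", "bookId": 3}]

working = optional_term_filter("bookId", 2)
# The attempt from post #4: must term AND must_not exists on the same field.
contradictory = {
    "bool": {
        "must": {"term": {"bookId": 2}},
        "must_not": {"exists": {"field": "bookId"}},
    }
}

print([d["title"] for d in docs if matches(working, d)])        # ['a', 'b']
print([d["title"] for d in docs if matches(contradictory, d)])  # []
```

The attempt from post #4 returns nothing because a document cannot both have `bookId` equal to 2 and have no `bookId` at all. The same helper should cover the original "published" case as `optional_term_filter("published", "yes")`.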

(Javier) #8

And... it worked! :)

Thanks a lot!!!

Javier C.


(Camilo Sierra) #9

awesome!!

