How to: Elastic Search - Efficient/Compact “Not In” Operator

So I wanted the equivalent of doing an sql not in query such as this:

select * from test where id not in (1,2,3);
Initially I was thinking more in SQL, and it took me an hour to understand the philosophy behind Elasticsearch, the query builder, and the underlying rest API.

I have come up with the following which however seems very inefficient:

public <T> MyQueryBuilder notIn(String field, List<String> values) {
        if (values != null & values.size() > 0) {

            Iterator<String> it = values.iterator();

            while(it.hasNext()){
                MatchQueryBuilder match =  new MatchQueryBuilder(field, it.next());
                match.operator(Operator.OR);
                boolQueryBuilder.mustNot(match);
            }
        }
        return this;
    }

And this generates the following rest call body:

{
  "bool" : {
    "must_not" : [ {
      "match" : {
        "myId" : {
          "query" : "b5359d78-5e0e-4b09-af2c-d4c921ea6a15",
          "operator" : "OR"
        }
      }
    }, {
      "match" : {
        "myId" : {
          "query" : "e679ce6d-a7a1-48a0-b06e-54752fbd7e31",
          "operator" : "OR"
        }
      }
    } ]
  }
}

So is there a better more efficient way? Update: Efficient might have been the wrong word. A more concise/compact expression....

Thanks!

Update

I also tried the following but it does not work like an sql not in expression:

{ "query" : {
"bool" : {
"must_not" : {
"bool": {
"filter": [
{
"terms": {
"myId": [
"c1928da6-80d0-475a-9025-b3ebd934a576" ,
"d2",
"d3",
"d4"
]
}
}
]
}
},
"must_not" : {
"match" : {
"myId" :
}
}
}
}
}
Update 2

I think the following actually works, and works better:

{    "query" : {
        "bool" : {
            "must_not" : {
                      "bool": {
                            "filter": [
                              {
                                "terms": {
                                  "myId": [
                                     "c1928da6-80d0-475a-9025-b3ebd934a576" ,
                                    "d2",
                                    "d3",
                                    "d4"
                                  ]
                                }
                              }
                            ]
                          }
            }
        }
    }
}

This does not seem to work even though the notation seems to be more concise .

public <T> CoolQueryBuilder notInV2(String field, List<String> values) {
    if (values != null & values.size() > 0) {
        //boolQuery() = constructor for boolean query
        boolQueryBuilder.mustNot(boolQuery().filter(termsQuery(field, values)));
    }
    return this;
}

Hi @menelaosbgr,

You don't need that inner filter, you can use must_not only. People coming from SQL also usually disregard the fact that fields are analyzed by default in elasticsearch. To filter ID probably you might be interested in use keyword type in your field.

The complete example:

//create index mapping myId of type keyword
PUT not_in_example
{
  "mappings": {
    "doc": {
      "properties": {
        "myId": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    }
  }
}

POST not_in_example/doc/1
{
  "myId": "d0"
}

POST not_in_example/doc/2
{
  "myId": "d2"
}

POST not_in_example/doc/3
{
  "myId": "c1928da6-80d0-475a-9025-b3ebd934a576"
}

Search/Result:

GET not_in_example/_search
{
  "query": {
    "bool": {
      "must_not": {
        "terms": {
          "myId": [
            "c1928da6-80d0-475a-9025-b3ebd934a576",
            "d2",
            "d3",
            "d4"
          ]
        }
      }
    }
  }
}

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "not_in_example",
        "_type": "doc",
        "_id": "1",
        "_score": 1,
        "_source": {
          "myId": "d0"
        }
      }
    ]
  }
}

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.