Elasticsearch bool query formation with multiple must clauses


(Milind Karandikar) #1

I have a query like the following -

    {
        "query": {
            "bool": {
                "must": {
                    "bool": { "should": [
                        { "match": { "camp_id": "Elasticsearch" }},
                        { "match": { "camp_id": "Solr" }} ] }
                },
                "must": {
                    "bool": { "should": [
                        { "match": { "ad_id": "Elastic" }},
                        { "match": { "ad_id": "dummy" }} ] }
                },
                "must_not": { "match": { "authors": "radu gheorge" }},
                .....
                .....
            }
        }
    }

In short: (camp_id = 'Elasticsearch' OR camp_id = 'Solr') AND (ad_id = 'Elastic' OR ad_id = 'dummy') ....

After a good amount of research, I wrote the following Java code:

    import org.elasticsearch.index.query.BoolQueryBuilder;
    import org.elasticsearch.index.query.QueryBuilders;
    import org.elasticsearch.search.builder.SearchSourceBuilder;

    final SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

    // Outer bool query that combines the per-field clauses.
    final BoolQueryBuilder finalBoolQuery = QueryBuilders.boolQuery();

    // camp_id matches any of the given campaign ids (OR via should).
    if (campaignIds != null) {
        final BoolQueryBuilder campaignBoolQuery = QueryBuilders.boolQuery();
        for (int campaignId : campaignIds) {
            campaignBoolQuery.should(QueryBuilders.matchQuery("camp_id", campaignId));
        }
        // Added inside the null check so a null clause is never passed to must().
        finalBoolQuery.must(campaignBoolQuery);
    }

    // ad_id matches any of the given creative ids (OR via should).
    if (creativeIds != null) {
        final BoolQueryBuilder creativeBoolQuery = QueryBuilders.boolQuery();
        for (int creativeId : creativeIds) {
            creativeBoolQuery.should(QueryBuilders.matchQuery("ad_id", creativeId));
        }
        finalBoolQuery.must(creativeBoolQuery);
    }

    searchSourceBuilder.query(finalBoolQuery).size(9999);

    System.out.println(searchSourceBuilder.toString());

With the above code, I expected to get one must clause for 'camp_id' and another one for 'ad_id', but this is what I got:

    {
      "size" : 9999,
      "query" : {
        "bool" : {
          "must" : [
            {
              "bool" : {
                "should" : [
                  {
                    "match" : {
                      "camp_id" : {
                        "query" : 1,
                        "operator" : "OR",
                        "prefix_length" : 0,
                        "max_expansions" : 50,
                        "fuzzy_transpositions" : true,
                        "lenient" : false,
                        "zero_terms_query" : "NONE",
                        "boost" : 1.0
                      }
                    }
                  },
                  {
                    "match" : {
                      "camp_id" : {
                        "query" : 2,
                        "operator" : "OR",
                        "prefix_length" : 0,
                        "max_expansions" : 50,
                        "fuzzy_transpositions" : true,
                        "lenient" : false,
                        "zero_terms_query" : "NONE",
                        "boost" : 1.0
                      }
                    }
                  }
                ],
                "disable_coord" : false,
                "adjust_pure_negative" : true,
                "boost" : 1.0
              }
            },
            {
              "bool" : {
                "should" : [
                  {
                    "match" : {
                      "ad_id" : {
                        "query" : 1,
                        "operator" : "OR",
                        "prefix_length" : 0,
                        "max_expansions" : 50,
                        "fuzzy_transpositions" : true,
                        "lenient" : false,
                        "zero_terms_query" : "NONE",
                        "boost" : 1.0
                      }
                    }
                  }
                ],
                "disable_coord" : false,
                "adjust_pure_negative" : true,
                "boost" : 1.0
              }
            }
          ],
          "disable_coord" : false,
          "adjust_pure_negative" : true,
          "boost" : 1.0
        }
      }
    }

There is only one must clause, and it wraps both camp_id and ad_id. Can someone please point out what I am missing? I am using Elasticsearch 5.5.0 with Jest 2.4.0 as my Java client.


(Milind Karandikar) #2

I am using Cerebro as a UI to access the Elasticsearch cluster. When I ran the query from the REST client provided by Cerebro, it returned an error stating: Duplicate must clause. Is it the case that we can't have multiple must clauses? If so, there should still be a way to express a query like SELECT * FROM campaign WHERE id IN (1,2,3) AND name IN ('a', 'b', 'c'). Can someone help me out?


#3

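You cannot have multiple must clauses in the same bool query, but you can put several queries into a single must clause as an array, separated by commas. Every query in the array has to match, so it behaves like an AND. The output of your Java code already has this shape. Something like this: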

    {
        "query": {
            "bool": {
                "must": [
                    { "match": { "field1": "value1" }},
                    { "match": { "field2": "value2" }}
                ]
            }
        }
    }
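
If you build the request body as a string (Jest accepts a JSON string for a search), the same idea can be sketched in plain Java without any Elasticsearch classes. The class BoolMustSketch and its two helpers are made up for illustration and are not part of Elasticsearch or Jest. The sketch also swaps the should/match pairs for a terms query, which is the usual counterpart of SQL IN. That is a substitution on my part, not something from the posts above, and it assumes the ids are indexed as exact values and that the field names need no JSON escaping.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only: it just assembles JSON text.
final class BoolMustSketch {

    // Renders {"terms":{"<field>":[v1,v2,...]}}: the field must equal any one
    // of the values (roughly SQL's "field IN (v1, v2, ...)").
    static String termsClause(String field, List<Integer> values) {
        String joined = values.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(","));
        return "{\"terms\":{\"" + field + "\":[" + joined + "]}}";
    }

    // Puts all clauses into ONE "must" array, so the key appears only once
    // and every clause has to match (AND).
    static String boolMust(List<String> clauses) {
        return "{\"query\":{\"bool\":{\"must\":["
                + String.join(",", clauses)
                + "]}}}";
    }

    public static void main(String[] args) {
        // (camp_id IN (1, 2)) AND (ad_id IN (1))
        String query = boolMust(Arrays.asList(
                termsClause("camp_id", Arrays.asList(1, 2)),
                termsClause("ad_id", Arrays.asList(1))));
        System.out.println(query);
    }
}
```

Adding a third condition is then just one more element in the list passed to boolMust, rather than another must key.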

