Query ElasticSearch for documents with terms matching exclusively

I was porting a SQL query on ElasticSearch.

The query needs to substitute an IN clause.

Following this link I implemented the IN this way:

{
  "size": 1,
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            {
              "terms": {
                "products.flights.legs.hops.hopFlight.airlineId": [
                  "ib",
                  "lh"
                ]
              }
            }
          ]
        }
      }
    }
  }
}

It is working but probably I have a special case: you see, the hops field in the document is an array, so for a flight with one stop we have two hops.

In that case this query works even when just one of the two hops has ib or lh. It matches a document even when only one of the two hops has one of the airlines and the other hop is a different airline, not include in my terms.

I actually want to return only documents that, as airlineId, have only ib or lh or a combination of both.

Is it possible to do it?

I don't quite understand what exactly you are trying to achieve and without seeing your mapping and test data it's a bit difficult to show a concrete solution, but take a look at the terms set query. I might be wrong, but I think this is what you are looking for.

Hello Igor,
I'll try to be more clear. Let's say I have these documents:

Document A

{
  "group" : "fans",
  "hops" : [ 
    {
      "airlineId" : "ih",
      "last" :  "Smith"
    },
    {
      "airlineId" : "u2",
      "last" :  "White"
    }
  ]
}

Document B

{
  "group" : "fans",
  "user" : [ 
    {
      "airlineId" : "lh",
      "last" :  "Smith"
    },
    {
      "airlineId" : "lh",
      "last" :  "White"
    }
  ]
}

Document C

{
  "group" : "fans",
  "user" : [ 
    {
      "airlineId" : "ib",
      "last" :  "Smith"
    },
    {
      "airlineId" : "lh",
      "last" :  "White"
    }
  ]
}

If I execute the query

{
  "size": 3,
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            {
              "terms": {
                "products.flights.legs.hops.hopFlight.airlineId": [
                  "ib",
                  "lh"
                ]
              }
            }
          ]
        }
      }
    }
  }
}

the result will be: Document A, Document B and Document C.

What I actually want is just Document B and Document C because they include hops with just ib, lh or both of them, the document with u2 it's not correct for my use case.

I think that the problem is in the index I'm using which is:

"airlineId": {
    "type": "string",
    "analyzer": "just_lowercase",
    "ignore_above": 10922
    }

which is simple string, but at the moment I cannot change it.

You will probably need to add an additional field with a de-duped list of airline ids from all hops and then use terms set query on it.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.