Search for documents matching all terms in a nested array


(Mayur Rao) #1

I am learning to use Elasticsearch as a basic recommender engine. My documents contain nested objects, as follows:

PUT recs/user/1
{
  "name" : "Brad Pitt",
  "movies_liked": [
    {
      "name": "Forrest Gump",
      "score": 1
    },
    {
      "name": "Terminator",
      "score": 4
    },
    {
      "name": "Rambo",
      "score": 4
    },
    {
      "name": "Rocky",
      "score": 4
    },
    {
      "name": "Good Will Hunting",
      "score": 2
    }
  ]
}

PUT recs/user/2
{
  "name" : "Tom Cruise",
  "movies_liked": [
    {
      "name": "Forrest Gump",
      "score": 2
    },
    {
      "name": "Terminator",
      "score": 1
    },
    {
      "name": "Rocky IV",
      "score": 1
    },
    {
      "name": "Rocky",
      "score": 1
    },
    {
      "name": "Rocky II",
      "score": 1
    },
    {
      "name": "Predator",
      "score": 4
    }
  ]
}

The mapping is:

{
  "mappings": {
    "user": {
      "properties": {
        "name": {
          "type": "text",
          "analyzer": "standard",
          "search_analyzer": "standard"
        },
        "movies_liked": {
          "type": "nested",
          "properties": {
            "name": { "type": "keyword" },
            "score": { "type": "double" }
          }
        },
        "required_matches": { "type": "long" }
      }
    }
  }
}

I would like to find the users who like all of "Forrest Gump", "Terminator", and "Rambo".

I have used a nested query, which currently looks like this:

POST recs/user/_search
{
  "query": {
    "nested": {
      "path": "movies_liked",
      "query": {
        "terms": {
          "movies_liked.name": ["Forrest Gump", "Terminator", "Rambo"]
        }
      }
    }
  }
}

When I run this search, I expect to get only the first record, which contains all the required terms, but both records come back. The second user clearly does not have "Rambo" in his liked list. I understand that this query performs an "OR" over the given terms. How do I change it to perform an "AND", so that only records containing all the terms match?

P.S. I also tried a bool query with one term clause per movie inside must, as follows, but it returns no results:

POST recs/user/_search
{
  "query": {
    "nested": {
      "path": "movies_liked",
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "movies_liked.name": "Forrest Gump"
              }
            },
            {
              "term": {
                "movies_liked.name": "Rambo"
              }
            },
            {
              "term": {
                "movies_liked.name": "Terminator"
              }
            }
          ]
        }
      }
    }
  }
}
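
A note on why the two attempts above behave as they do: a nested query runs its inner query against each nested object separately, and the parent document matches if at least one object matches. A terms query therefore matches any user who likes at least one listed movie, while a bool/must of three term clauses inside a single nested query asks for one object whose name equals all three titles at once, which no object can satisfy. The Python sketch below imitates that per-object evaluation on the two sample users. It is only a toy model of the semantics, not Elasticsearch code, and the function names are made up for illustration.

```python
# Toy model of nested-query matching on the two sample users.
# This is NOT Elasticsearch code; the function names are hypothetical.
users = {
    "Brad Pitt": ["Forrest Gump", "Terminator", "Rambo", "Rocky", "Good Will Hunting"],
    "Tom Cruise": ["Forrest Gump", "Terminator", "Rocky IV", "Rocky", "Rocky II", "Predator"],
}
wanted = ["Forrest Gump", "Terminator", "Rambo"]


def nested_terms(movies, terms):
    # "terms" inside one nested query: some nested object has a name
    # equal to ANY of the listed terms.
    return any(name in terms for name in movies)


def nested_bool_must(movies, terms):
    # bool/must inside one nested query: a SINGLE nested object would
    # have to equal every term at the same time.
    return any(all(name == term for term in terms) for name in movies)


def must_of_nested(movies, terms):
    # One nested query per term, combined with must at the top level:
    # each term may be satisfied by a different nested object.
    return all(any(name == term for name in movies) for term in terms)


results = {
    fn.__name__: [user for user, movies in users.items() if fn(movies, wanted)]
    for fn in (nested_terms, nested_bool_must, must_of_nested)
}
for label, matched in results.items():
    print(label, matched)
```

In this model only the third form, one nested query per title, singles out the first user.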

(Makoto Nozawa) #2

Hi,

How about a query like this?

GET recs/user/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "movies_liked",
            "query": {
              "term": {
                "movies_liked.name": {
                  "value": "Forrest Gump"
                }
              }
            }
          }
        },
        {
          "nested": {
            "path": "movies_liked",
            "query": {
              "term": {
                "movies_liked.name": {
                  "value": "Rambo"
                }
              }
            }
          }
        },
        {
          "nested": {
            "path": "movies_liked",
            "query": {
              "term": {
                "movies_liked.name": {
                  "value": "Terminator"
                }
              }
            }
          }
        }
      ]
    }
  }
}
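
If the list of movies varies, the same request body can be generated rather than written by hand. Below is a minimal Python sketch that builds one nested term clause per title. The helper name all_movies_query and its parameters are assumptions for illustration, and sending the body to the cluster is left to whichever client you use.

```python
import json


def all_movies_query(names, path="movies_liked", field="name"):
    """Build a bool/must body with one nested term query per movie name.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    return {
        "query": {
            "bool": {
                "must": [
                    {
                        "nested": {
                            "path": path,
                            "query": {
                                "term": {f"{path}.{field}": {"value": name}}
                            },
                        }
                    }
                    for name in names
                ]
            }
        }
    }


body = all_movies_query(["Forrest Gump", "Rambo", "Terminator"])
# Print the JSON that would be sent as the search request body.
print(json.dumps(body, indent=2))
```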

Regards,

Makoto


(Mayur Rao) #3

Thank you very much. It did not occur to me to use a separate nested query within each must clause. This is working!

