Elasticsearch 2.x: more_like_this query and nested objects

I just discovered the "more_like_this" query type and tried to use it with my nested objects. Unfortunately, it seems this query is not able to search inside nested objects. Here is my mapping:

{
   "Presentation":{
      "properties":{
         "id":{
            "include_in_all":false,
            "type":"string"
         },
         "title":{
            "include_in_all":true,
            "type":"string"
         },
         "description":{
            "include_in_all":true,
            "type":"string"
         },
         "categories":{
            "properties":{
               "id":{
                  "include_in_all":false,
                  "type":"string"
               },
               "category":{
                  "include_in_all":true,
                  "type":"string"
               },
               "category_suggest":{
                  "properties":{
                     "input":{
                        "type":"string"
                     },
                     "payload":{
                        "properties":{
                           "id":{
                              "type":"long"
                           }
                        }
                     }
                  }
               }
            },
            "type":"nested"
         }
      }
   }
}

My goal is to find all presentations related to the one with id "96", and to boost those that share a category with it. But when I execute the query below, Elasticsearch only computes the score from the "title" and "description" fields (and does not look at "category").

{
  "size": 4,
  "query": {
    "more_like_this": {
      "like": [
        {
          "_index": "client",
          "_type": "Presentation",
          "_id": "96"
        }
      ],
      "min_term_freq": 1,
      "max_query_terms": 35,
      "min_word_length": 3,
      "minimum_should_match": "1%"
    }
  }
} 

I also tried to force the query onto the nested field, but that does not work either:

{
  "size": 4,
  "query": {
    "bool": {
      "should": [
        {
          "more_like_this": {
            "like": [
              {
                "_index": "client",
                "_type": "Presentation",
                "_id": "96"
              }
            ],
            "min_term_freq": 1,
            "max_query_terms": 35,
            "min_word_length": 3,
            "minimum_should_match": "1%"
          }
        },
        {
          "nested": {
            "path": "categories",
            "query": {
              "more_like_this": {
                "like": [
                  {
                    "_index": "client",
                    "_type": "Presentation",
                    "_id": "96"
                  }
                ],
                "min_term_freq": 1,
                "max_query_terms": 35,
                "min_word_length": 3,
                "minimum_should_match": "1%"
              }
            }
          }
        }
      ]
    }
  }
}
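
One workaround that might be worth trying, as a rough sketch only (not verified against a 2.x cluster): keep more_like_this on the top-level "title" and "description" fields, and add the category boost as plain nested match clauses. This assumes the category names of document "96" are fetched beforehand with a normal GET. The helper name build_related_query, the boost value and the example category names are hypothetical.

```python
import json


def build_related_query(doc_id, categories, index="client",
                        doc_type="Presentation", boost=2.0, size=4):
    """Build a bool query: MLT on top-level fields plus nested category boosts.

    `categories` is the list of category names of the source document,
    assumed to have been read from its _source beforehand.
    """
    mlt = {
        "more_like_this": {
            "fields": ["title", "description"],
            "like": [{"_index": index, "_type": doc_type, "_id": doc_id}],
            "min_term_freq": 1,
            "max_query_terms": 35,
            "min_word_length": 3,
            "minimum_should_match": "1%",
        }
    }
    # One nested clause per category; each matching clause adds to the score.
    category_boosts = [
        {
            "nested": {
                "path": "categories",
                "query": {
                    "match": {
                        "categories.category": {"query": name, "boost": boost}
                    }
                },
            }
        }
        for name in categories
    ]
    return {
        "size": size,
        "query": {"bool": {"must": [mlt], "should": category_boosts}},
    }


if __name__ == "__main__":
    # "Marketing" and "Sales" are made-up category names for illustration.
    print(json.dumps(build_related_query("96", ["Marketing", "Sales"]), indent=2))
```

The printed JSON can be sent as the body of a regular _search request. Since the nested clauses sit under "should", presentations without a shared category still match through the MLT part and are simply ranked lower.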

I found someone with the same issue on an older version of Elasticsearch: http://stackoverflow.com/questions/27961412/elasticsearch-more-like-this-api-and-nested-object-properties Unfortunately, none of the answers there works with ES 2.x (except flattening the entire index, which I can't do).

Does anyone have an idea about this (strange) issue? Thanks :)

I've been struggling with this as well, so I thought I'd add my observations :) Hopefully we can find an answer soon!

I noticed that /index/type/_id/_termvectors doesn't return anything for fields inside nested documents. I then tried storing the subdocuments as a plain (non-nested) array of objects instead, and term vectors did show up for the field. However, MLT still didn't work. It only seems to work if the field is one big blob of text to be analysed.

Hi there!

I came here to ask the same question.

I tried the following syntax:

{
  "query": {
    "nested": {
      "path": "article",
      "query": {
        "mlt": {
          "fields": ["article.title.search", "article.text.search"],
          "max_query_terms": 20,
          "min_term_freq": 1,
          "include": "false",
          "like": [
            {
              "_index": "myindex",
              "_type": "event",
              "doc": {
                "article": {
                  "title": "this is the title",
                  "text": "this is the body of the article"
                }
              }
            }
          ]
        }
      }
    }
  }
}

But it always returns 0 hits.
Is there a known limitation with nested objects here, or is this a bug?

Hi there,

New input: my syntax works with "like":"some text"
but not with

"like": {
  "_index": "myindex",
  "_type": "event",
  "doc": {
    "article": {
      "title": "this is the title",
      "text": "this is the body of the article"
    }
  }
}

nor

"like": {
  "_index": "myindex",
  "_type": "event",
  "_id": "an_event_id"
}

I can investigate further, but I think the index/type/[_id, doc] syntax has trouble reaching the nested level. Could that be linked to the multi get API? How could I investigate what is happening under the hood?
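
In the meantime, since the free-text form of "like" does return hits, the nested content can be turned into text on the client side before querying. Below is a minimal sketch of that idea, assuming the document's _source is already at hand (for example from a GET). The helper names collect_nested_text and build_nested_mlt are made up for illustration, and the queried fields are assumed to be path.leaf (e.g. "article.title") rather than the ".search" sub-fields used above.

```python
def collect_nested_text(source, path, leaf_fields):
    """Concatenate the string values of `leaf_fields` found under source[path].

    source[path] may be a single object or a list of objects.
    """
    nested = source.get(path) or []
    if isinstance(nested, dict):
        nested = [nested]
    parts = []
    for obj in nested:
        for leaf in leaf_fields:
            value = obj.get(leaf)
            if isinstance(value, str) and value.strip():
                parts.append(value.strip())
    return " ".join(parts)


def build_nested_mlt(source, path, leaf_fields):
    """Build a nested more_like_this query that uses free text for `like`."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {
                    "more_like_this": {
                        "fields": [f"{path}.{leaf}" for leaf in leaf_fields],
                        "like": collect_nested_text(source, path, leaf_fields),
                        "min_term_freq": 1,
                        "max_query_terms": 20,
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    source = {
        "article": {
            "title": "this is the title",
            "text": "this is the body of the article",
        }
    }
    print(build_nested_mlt(source, "article", ["title", "text"]))
```

One thing to keep in mind with this approach: because "like" is plain text, Elasticsearch has no document reference to exclude, so the original document may show up among the hits and might need to be filtered out separately.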