Query latest 10 parent documents and corresponding child document if exists

Hello,
Lets say i want to store Songs data in elasticsearch where users can write comments for any of the songs.
I am looking at option where i can get 10 latest songs (based on say cretedTime field) and corresponding reviews(if review exists). If i use the parent child
relationship, i am not sure how to get 10 latest songs(parent document) and reviews(child document) for these songs( please not all songs may not have corresponding reviews).
Is there other way to achieve this requirement?

Hi,

is the "createdTime" a property of the song (parent) or the review (child)? Would it be set once when a song gets entered into the system? Would those songs even already have reviews (given that they might be just recently created)?

Thanks a lot for the quick response. Really appreciate the help on this.

CreatedTime is a property of Songs.
Songs : {
SongName,
SongTittle,
Album,
Artists,
CreatedTime
.....
}

Comments : {
commentedBy,
comment
}

Lets take one example, among the 10 recently created songs, say Song 3 doesnt have any comments and Song 2 has 5 comments.
The query response should include Song 3 as well though it doesnt have any comments.
How to achieve this is the question.

Thanks in advance

What's the best design wrt elastic search index for item and reviews kind of use case?

Hi,

one way of doing this is to first filter your parent docs by the "join" field and sorting on the "created_at" field. Then you want to include the corresponding child documents using "has_child" and "inner_hits":

DELETE my_index

PUT my_index
{
  "mappings": {
    "properties": {
      "song_id": {
        "type": "keyword"
      },
      "created_at" : {
        "type": "date"
      },
      "join_field": { 
        "type": "join",
        "relations": {
          "song": "review" 
        }
      }
    }
  }
}

PUT my_index/_doc/1?refresh
{
  "song_id": "song1",
  "text": "Lalala",
  "created_at" : "2020-01-01",
  "join_field": "song"
}

PUT my_index/_doc/2?refresh
{
  "song_id": "song2",
  "created_at" : "2020-01-02",
  "text": "Yeah Yeah Yeah",
  "join_field": "song"
}

PUT my_index/_doc/3?routing=1&refresh 
{
  "text": "This is an review",
  "join_field": {
    "name": "review", 
    "parent": "1" 
  }
}

PUT my_index/_doc/4?routing=1&refresh
{
  "text": "This is another review",
  "join_field": {
    "name": "review",
    "parent": "1"
  }
}

POST /my_index/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "match": {
            "join_field": "song"
          }
        }
      ],
      "should": [
        {
          "has_child": {
            "type": "review",
            "query": { "match_all": {}},
            "inner_hits": {} 
          }
        }
      ]
    }
  },
  "sort": [
    {
      "created_at": {
        "order": "desc"
      }
    }
  ]
}

Note that I added the "has_child" query inside a "should" clause, this way you also get the document without any reviews, but there might be other, better ways of doing this. "inner_hits" by default only returns 3 documents afaik, if you need more you might want to set its "size" parameter higher. There are certainly other, more complex considerations, but this might give you a good start to play with.

This is excellent.. thanks for the help. Awesome.
I will do some experimentts and will get back to you.

Thanks a ton

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.