What you're describing is a many-to-many relationship (a book can have multiple readers, and a reader can have read multiple books). Elasticsearch does not really have a construct for these N:N relationships. The "Handling Relationships" section of "Elasticsearch: The Definitive Guide" provides some good guidance on dealing with relational data. (That book is a bit outdated, but considering you're still on version 5 most of it will still be applicable)
There are various solutions to what you're trying to achieve, and one is not necessarily better than the other. It all depends on how you're going to query the data and what you want Elasticsearch to return. Based on what you've provided I'd go with a reader-centric index, in which every reader becomes an Elasticsearch document. To every reader document you could then add an array of nested objects with all the books that they've read ("denormalization"). You can query these nested objects using a nested query.
It would look something like this. You define the index with a reader type mapping, in which a books
field is mapped as a nested type:
PUT books
{
"mappings": {
"reader": {
"dynamic": "strict",
"properties": {
"username": {
"type": "text"
},
"birthDay": {
"type": "date"
},
"books": {
"type": "nested",
"dynamic": "strict",
"properties": {
"name": {
"type": "text"
},
"publishDate": {
"type": "date"
}
}
}
}
}
}
}
You can now index your documents. One document per reader, with an array of the books that they've read:
PUT books/reader/1
{
"username": "Jane Doe",
"birthDay": "2001-01-01",
"books": [
{
"name": "Elasticsearch: The Definitive Guide",
"publishDate": "2015-02-07"
},
{
"name": "Stranger in a Strange Land",
"publishDate": "1961-06-01"
}
]
}
PUT books/reader/2
{
"username": "John Smith",
"birthDay": "1970-07-07",
"books": [
{
"name": "Elasticsearch: The Definitive Guide",
"publishDate": "2015-02-07"
}
]
}
Now, if you want to find all readers who have been born after year 2000, and have read a book published before 1990, you could write a query like this:
GET books/reader/_search
{
"query": {
"bool": {
"filter": [
{
"nested": {
"path": "books",
"query": {
"range": {
"books.publishDate": {
"lt": "1990-01-01"
}
}
}
}
},
{
"range": {
"birthDay": {
"gt": "2000-01-01"
}
}
}
]
}
}
}
This query only returns Jane, as she's read a book published in 1961, and she was born in 2001.