How would you implement date/datetime filtering for a booking application?


(Andy-2) #1

Hi,

I'm working on a booking application using elasticsearch for search &
filtering.

Basically each document describes a service, such as an educational
class. And each class has multiple date or datetime blocks during
which it is available.

For example, the class "1-day Argentine Tango entry level class in
Brooklyn" could be available in the following blocks:

8/9/11 to 8/11/11 (so the class is available in 3 different days: 8/9,
8/10, 8/11)
8/15/11 to 8/15/11 (meaning just 1 single day)
8/17/11 to 8/21/11

And the class "1-hour Physics 101 tutorial" could be available in the
following blocks:

8/5/11 2pm - 5pm (so the slots 2pm to 3pm, 3 to 4, 4 to 5 are open)
8/7/11 9am - 12pm

A users might search for "Physics tutorial" and want to filter the
results to those that are available between 8/5/11 3pm to 8/5/11 5pm.

What's the best way to implement such search/filtering in ES?

  1. I suppose a simple way is to index the class description multiple
    times, once for each available block. So for the class "1-day
    Argentine Tango entry level class in Brooklyn" above, 3 documents
    would be indexed. They all have the text "1-day Argentine Tango entry
    level class in Brooklyn" as description. The 1st document would have
    startdate: "2011-8-5"
    enddate:"2011-8-11"
    The 2nd, 3rd documents would also have the appropriate startdate &
    enddate.
    And when I want to search for classes between the 8/7 and 8/9, I could
    just add the filter:
    "range" : {
    "startdate" : {
    "lte" : "2011-8-7"
    }
    "enddate" : {
    "gte" : "2011-8-9"
    }
    }

The problem with this approach is I have to repeat the description
many times. So if a class is available in 100 blocks, I'd have to
index 100 separate documents each have the same description. Is there
a better way?

  1. How do I index and filter on datetime such as "8/5/11 2pm"? Is
    there a "datetime" field type?

Thanks.


(Shay Banon) #2

Have a look at nested documents (
http://www.elasticsearch.org/guide/reference/mapping/nested-type.html). You
can have the timeslots be mapped as nested type in the json. This means that
nested queries / filters for startdate and enddate will only match on if
both match on a nested object "instance".

On Tue, Aug 2, 2011 at 9:40 PM, Andy selforganized@gmail.com wrote:

Hi,

I'm working on a booking application using elasticsearch for search &
filtering.

Basically each document describes a service, such as an educational
class. And each class has multiple date or datetime blocks during
which it is available.

For example, the class "1-day Argentine Tango entry level class in
Brooklyn" could be available in the following blocks:

8/9/11 to 8/11/11 (so the class is available in 3 different days: 8/9,
8/10, 8/11)
8/15/11 to 8/15/11 (meaning just 1 single day)
8/17/11 to 8/21/11

And the class "1-hour Physics 101 tutorial" could be available in the
following blocks:

8/5/11 2pm - 5pm (so the slots 2pm to 3pm, 3 to 4, 4 to 5 are open)
8/7/11 9am - 12pm

A users might search for "Physics tutorial" and want to filter the
results to those that are available between 8/5/11 3pm to 8/5/11 5pm.

What's the best way to implement such search/filtering in ES?

  1. I suppose a simple way is to index the class description multiple
    times, once for each available block. So for the class "1-day
    Argentine Tango entry level class in Brooklyn" above, 3 documents
    would be indexed. They all have the text "1-day Argentine Tango entry
    level class in Brooklyn" as description. The 1st document would have
    startdate: "2011-8-5"
    enddate:"2011-8-11"
    The 2nd, 3rd documents would also have the appropriate startdate &
    enddate.
    And when I want to search for classes between the 8/7 and 8/9, I could
    just add the filter:
    "range" : {
    "startdate" : {
    "lte" : "2011-8-7"
    }
    "enddate" : {
    "gte" : "2011-8-9"
    }
    }

The problem with this approach is I have to repeat the description
many times. So if a class is available in 100 blocks, I'd have to
index 100 separate documents each have the same description. Is there
a better way?

  1. How do I index and filter on datetime such as "8/5/11 2pm"? Is
    there a "datetime" field type?

Thanks.


(Shay Banon) #3

Ahh, and for the datatime thingy, you can create a custom date format to
represent the format you are after, but, I strongly suggest using the ISO
format. See core types date type:
http://www.elasticsearch.org/guide/reference/mapping/core-types.html, and a
link to the date format is there.

On Tue, Aug 2, 2011 at 10:08 PM, Shay Banon kimchy@gmail.com wrote:

Have a look at nested documents (
http://www.elasticsearch.org/guide/reference/mapping/nested-type.html).
You can have the timeslots be mapped as nested type in the json. This means
that nested queries / filters for startdate and enddate will only match on
if both match on a nested object "instance".

On Tue, Aug 2, 2011 at 9:40 PM, Andy selforganized@gmail.com wrote:

Hi,

I'm working on a booking application using elasticsearch for search &
filtering.

Basically each document describes a service, such as an educational
class. And each class has multiple date or datetime blocks during
which it is available.

For example, the class "1-day Argentine Tango entry level class in
Brooklyn" could be available in the following blocks:

8/9/11 to 8/11/11 (so the class is available in 3 different days: 8/9,
8/10, 8/11)
8/15/11 to 8/15/11 (meaning just 1 single day)
8/17/11 to 8/21/11

And the class "1-hour Physics 101 tutorial" could be available in the
following blocks:

8/5/11 2pm - 5pm (so the slots 2pm to 3pm, 3 to 4, 4 to 5 are open)
8/7/11 9am - 12pm

A users might search for "Physics tutorial" and want to filter the
results to those that are available between 8/5/11 3pm to 8/5/11 5pm.

What's the best way to implement such search/filtering in ES?

  1. I suppose a simple way is to index the class description multiple
    times, once for each available block. So for the class "1-day
    Argentine Tango entry level class in Brooklyn" above, 3 documents
    would be indexed. They all have the text "1-day Argentine Tango entry
    level class in Brooklyn" as description. The 1st document would have
    startdate: "2011-8-5"
    enddate:"2011-8-11"
    The 2nd, 3rd documents would also have the appropriate startdate &
    enddate.
    And when I want to search for classes between the 8/7 and 8/9, I could
    just add the filter:
    "range" : {
    "startdate" : {
    "lte" : "2011-8-7"
    }
    "enddate" : {
    "gte" : "2011-8-9"
    }
    }

The problem with this approach is I have to repeat the description
many times. So if a class is available in 100 blocks, I'd have to
index 100 separate documents each have the same description. Is there
a better way?

  1. How do I index and filter on datetime such as "8/5/11 2pm"? Is
    there a "datetime" field type?

Thanks.


(Clinton Gormley) #4

Hi Andy

  1. I suppose a simple way is to index the class description multiple
    times, once for each available block. So for the class "1-day
    Argentine Tango entry level class in Brooklyn" above, 3 documents
    would be indexed. They all have the text "1-day Argentine Tango entry
    level class in Brooklyn" as description. The 1st document would have
    startdate: "2011-8-5"
    enddate:"2011-8-11"
    The 2nd, 3rd documents would also have the appropriate startdate &
    enddate.

Before 0.17, you wouldn't be able to do what you're wanting in a single
doc.

For instance, if you have the field "dates", with type "object":

dates: [
{ startdate: "2011-08-01", enddate: "2011-08-05" },
{ startdate: "2011-08-10", enddate: "2011-08-15" }
]

and you tried a range query like:

{ bool: {
must: [
{ range: { startdate: { 'gte': '2011-08-06' }},
{ range: { enddate: { 'lte': '2011-08-09' }}
]
}}

... it would have found this doc incorrectly, because it just looks at
ANY startdate and ANY enddate in the doc. It doesn't correlate
startdate and enddate within the same sub-doc.

However, as of 0.17, kimchy has performed a small miracle with the
introduction of "nested" docs.

A nested doc is pretty much the same as an "object" except that "nested"
means that it is treated as an independent document in its own right.

Which means that you CAN correlate fields within a single sub-doc.

To use nested docs, you have to specify them up front in the mapping:
http://www.elasticsearch.org/guide/reference/mapping/nested-type.html

eg :

curl -XPUT 'http://127.0.0.1:9200/foo/?pretty=1' -d '
{
"mappings" : {
"class" : {
"properties" : {
"name" : { "type" : "string" },
"dates" : {
"include_in_root" : 1,
"type" : "nested",
"properties" : {
"enddate" : { "type" : "date" },
"startdate" : { "type" : "date" }
}
},
}
}
}
}
'

Then you can do a nested query or filter like:

curl -XGET 'http://127.0.0.1:9200/foo/class/_search?pretty=1' -d '
{
"query" : {
"nested" : {
"path" : "dates",
"query" : {
"bool" : {
"must" : [
{ "range" : {
"dates.enddate" : { "gte" : "2011-08-05" }
}},
{ "range" : {
"dates.startdate" : { "gte" : "2011-08-01" }
}}
]
}
}
}
}
}
'

http://www.elasticsearch.org/guide/reference/query-dsl/nested-query.html
http://www.elasticsearch.org/guide/reference/query-dsl/nested-filter.html

And when I want to search for classes between the 8/7 and 8/9, I could
just add the filter:
"range" : {
"startdate" : {
"lte" : "2011-8-7"
}
"enddate" : {
"gte" : "2011-8-9"
}
}

Note - your query won't work - you can't specify multiple fields under
the "range" key

  1. How do I index and filter on datetime such as "8/5/11 2pm"? Is
    there a "datetime" field type?

Have a look at
http://www.elasticsearch.org/guide/reference/mapping/date-format.html

clint


(Andy-2) #5

Ahh I see. Thank you Shay and Clint.

For the nested-documents, does the sub-doc have its own _uid that's
different from the _uid of the parent?

Also if I want to update just 1 specific sub-doc, can I do that or I'd
have to update the entire parent doc together with all the sub-docs?
For example if a nested-doc has:
dates: [
{ startdate: "2011-08-01", enddate: "2011-08-05" },
{ startdate: "2011-08-07", enddate: "2011-08-09" }
]

and someone books the time block 2011-08-03 to 2011-08-04. In this
case I'd need to update the sub-doc
{ startdate: "2011-08-01", enddate: "2011-08-05" }
to 2 different sub-docs:
{ startdate: "2011-08-01", enddate: "2011-08-02" }
{ startdate: "2011-08-05", enddate: "2011-08-05" }

Is there a way to just "update" that specific sub-doc? Or I'd have to
update the entire document by specifying:
dates: [
{ startdate: "2011-08-01", enddate: "2011-08-02" },
{ startdate: "2011-08-05", enddate: "2011-08-05" },
{ startdate: "2011-08-07", enddate: "2011-08-09" }
]

Andy

On Aug 2, 3:10 pm, Clinton Gormley cl...@traveljury.com wrote:

Hi Andy

  1. I suppose a simple way is to index the class description multiple
    times, once for each available block. So for the class "1-day
    Argentine Tango entry level class in Brooklyn" above, 3 documents
    would be indexed. They all have the text "1-day Argentine Tango entry
    level class in Brooklyn" as description. The 1st document would have
    startdate: "2011-8-5"
    enddate:"2011-8-11"
    The 2nd, 3rd documents would also have the appropriate startdate &
    enddate.

Before 0.17, you wouldn't be able to do what you're wanting in a single
doc.

For instance, if you have the field "dates", with type "object":

dates: [
{ startdate: "2011-08-01", enddate: "2011-08-05" },
{ startdate: "2011-08-10", enddate: "2011-08-15" }
]

and you tried a range query like:

{ bool: {
must: [
{ range: { startdate: { 'gte': '2011-08-06' }},
{ range: { enddate: { 'lte': '2011-08-09' }}
]
}}

... it would have found this doc incorrectly, because it just looks at
ANY startdate and ANY enddate in the doc. It doesn't correlate
startdate and enddate within the same sub-doc.

However, as of 0.17, kimchy has performed a small miracle with the
introduction of "nested" docs.

A nested doc is pretty much the same as an "object" except that "nested"
means that it is treated as an independent document in its own right.

Which means that you CAN correlate fields within a single sub-doc.

To use nested docs, you have to specify them up front in the mapping:http://www.elasticsearch.org/guide/reference/mapping/nested-type.html

eg :

curl -XPUT 'http://127.0.0.1:9200/foo/?pretty=1' -d '
{
"mappings" : {
"class" : {
"properties" : {
"name" : { "type" : "string" },
"dates" : {
"include_in_root" : 1,
"type" : "nested",
"properties" : {
"enddate" : { "type" : "date" },
"startdate" : { "type" : "date" }
}
},
}
}
}}

'

Then you can do a nested query or filter like:

curl -XGET 'http://127.0.0.1:9200/foo/class/_search?pretty=1' -d '
{
"query" : {
"nested" : {
"path" : "dates",
"query" : {
"bool" : {
"must" : [
{ "range" : {
"dates.enddate" : { "gte" : "2011-08-05" }
}},
{ "range" : {
"dates.startdate" : { "gte" : "2011-08-01" }
}}
]
}
}
}
}}

'

http://www.elasticsearch.org/guide/reference/query-dsl/nested-query.htmlhttp://www.elasticsearch.org/guide/reference/query-dsl/nested-filter....

And when I want to search for classes between the 8/7 and 8/9, I could
just add the filter:
"range" : {
"startdate" : {
"lte" : "2011-8-7"
}
"enddate" : {
"gte" : "2011-8-9"
}
}

Note - your query won't work - you can't specify multiple fields under
the "range" key

  1. How do I index and filter on datetime such as "8/5/11 2pm"? Is
    there a "datetime" field type?

Have a look athttp://www.elasticsearch.org/guide/reference/mapping/date-format.html

clint


(Shay Banon) #6

With nested docs, you can't access the "sub document", you have to update
back the whole document. This is what makes it fast for those "join"
operations.

On Wed, Aug 3, 2011 at 1:56 AM, Andy selforganized@gmail.com wrote:

Ahh I see. Thank you Shay and Clint.

For the nested-documents, does the sub-doc have its own _uid that's
different from the _uid of the parent?

Also if I want to update just 1 specific sub-doc, can I do that or I'd
have to update the entire parent doc together with all the sub-docs?
For example if a nested-doc has:
dates: [
{ startdate: "2011-08-01", enddate: "2011-08-05" },
{ startdate: "2011-08-07", enddate: "2011-08-09" }
]

and someone books the time block 2011-08-03 to 2011-08-04. In this
case I'd need to update the sub-doc
{ startdate: "2011-08-01", enddate: "2011-08-05" }
to 2 different sub-docs:
{ startdate: "2011-08-01", enddate: "2011-08-02" }
{ startdate: "2011-08-05", enddate: "2011-08-05" }

Is there a way to just "update" that specific sub-doc? Or I'd have to
update the entire document by specifying:
dates: [
{ startdate: "2011-08-01", enddate: "2011-08-02" },
{ startdate: "2011-08-05", enddate: "2011-08-05" },
{ startdate: "2011-08-07", enddate: "2011-08-09" }
]

Andy

On Aug 2, 3:10 pm, Clinton Gormley cl...@traveljury.com wrote:

Hi Andy

  1. I suppose a simple way is to index the class description multiple
    times, once for each available block. So for the class "1-day
    Argentine Tango entry level class in Brooklyn" above, 3 documents
    would be indexed. They all have the text "1-day Argentine Tango entry
    level class in Brooklyn" as description. The 1st document would have
    startdate: "2011-8-5"
    enddate:"2011-8-11"
    The 2nd, 3rd documents would also have the appropriate startdate &
    enddate.

Before 0.17, you wouldn't be able to do what you're wanting in a single
doc.

For instance, if you have the field "dates", with type "object":

dates: [
{ startdate: "2011-08-01", enddate: "2011-08-05" },
{ startdate: "2011-08-10", enddate: "2011-08-15" }
]

and you tried a range query like:

{ bool: {
must: [
{ range: { startdate: { 'gte': '2011-08-06' }},
{ range: { enddate: { 'lte': '2011-08-09' }}
]
}}

... it would have found this doc incorrectly, because it just looks at
ANY startdate and ANY enddate in the doc. It doesn't correlate
startdate and enddate within the same sub-doc.

However, as of 0.17, kimchy has performed a small miracle with the
introduction of "nested" docs.

A nested doc is pretty much the same as an "object" except that "nested"
means that it is treated as an independent document in its own right.

Which means that you CAN correlate fields within a single sub-doc.

To use nested docs, you have to specify them up front in the mapping:
http://www.elasticsearch.org/guide/reference/mapping/nested-type.html

eg :

curl -XPUT 'http://127.0.0.1:9200/foo/?pretty=1' -d '
{
"mappings" : {
"class" : {
"properties" : {
"name" : { "type" : "string" },
"dates" : {
"include_in_root" : 1,
"type" : "nested",
"properties" : {
"enddate" : { "type" : "date" },
"startdate" : { "type" : "date" }
}
},
}
}
}}

'

Then you can do a nested query or filter like:

curl -XGET 'http://127.0.0.1:9200/foo/class/_search?pretty=1' -d '
{
"query" : {
"nested" : {
"path" : "dates",
"query" : {
"bool" : {
"must" : [
{ "range" : {
"dates.enddate" : { "gte" : "2011-08-05" }
}},
{ "range" : {
"dates.startdate" : { "gte" : "2011-08-01" }
}}
]
}
}
}
}}

'

http://www.elasticsearch.org/guide/reference/query-dsl/nested-query.htmlhttp://www.elasticsearch.org/guide/reference/query-dsl/nested-filter..
..

And when I want to search for classes between the 8/7 and 8/9, I could
just add the filter:
"range" : {
"startdate" : {
"lte" : "2011-8-7"
}
"enddate" : {
"gte" : "2011-8-9"
}
}

Note - your query won't work - you can't specify multiple fields under
the "range" key

  1. How do I index and filter on datetime such as "8/5/11 2pm"? Is
    there a "datetime" field type?

Have a look athttp://
www.elasticsearch.org/guide/reference/mapping/date-format.html

clint


(Dylan Pierce) #7

Just bumping this thread because there's not an answer I can find regarding nested date range querying for searching booking availability.

GET /room/_search/
{
    "query" : {
        "constant_score" : {
            "filter" : {
                "bool" : {
                  "must_not" : [
                    {
                      "range" : {
                          "in_use_dates.start_date": {
                            "gte" : "2017-07-27T00:00:00.000Z"
                          }
                      }
                    },
                    {
                        "range": { 
                          "in_use_dates.end_date": {
                            "lte" : "2017-07-31T00:00:00.000Z"
                          }
                        }
                    }
                  ]
                }
            }
        }
    }
}

I would expect this query to exclude any rooms that are not booked from 07-27-2017 through 07-31-2017, but they are still shown from my query above.

My guess is that elasticsearch is treating the nested start_date's as a column and not as a pairing of 1 start_date to end_date.


(system) #8