Birthday Querying

Hi guys,

I would like to know if anyone else has worked with birthday queries before.

I have contact information, for which I have the birthdate and I would like to be able to perform queries like

  1. Is on a specific day/month (this is not hard as I could just store the date in components)
  2. Has birthday in x days
  3. Has birthday in less than x days
  4. Birthday was no more than x days ago
    etc. (similar questions)

I am thinking about using script queries for different scenarios but

  1. This seems quite complex
  2. Doesn't seem too efficient

Any ideas?

Bogdan

Hi @Beda_Bogdan,

range queries are a good fit for your use-case and should also perform well.

Daniel

Excuse my stupidity! Could you give me a complete example for 2. Has birthday in x days, by example?

My problem is that I have a date (date of birth) but that date is actually a recurring date and the query needs to understand that (basically, I need a range query which ignores years and 'knows' such dates are cyclic.)

Thank you

Hi @Beda_Bogdan,

you're right. Range queries solve only one half of the problem, my bad. I can think of two options for your case:

  1. You have a periodic job that sets the date of the next birthday (which is a separate field from your birthday field!) of each person (e.g. someone is born on 5th October 1950, then your job would set it to 5th October 2017). You can then use a simple range query to solve that.
  2. You can also use script queries. Note that if you use Elasticsearch 5, Painless is the default scripting language and it does not support a concept of "now" at the moment (as "now" refers to server time, not client time). Hence you need to pass "now" as a parameter from the client.

It basically boils down to a trade-off between storing additional information at index time (solution 1) and slower queries (solution 2).

An example query for solution 1. is:

GET /people/_search
{
    "query": {
        "range": {
           "next_birthday": {
              "gte": "now",
              "lte": "now+10d/d"
           }
        }
    }
}

This query returns all people with a birthday within the next 10 days.

If you want to solve this with the script queries, consider this example (needs Elasticsearch 5.0 or better):

DELETE /people

PUT /people
{
   "mappings": {
      "type": {
         "properties": {
            "birthday": {
               "type": "date"
            },
            "name": {
               "type": "keyword"
            }
         }
      }
   }
}



POST /people/type
{
    "name": "Fred",
    "birthday": "1990-01-12"
}

GET /people/_search
{
   "query": {
      "bool": {
         "must": {
            "script": {
               "script": {
                  "inline": "def today = LocalDate.parse(params.today); def birthday = Instant.ofEpochMilli(doc['birthday'].value).atZone(ZoneId.systemDefault()).toLocalDate(); def thisYearsBirthday = birthday.withYear(today.get(ChronoField.YEAR)); def nextBirthday; if (thisYearsBirthday.isBefore(today)) { nextBirthday = birthday.withYear(today.get(ChronoField.YEAR) + 1); } else { nextBirthday = thisYearsBirthday;} def inNDays = today.plusDays(params.daysFuture); (nextBirthday.isEqual(today) || nextBirthday.isAfter(today)) && (nextBirthday.isEqual(inNDays) || nextBirthday.isBefore(inNDays))",
                  "lang": "painless",
                  "params": {
                     "today": "2017-01-09",
                     "daysFuture": 10
                  }
               }
            }
         }
      }
   }
}

The script is an inline script so it is not easily readable. So you should define this as a stored script in production:

def birthday = Instant.ofEpochMilli(doc['birthday'].value).atZone(ZoneId.systemDefault()).toLocalDate();
def today = LocalDate.parse(params.today); 
def thisYearsBirthday = birthday.withYear(today.get(ChronoField.YEAR));
def nextBirthday;
if (thisYearsBirthday.isBefore(today)) {
    nextBirthday = birthday.withYear(today.get(ChronoField.YEAR) + 1);
 } else {
    nextBirthday = thisYearsBirthday;
}
def inNDays = today.plusDays(params.daysFuture); 
return (nextBirthday.isEqual(today) || nextBirthday.isAfter(today)) 
            && (nextBirthday.isEqual(inNDays) || nextBirthday.isBefore(inNDays));

Obviously the first solution is much faster from a performance perspective, so if you have a lot of data to query I suggest you use solution 1, if you don't want the additional complexity of an update job and can live with the performance, you can use solution 2.

Don't forget to think how you need to handle birthdays on February 29th. This may depend on your business requirements. In solution 2, you also need to think about time-zones (I just used ZoneId.systemDefault() in my example).

I hope this additional explanation helps you solve your problem.

Daniel

Hi,

Thank you for the extensive reply.

I was actually thinking about both. For the 1st option, I would need to store both past and next closest birtday dates to be able to perform queries both ways. It is an option I was strongly considering.

Also, regarding scripting, I am using ES 2.3 at the moment so I was trying with groovy. At the moment I was actually trying to get away with some age related math to see how it goes.

Thank you for your script, I got the base idea. Just as an observation, that nextBirthday variable calculation doesn't take into account next year birthdays (if we are on 31st of Dec 2017 and a birthday is on 1st of Ian 2018 by example) so it needs to see how that relates to current day.

Thanks again for your reply :slight_smile:

Hi @Beda_Bogdan,

Correct (I was just discussing case number 2 in your original question).

Dates are always fun to work with. :wink: I updated the example in my answer now but as you can see the script got even more complex (so solution 1. gets even more attractive now).

Daniel

1 Like

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.