Birthday in Elasticsearch


(Arpan Sahoo) #1

I have a Birthday date field (format: MMddyyyy||MMdd) in my index. I want to search for the exact birthday that a user enters (e.g. 03221989) and also for upcoming birthdays. I can already find the exact birthday, but for upcoming birthdays I tried:

  1. Range query - "gte" : "now" -> this won't work, because "now" also includes a year and I want to find birthdays such as 03221989 as well.
  2. Range query - "gte" : "03221989" -> with this I am able to sort the records in ascending order of month.
    Suppose my index contains these 3 records:
    "Birthday": "03221979"
    "Birthday": "05271988"
    "Birthday": "04161990"

I want the Elasticsearch query to return them in ascending order of month, irrespective of year. The returned data should be:
"Birthday": "03221979"
"Birthday": "04161990"
"Birthday": "05271988"
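
To make the target ordering concrete, here is a minimal Python sketch of it. It runs outside Elasticsearch and is not a query; `month_day_key` is a hypothetical helper name, and it assumes every value is a well-formed MMddyyyy or MMdd string.

```python
from typing import Tuple


def month_day_key(birthday: str) -> Tuple[int, int]:
    """Sort key for 'MMddyyyy' or 'MMdd' strings: (month, day), year ignored."""
    return int(birthday[0:2]), int(birthday[2:4])


birthdays = ["03221979", "05271988", "04161990"]
ordered = sorted(birthdays, key=month_day_key)
print(ordered)  # ['03221979', '04161990', '05271988']
```
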

Could someone please help me with this?


(Eduardo González de la Herrán) #2

Hi @arpansahoo,

I think what you are trying to achieve is not easy or straightforward.
Please take a look at these resources and let me know your thoughts:

My suggestion would be to add a new field to your index called 'next_birthday', calculated and updated daily by a periodic job, as @danielmitterdorfer suggested in the shared resource. If you can manage that, your queries become simple and straightforward.
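
A periodic job of that kind needs a way to compute the next occurrence of each birthday. The Python sketch below is one possible version, not the method from the shared resource. The function name `next_birthday`, the choice to count today as "upcoming", and the February 28 fallback for leap-day birthdays are all assumptions.

```python
import calendar
from datetime import date


def _occurrence(year: int, month: int, day: int) -> date:
    """Birthday date in a given year; Feb 29 falls back to Feb 28 (assumption)."""
    if (month, day) == (2, 29) and not calendar.isleap(year):
        return date(year, 2, 28)
    return date(year, month, day)  # raises ValueError for an invalid month/day


def next_birthday(birthday: str, today: date) -> date:
    """Next occurrence (today included) of a 'MMddyyyy' or 'MMdd' birthday."""
    month, day = int(birthday[0:2]), int(birthday[2:4])
    this_year = _occurrence(today.year, month, day)
    if this_year >= today:
        return this_year
    return _occurrence(today.year + 1, month, day)


print(next_birthday("03221989", date(2019, 3, 1)))  # 2019-03-22
print(next_birthday("0322", date(2019, 4, 1)))      # 2020-03-22
```

With such a value stored in a regular date field, upcoming birthdays could presumably be found with a plain range query ("gte": "now") sorted in ascending order.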

Regarding your data: with a date field in the format MMddyyyy||MMdd, I wouldn't recommend keeping dates with and without a year in the same date field, as I don't believe it gives you any extra value. Internally every date has a year attached, so if some documents omit the year you won't be able to make sense of the data.

In that case, I would recommend storing the fields separately, as suggested here:
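
As a rough illustration of storing the parts separately, the Python sketch below splits an incoming value into month, day and an optional year before indexing. The field names `birth_month`, `birth_day` and `birth_year` and the helper `split_birthday` are assumptions made for this example.

```python
from typing import Dict, Optional


def split_birthday(raw: str) -> Dict[str, Optional[int]]:
    """Split 'MMddyyyy' or 'MMdd' into separate numeric fields."""
    if len(raw) not in (4, 8) or not raw.isdigit():
        raise ValueError("expected MMddyyyy or MMdd, got %r" % raw)
    return {
        "birth_month": int(raw[0:2]),
        "birth_day": int(raw[2:4]),
        # The year stays empty when the user did not provide one.
        "birth_year": int(raw[4:8]) if len(raw) == 8 else None,
    }


print(split_birthday("03221989"))
print(split_birthday("0322"))
```
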

The third option mentioned before (storing as text and using wildcards in queries) could be valid if you don't have a large number of users. Just keep an eye on performance if you follow this approach.
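
A sketch of what such a wildcard search might look like, written as a Python function that builds the request body. The sub-field name `Birthday.Raw` is hypothetical (assumed to be a keyword field holding the original string), and `birthday_wildcard_query` is an illustrative helper.

```python
def birthday_wildcard_query(month_day: str, field: str = "Birthday.Raw") -> dict:
    """Build a wildcard query matching a given MMdd with any (or no) year."""
    if len(month_day) != 4 or not month_day.isdigit():
        raise ValueError("expected MMdd, got %r" % month_day)
    # '*' matches zero or more characters, so "0322*" covers both
    # "0322" and "03221989".
    return {"query": {"wildcard": {field: {"value": month_day + "*"}}}}


print(birthday_wildcard_query("0322"))
```
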

By the way, how have you implemented the "search for exact birthday" query with your current data set?

Best regards,
Eduardo


(Arpan Sahoo) #3

Thanks a lot for your suggestions @eedugon ...

For the exact birthday, I added a text sub-field inside Birthday and run a match query on it. I also added an edge_ngram tokenizer and analyzer to extract 8-character tokens from the input. I think this is a bit confusing, so below is my code:
Index time:
"tokenizer": {
  "birthday_tokenizer": {
    "token_chars": [
      "letter",
      "digit"
    ],
    "min_gram": "8",
    "type": "edge_ngram",
    "max_gram": "8"
  }
}

"analyzer": {
  "birthday_analyzer": {
    "type": "custom",
    "tokenizer": "birthday_tokenizer"
  }
}

"Birthday": {
  "type": "date",
  "fields": {
    "Exact": {
      "type": "text",
      "analyzer": "whitespace",
      "search_analyzer": "birthday_analyzer"
    }
  }
}

Query:
"query": {
  "bool": {
    "should": [
      {
        "match": {
          "Birthday.Exact": {
            "query": "AnyFirstNameOrEmail 05271989",
            "boost": 2
          }
        }
      }
    ]
  }
}
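
To see what the search analyzer does to the query string, here is a small Python approximation of an edge_ngram tokenizer with min_gram = max_gram = 8 and token_chars limited to letters and digits. It is a simulation for illustration only, not Elasticsearch code, and `simulate_birthday_analyzer` is a hypothetical name.

```python
import re
from typing import List


def simulate_birthday_analyzer(text: str, gram: int = 8) -> List[str]:
    """Approximate edge_ngram with min_gram == max_gram == gram."""
    # Split on anything that is not a letter or digit (spaces, '@', '.', ...).
    words = re.findall(r"[^\W_]+", text)
    # Each word yields its first `gram` characters; shorter words yield nothing.
    return [word[:gram] for word in words if len(word) >= gram]


print(simulate_birthday_analyzer("AnyFirstNameOrEmail 05271989"))
# ['AnyFirst', '05271989']
print(simulate_birthday_analyzer("0527"))
# []
```

Under this approximation, a query that contains only a 4-digit MMdd value produces no search tokens at all, so it would not match the Birthday.Exact sub-field.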

Kindly let me know if you have any suggestions for improving the above code as well; it will help me refine it :)

