This seemed to work better than the above query, but it is retrieving data from two days ago (so if it is the 16th, it retrieves hits from the 14th). I only want data from 00:00:00 on the 15th through 23:59:59 on the 15th (or through the next 00:00:00, whichever is easiest; I am not picky about the lost second).
Finally, how would I get hits from whatever day yesterday was, last year? (If today is 8/16/2015, I would want hits from 8/15/2014.)
Many thanks for any help in clearing these up for me.
After trying numerous things for several hours, I eventually started working with a new index and manually testing data. My mistake was that I could not tell that a search was not returning all results. Once I added my size constraints and began testing date math variations again, along with time zone configurations, I found a few that worked to get all data from the start of yesterday through the end of yesterday.
With my (limited) dataset, all of the following date math expressions returned the same number of hits and appeared to limit hits to the range from 00:00:00 yesterday through 23:59:59 yesterday. Are these all equivalent expressions?
I can't work out if there are any question from your first post which you still need answers to but I have tried to answer the questions in your second post below. If you still have questions from your first post that you would like answered (or have other questions on date math) let me know.
The /X operator (where X is a time unit like d) rounds back in time to the nearest start of that unit. So /d rounds back to the start of the day, whereas /M rounds back to the start of the month.
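For example, a filter for all of yesterday can combine subtraction with rounding. This is only a sketch using the 1.x-era filtered-query syntax, with "timestamp" as a placeholder field name:

```json
{
  "query": {
    "filtered": {
      "filter": {
        "range": {
          "timestamp": {
            "gte": "now-1d/d",
            "lt": "now/d"
          }
        }
      }
    }
  }
}
```

Here "now-1d/d" resolves to the start of yesterday and "now/d" to the start of today, so the exclusive "lt" bound keeps today's documents out without having to spell out 23:59:59.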
So the problem here could be timezones (timezones are hard). When you ingest your data, does it contain a timezone? What is the timezone of your server? What are the timezones of the results when they are displayed (could you show a snippet of some of the erroneous hits)?
Thank you for clarifying things for me, Colin. What you said makes sense (especially after re-working some parts of my post that I now realize were likely vague). With your explanation of the rounding syntax, and after working out a few more issues on my own, I was able to convert the majority of ~700 SQL-style queries to Elasticsearch queries that return the same set of data.
I have a few remaining concerns.
What is the native time zone and format that an Elasticsearch date field is converted into? My date field includes a timezone, but I must still specify the time zone when searching in Elasticsearch. I have been assuming it is UTC (time zone offset 00:00), but I’d like to confirm.
I use an index mapping for my date field that includes a timezone ID, which I parse with Joda's "ZZZ" format. Different documents have different time zones associated with them (America/Los_Angeles, America/Vancouver, and so on). When searching, the field still shows the original date I supplied, time zone included. Is there a way to see a date or timestamp field the way Elasticsearch sees it? (That is, whatever format a custom date field is converted to internally, such as UTC or milliseconds since the Unix epoch.)
I am also having problems getting weekly data, such as last week's data, because I am not sure how to specify that Sunday is the beginning of the week instead of Joda's default Monday.
"gte": "now/w-1d" (read as: round back to the beginning of this week (Monday) and subtract one day) may generally work, unless it is currently Sunday, because then the week would run Sunday to Sunday instead of Sunday to Saturday.
All dates are indexed as a long field with a value which is the epoch time in milliseconds, so effectively the dates are converted to UTC.
If you have enough memory to load the date field into the field data cache then you could use the "fielddata_fields" feature to see the indexed value, but I would only advise using that as a debugging option here, not as your final solution.
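As a sketch, a debugging request along those lines might look like the following (assuming the date field is named "timestamp"):

```json
{
  "query": { "match_all": {} },
  "fielddata_fields": ["timestamp"],
  "size": 1
}
```

Each hit should then carry a "fields" section showing the indexed value (the epoch-milliseconds long) alongside the original "_source", which lets you compare the two directly.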
This should find the beginning of the current week for the start of the range and the beginning of the next week for the end of the range.
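Such a range might be sketched like this (field name is a placeholder):

```json
{
  "range": {
    "timestamp": {
      "gte": "now/w",
      "lt": "now+1w/w"
    }
  }
}
```

"now/w" rounds back to the start of the current week, and "now+1w/w" moves forward one week before rounding back, giving the start of next week as an exclusive upper bound.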
As far as I know there isn't an offset for range queries since you have complete control over the start and end of the range. In the date_histogram aggregation the offset parameter is used to adjust where the buckets start from but this doesn't really apply when using a single filter.
Thanks again, Colin. You've been very helpful. It's too bad there aren't any quick fixes for the US week issue, but you've given me the tools I need to debug time issues further as they arise. I'll post again if I encounter any more issues.
One note: I had to change the date math from "(now+1w)/w - 1d" to "now+1w/w-1d" because Elasticsearch wouldn't accept it (it gave an error saying it could not parse the date math). Maybe it was the parentheses? Hopefully this modification won't have an unanticipated impact on the query.
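For reference, the full range I ended up with looks something like this (a sketch; the field name is a placeholder). As far as I can tell, the date math parser does not support parentheses and simply applies the operations from left to right, which is why the unparenthesized form parses:

```json
{
  "range": {
    "timestamp": {
      "gte": "now/w-1d",
      "lt": "now+1w/w-1d"
    }
  }
}
```

Shifting both rounded bounds back by one day is what approximates a Sunday-start week, with the Sunday edge case noted above still unresolved.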