Results offset with 2w interval

I am seeing some strange behavior with a date histogram query that uses a '2w' interval. The start dates of the intervals do not correspond to the range of my data but instead seem to be determined by some other, unknown factor. At the moment it looks like the intervals are off by a week, but I would need to test on future days to be sure. I can correct this by adding a 7d offset, but I am concerned that this might not work all the time.
I am wondering what determines the start for intervals when using an odd value like '2w'. Is this possibly some sort of defect?

The reason for this is that 1w is interpreted specially, namely as WeekOfWeekyear.
The value 2w is just interpreted as an interval of 2*7*24*60*60*1000 milliseconds. The start for numeric intervals is determined by the smallest value found. You can change this using the extended_bounds mechanism, see also the discussion here.

Thanks for the response. I was hopeful that I could get different results either by setting the interval to '14d' or by using extended_bounds (though I had tried that before). Unfortunately, I can't seem to get the results to start from the beginning of my timeframe. Just to be clear, I am searching results over a 4 week timeframe of '2-3-2017' to '3-3-2017' and I get intervals for '1-26-2017', '2-9-2017' and '2-23-2017'. If I choose an interval of '13d', that breaks whatever annoying calculations are happening behind the scenes and I then do get the first interval to start on '2-3-2017'. Please let me know what you think I might be doing wrong in trying to get a two week interval to start on a specific day.

Actually I am now seeing that using some other day interval like '12d' does not really change the situation. It seems like the intervals must be pivoting off of some fixed point but I am not sure what it is. It does not seem to be the beginning of the year since 2017 starts on a Sunday and the 2 week intervals are all on Thursdays.

Oh I bet all intervals are calculated as multiples from the Epoch, right?
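
That guess can be checked with a little arithmetic. The following is a minimal sketch, not Elasticsearch's actual implementation: it assumes bucket starts are found by rounding a timestamp down to a multiple of the interval counted from 1970-01-01 (a Thursday), and it ignores the time_zone setting by working in UTC. The helper name bucket_start is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)  # a Thursday


def bucket_start(ts, interval, offset=timedelta(0)):
    """Round ts down to a multiple of interval counted from the epoch.

    Hypothetical helper: assumes epoch-aligned buckets, shifted by offset.
    """
    steps = (ts - EPOCH - offset) // interval  # timedelta // timedelta -> int
    return EPOCH + offset + steps * interval


two_weeks = timedelta(days=14)
first_day = datetime(2017, 2, 3, tzinfo=timezone.utc)

# Without an offset the bucket containing 2017-02-03 starts on a Thursday.
print(bucket_start(first_day, two_weeks).date())  # 2017-01-26

# Shift needed so that a bucket starts exactly on the first day of the range.
needed = (first_day - EPOCH) % two_weeks  # timedelta % timedelta -> timedelta
print(needed.days)  # 8
print(bucket_start(first_day, two_weeks, needed).date())  # 2017-02-03
```

Under that assumption the sketch reproduces the '1-26-2017' bucket reported above and explains why every 2 week bucket lands on a Thursday. It also suggests that the shift needed for this particular range is 8 days rather than 7, and that the right value depends on the start date of the range.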

Can you post the full query?

Not sure you really want to see the whole query but here it is:
{
  "size": 0,
  "query": {
    "filtered": {
      "filter": {
        "and": [
          {
            "bool": {
              "must": [
                {
                  "bool": {
                    "should": [
                      {
                        "term": {
                          "gnip.matching_rules.tag": "conversation-wind"
                        }
                      },
                      {
                        "term": {
                          "purpose.campaigns.conversation-wind.match": true
                        }
                      }
                    ]
                  }
                }
              ]
            }
          },
          {
            "terms": {
              "gnip.profileLocations.address.countryCode": [
                "us"
              ]
            }
          },
          {
            "range": {
              "postedTime": {
                "gte": "2017-02-03",
                "lt": "2017-03-03",
                "time_zone": "America/New_York"
              }
            }
          }
        ]
      }
    }
  },
  "aggregations": {
    "national_trends": {
      "date_histogram": {
        "field": "postedTime",
        "interval": "2w",
        "min_doc_count": 0,
        "time_zone": "America/New_York"
      },
      "aggregations": {
        "hashtags": {
          "terms": {
            "field": "twitter_entities.hashtags.text",
            "size": 0
          }
        },
        "tweets": {
          "terms": {
            "field": "object.id",
            "size": 10
          },
          "aggregations": {
            "top_link": {
              "top_hits": {
                "_source": {
                  "includes": [
                    "object.id",
                    "object.link",
                    "object.body",
                    "body"
                  ]
                },
                "size": 1
              }
            }
          }
        }
      }
    },
    "states": {
      "terms": {
        "field": "gnip.profileLocations.address.region",
        "size": 52
      },
      "aggregations": {
        "intervals": {
          "date_histogram": {
            "field": "postedTime",
            "min_doc_count": 0,
            "interval": "2w",
            "time_zone": "America/New_York"
          },
          "aggregations": {
            "hashtags": {
              "terms": {
                "field": "twitter_entities.hashtags.text",
                "size": 0
              }
            },
            "tweets": {
              "terms": {
                "field": "object.id",
                "size": 1
              },
              "aggregations": {
                "top_link": {
                  "top_hits": {
                    "_source": {
                      "includes": [
                        "object.id",
                        "object.link",
                        "object.body",
                        "body"
                      ]
                    },
                    "size": 1
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "sort": []
}

That is without any extended bounds of course. Let me know if you want me to try adding some.
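
For reference, here is a rough sketch of how the date_histogram part of the query above might be generated with an offset derived from the start of the range instead of a hard-coded 7d. It assumes that buckets are aligned to multiples of the interval since 1970-01-01 and that the installed Elasticsearch version accepts the offset parameter on date_histogram. How offset interacts with time_zone is not verified here, and the helper name offset_days is invented for this example.

```python
import json
from datetime import date


def offset_days(start, interval_days):
    """Days to shift epoch-aligned buckets so that one begins on start.

    Hypothetical helper; assumes buckets are multiples of the interval
    counted from 1970-01-01.
    """
    return (start - date(1970, 1, 1)).days % interval_days


range_start = date(2017, 2, 3)  # the "gte" value of the range filter

national_trends = {
    "date_histogram": {
        "field": "postedTime",
        "interval": "14d",
        # Assumption: "offset" is supported by the version in use.
        "offset": "+{}d".format(offset_days(range_start, 14)),
        "min_doc_count": 0,
        "time_zone": "America/New_York",
    }
}

print(json.dumps(national_trends, indent=2))
```

Because the offset is recomputed from the range start, the same code should keep working when the timeframe moves, which a fixed 7d offset would not.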
