Sql query, histogram interval one month problem

I am trying to do histogram in sql query for month and it does not work

daily or hourly works fine here is query

GET _sql?format=txt
{
  "query": """
        select histogram(achieved, INTERVAL 1 month) as m, count(*) as c from "resource-*" where  
      achieved between '2019-01-01' and  '2019-12-31' 
        group by m
        """
}  

Output comes out with middle of the month. and first one is even for 2018. Where am I wrong?

2018-12-15T00:00:00.000Z|2959402
2019-01-14T00:00:00.000Z|7444959
2019-02-13T00:00:00.000Z|8592928

FYI using 7.5.0

@elasticforme can you post the ES query dsl for that query, please?
Use translate API for this. https://www.elastic.co/guide/en/elasticsearch/reference/7.x/sql-translate.html

I think I know why you get those results, but showing the actual query could demonstrate my point.

Here it is

{
  "size" : 0,
  "query" : {
    "range" : {
      "achieved" : {
        "from" : "2019-01-01",
        "to" : "2019-12-31",
        "include_lower" : true,
        "include_upper" : true,
        "boost" : 1.0
      }
    }
  },
  "_source" : false,
  "stored_fields" : "_none_",
  "aggregations" : {
    "groupby" : {
      "composite" : {
        "size" : 1000,
        "sources" : [
          {
            "3536" : {
              "date_histogram" : {
                "field" : "achieved",
                "missing_bucket" : true,
                "value_type" : "date",
                "order" : "asc",
                "fixed_interval" : "2592000000ms",
                "time_zone" : "Z"
              }
            }
          }
        ]
      }
    }
  }
}

Thank you @elasticforme.
The relevant part in the query you provided is "fixed_interval" : "2592000000ms" which represent the length of a month (30 days) in milliseconds.

Elasticsearch has two ways in which it can represent an interval in a date_histogram: a calendar interval and a fixed interval. There is a detailed explanation on the differences between the two in this section from the documentation.

In SQL we chose to represent these as fixed intervals, and the behavior you get is the result of this choice. We did make a change for YEAR to behave like a calendar interval and we could to the same for MONTH and maybe DAY.

Can you, please, open an issue in https://github.com/elastic/elasticsearch/ about an improvement to GROUP BY HISTOGRAM where the interval is 1 MONTH or 1 DAY? Thanks.

ok open the issue. didn't know how to do it. as never use git for anything.

https://github.com/elastic/elasticsearch/issues/51538

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.