Hourly displaying of data

Hi Team,

I have a log file from which I am filtering specific words, and I want to create a curl request that shows the results for each hour of the current day.

In the screenshot below, when I select Date Histogram on the @timestamp field and choose hourly (1h), it shows data only for the hour that has data and not for any other hour of the day. I want to show all hours of the day.

The response of such a curl request should look something like the example below, so that developers can use it to plot a graph on our frontend.

hrs. results of above query
01:00: 0
02:00: 6
03:00: 10
.
.
23:00: 45

The current curl request shows:

 "buckets" : [
        {
          "key_as_string" : "2021-07-14T10:00:00.000+05:30",
          "key" : 1626237000000,
          "doc_count" : 82
        }

Thanks,

@cool999 I think I understand your request as wanting a curl request to your logs index that outputs the results of a time-based aggregation per day (one result per hour of the day).

You should be able to use an Elasticsearch aggregation. For example, using the kibana_sample_data_logs dataset, the query itself would be:

GET kibana_sample_data_logs/_search?size=0
{
  "aggs": {
    "by_day": {
      "date_histogram": {
        "field": "timestamp",
        "calendar_interval": "hour"
      }
    }
  }
}

That would translate to a curl request of:

curl -X GET "localhost:9200/kibana_sample_data_logs/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
  "aggs": {
    "by_day": {
      "date_histogram": {
        "field":     "timestamp",
        "calendar_interval":  "hour"
      }
    }
  }
}
'

I got this example directly from the Elasticsearch documentation for a date histogram.
You might want to restrict the search to only return results for one day. To do that, you'll have to add a query to your search:

GET kibana_sample_data_logs/_search?size=0
{
  "query": {
    "range": {
      "timestamp": {
        "gte": "now-1d/d",
        "lte": "now/d"
      }
    }
  }, 
  "aggs": {....}
}

Note that I've linked to the latest version of the docs, but you might need to change that if you're not running v7.13.

Hi @cheiligers,

Thanks for your reply. You understood it correctly. The query should return results per hour for a day, but that is not happening currently.

My version is 7.4.0.

I got the output below (index not found) after running your first GET query.

{
  "error" : {
    "root_cause" : [
      {
        "type" : "index_not_found_exception",
        "reason" : "no such index [kibana_sample_data_logs]",
        "resource.type" : "index_or_alias",
        "resource.id" : "kibana_sample_data_logs",
        "index_uuid" : "_na_",
        "index" : "kibana_sample_data_logs"
      }
    ],
    "type" : "index_not_found_exception",
    "reason" : "no such index [kibana_sample_data_logs]",
    "resource.type" : "index_or_alias",
    "resource.id" : "kibana_sample_data_logs",
    "index_uuid" : "_na_",
    "index" : "kibana_sample_data_logs"
  },
  "status" : 404
}

I have a curl request similar to yours, i.e. the same calendar_interval and the same timestamp range.

GET access_server-2021*/_search?pretty
{
  "aggs": {
    "2": {
      "date_histogram": {
        "field": "@timestamp",
        "calendar_interval": "1h",
        "time_zone": "Asia/Calcutta",
        "min_doc_count": 1
      }
    }
  },
  "size": 0,
  "_source": {
    "excludes": []
  },
  "stored_fields": [
    "*"
  ],
  "script_fields": {},
  "docvalue_fields": [
    {
      "field": "@timestamp",
      "format": "date_time"
    }
  ],
  "query": {
    "bool": {
      "must": [],
      "filter": [
        {
          "bool": {
            "should": [
              {
                "match_phrase": {
                  "log.file.path": "/opt/access/log/access.log"
                }
              }
            ],
            "minimum_should_match": 1
          }
        },
        {
          "match_phrase": {
            "Request_URI": {
              "query": "\"/next2-isp/v1/\""
            }
          }
        },
        {
          "range": {
            "@timestamp": {
              "format": "strict_date_optional_time",
              "gt": "now-1d/d"
            }
          }
        }
      ],
      "should": [],
      "must_not": []
    }
  }
}

which shows the following results (i.e. the same results as the visualisation above):

.
.
 },
  "hits" : {
    "total" : {
      "value" : 82,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "2" : {
      "buckets" : [
        {
          "key_as_string" : "2021-07-14T10:00:00.000+05:30",
          "key" : 1626237000000,
          "doc_count" : 82
        }

The above output is correct, as it matches the output below, which we can retrieve directly from the server:

[root@ip-16-1-1-19 log]# cat access.log  | grep '/next2-isp/v1/*' | wc -l
82
[root@ip-16-1-1-19 log]#

As I said, this gives me results only for the hour that has logs in it (10 AM here), but I want a response showing all 24 hours with their hit counts.

Thanks,
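
One likely reason only the 10:00 bucket comes back is the "min_doc_count": 1 setting in the request above, which tells Elasticsearch to drop empty buckets. Below is a minimal sketch of a request body that asks for every hour of the day, assuming min_doc_count is set to 0 and extended_bounds spans the first and last hour of the day. The aggregation name per_hour, the helper hourly_histogram_body, and the fixed +05:30 offset standing in for Asia/Calcutta are assumptions made for illustration; the other filters from the original request are left out for brevity.

```python
import json
from datetime import datetime, timedelta, timezone

# Assumption: a fixed +05:30 offset stands in for the "Asia/Calcutta" zone.
IST = timezone(timedelta(hours=5, minutes=30))
HOUR_MS = 3600 * 1000


def hourly_histogram_body(now, field="@timestamp", tz_name="Asia/Calcutta"):
    """Build a search body with one bucket per hour of the day containing `now`."""
    day_start = now.astimezone(IST).replace(hour=0, minute=0, second=0, microsecond=0)
    first_hour_ms = int(day_start.timestamp()) * 1000
    last_hour_ms = first_hour_ms + 23 * HOUR_MS
    return {
        "size": 0,
        "query": {
            "range": {
                field: {
                    "gte": first_hour_ms,
                    "lt": first_hour_ms + 24 * HOUR_MS,
                    "format": "epoch_millis",
                }
            }
        },
        "aggs": {
            # "per_hour" is a hypothetical aggregation name.
            "per_hour": {
                "date_histogram": {
                    "field": field,
                    "calendar_interval": "1h",
                    "time_zone": tz_name,
                    # 0 keeps empty buckets instead of dropping them.
                    "min_doc_count": 0,
                    # Force buckets for hours before/after the first/last hit.
                    "extended_bounds": {"min": first_hour_ms, "max": last_hour_ms},
                }
            }
        },
    }


print(json.dumps(hourly_histogram_body(datetime.now(IST)), indent=2))
```

The printed JSON could then be used as the body (-d) of the curl request against access_server-2021*/_search.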

Oh dear, 7.4 is a lot older! Try replacing calendar_interval with interval to resolve the first issue.

You also need to set both bounds in your range query (a gte and an lte):

"query": {
    "range": {
      "timestamp": {
        "gte": "now-1d/d",
        "lte": "now/d"
      }
    }
}

@cheiligers , Thanks for quick reply.

For the first example (kibana_sample...), I thought this index was present by default.

Changing "calendar_interval": "1h" to "interval": "1h" gives the deprecation warning below:

#! Deprecation: [interval] on [date_histogram] is deprecated, use [fixed_interval] or [calendar_interval] in the future.

Using the range below,

"gte": "now-1d/d",
"lte": "now/d"

shows me yesterday's and today's data

 {
          "key_as_string" : "2021-07-13T09:00:00.000+05:30",
          "key" : 1626147000000,
          "doc_count" : 7
        },
        {
          "key_as_string" : "2021-07-13T10:00:00.000+05:30",
          "key" : 1626150600000,
          "doc_count" : 16
        },
        {
          "key_as_string" : "2021-07-13T11:00:00.000+05:30",
          "key" : 1626154200000,
          "doc_count" : 82
        },
        {
          "key_as_string" : "2021-07-14T10:00:00.000+05:30",
          "key" : 1626237000000,
          "doc_count" : 82
        }

but currently I only want it for the current day, i.e. today, which the range "gt": "now-1d/d" provides correctly (as you can see, it matches the output of the grep command above). Sorry if I am not understanding the difference between these two time ranges correctly.

I am still not sure how to achieve above data per hour for a day.

Thanks,

You'll need the range query to get the results you want. Please see the range query docs and then more docs on using date-math
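
The difference between the two ranges seen above comes from how date-math rounding interacts with the operator: per the range query docs, /d rounds down for gte but rounds up for gt. The sketch below emulates that for the lower bound only, assuming UTC (no time_zone parameter on the range query). earliest_match is a hypothetical helper written for illustration, not part of any Elasticsearch client.

```python
from datetime import datetime, timedelta, timezone


def earliest_match(op, now, day_offset):
    """Earliest timestamp matched by {op: "now<day_offset>d/d"}, evaluated in UTC."""
    shifted = now + timedelta(days=day_offset)
    day_start = shifted.replace(hour=0, minute=0, second=0, microsecond=0)
    if op == "gte":
        # Rounds down: the whole rounded day is included.
        return day_start
    if op == "gt":
        # Rounds up: the whole rounded day is excluded, matching starts the next day.
        return day_start + timedelta(days=1)
    raise ValueError(f"unsupported operator: {op}")


now = datetime(2021, 7, 14, 10, 30, tzinfo=timezone.utc)
print("gte now-1d/d starts at", earliest_match("gte", now, -1).isoformat())
print("gt  now-1d/d starts at", earliest_match("gt", now, -1).isoformat())
```

Under these assumptions, "gte": "now-1d/d" starts at the beginning of yesterday, while "gt": "now-1d/d" starts at the beginning of today, which would explain why the second form returns only today's hits. A range of "gte": "now/d" should cover the same period.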

Hi @cheiligers, Thanks for your reply.

I may be using something wrong or missing something in the curl request, which is why I am not getting the desired results, but as you know, I got this request from the visualization above (i.e. it was not written by me).

My question is: why am I not getting a response for all the hours by default when I choose an hourly interval? (See the visualization screenshot above for all the options I have selected.)
If the visualization showed such results, then the curl call behind it would show the same results (i.e. for each hour) and I would not have to change anything in it.

So is there any way I can achieve this in the visualization, i.e. by selecting Hourly as the Minimum interval (or any other setting, if I am missing something), so that it shows results for all hours with their values?

Thanks,

Hi All,

Can someone please reply.

Hi All,

I will try to say this in another way. Sorry, I should have shown the screenshot below instead of the one above.

As you can see, I have selected the interval as 1h, but it is not showing all the hours, i.e. 0, 1, 2, 3 ... 23, with their values (if there are no requests for an hour, it should display 0 for that hour). The graph says @timestamp per hour, but the X-axis is labelled in steps of 3 hours. It only shows the hour where there are hits.

  1. How can we display all the hours of a day with their values in visualisation.

I basically want to show a graph of total hits broken down by hour for a day in our frontend, but because it currently shows only the hour that has data in it, this is becoming challenging.

Thanks,
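
If changing the query is not an option, the missing hours could also be filled with zeros on the client before plotting. The following is a rough sketch, assuming buckets shaped like the ones in the response earlier in this thread (key in epoch milliseconds plus doc_count), a single day of data, and a fixed +05:30 offset. counts_by_hour and format_hours are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed +05:30 offset stands in for the "Asia/Calcutta" zone.
IST = timezone(timedelta(hours=5, minutes=30))


def counts_by_hour(buckets, tz=IST):
    """Return 24 counts (index = hour of day), with 0 for hours that have no bucket."""
    counts = [0] * 24
    for bucket in buckets:
        hour = datetime.fromtimestamp(bucket["key"] / 1000, tz).hour
        counts[hour] += bucket["doc_count"]
    return counts


def format_hours(counts):
    """Render the counts as "HH:00: count" lines."""
    return [f"{hour:02d}:00: {count}" for hour, count in enumerate(counts)]


# The single bucket returned by the request earlier in this thread.
buckets = [
    {
        "key_as_string": "2021-07-14T10:00:00.000+05:30",
        "key": 1626237000000,
        "doc_count": 82,
    }
]
for line in format_hours(counts_by_hour(buckets)):
    print(line)
```

This keeps the Elasticsearch request unchanged and moves the zero-filling into the code that prepares the chart data.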

  1. How can we display all the hours of a day with their values in visualisation.

There's a very detailed guide on creating this visualization in an older discuss post that might be just what you're looking for! The advice and guidance in that post was given by one of our visualizations developers and I'd struggle to give better advice than that! You'll need to use a different visualization though, using the Time Series Visual Builder (abbreviated as TSVB).

Note that using TSVB is a bit of a learning curve but it is a very powerful visualization builder. There's more info on using it here

@cheiligers, Thanks for your continued replies.

I was able to get the desired results by creating a scripted field defined as doc['@timestamp'].value.hourOfDay, then using a histogram aggregation on that scripted field in the dashboard, with the Show empty buckets option selected and Extend bounds set from 0 to 23.

Thanks,
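
For reference, the same idea can be expressed directly as a search request. This is only a sketch of what such a body might look like, assuming a histogram aggregation that runs the script inline instead of going through a Kibana scripted field. The aggregation name hour_of_day and the helper hour_of_day_histogram_body are made up for illustration. As far as I know, the script sees the timestamp in UTC, so the hours may be shifted relative to Asia/Calcutta.

```python
import json


def hour_of_day_histogram_body(field="@timestamp"):
    """Build a search body that buckets hits by hour of day (0 to 23)."""
    # Same expression as the scripted field described above.
    script = f"doc['{field}'].value.hourOfDay"
    return {
        "size": 0,
        "aggs": {
            # "hour_of_day" is a hypothetical aggregation name.
            "hour_of_day": {
                "histogram": {
                    "script": {"lang": "painless", "source": script},
                    "interval": 1,
                    # Equivalent of "Show empty buckets".
                    "min_doc_count": 0,
                    # Equivalent of "Extend bounds" from 0 to 23.
                    "extended_bounds": {"min": 0, "max": 23},
                }
            }
        },
    }


print(json.dumps(hour_of_day_histogram_body(), indent=2))
```

The range query and the other filters from the original request would still need to be added to limit the results to a single day.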