chivas
March 30, 2022, 7:38am
1
Hi,
I am trying to get a count of the requests hitting my server by time range, and have come up with the query below:
GET /logs-prod*/_search?pretty=true
{
  "query": {
    "bool": {
      "filter": [
        {
          "match_phrase": {
            "type": "REQ"
          }
        },
        {
          "range": {
            "timestamp": {
              "gte": "2022-03-29T01:00:00.000Z",
              "lte": "2022-03-29T15:00:00.000Z"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "txn-count": {
      "terms": {
        "script": "if (doc[\"timestamp\"].size() == 0) return '';else if (doc[\"timestamp\"].value.getHour() >= 1 && doc[\"timestamp\"].value.getHour() <= 14 ) return \"Peak Transactions\"; else return \"Off-Peak Transactions\""
      }
    }
  }
}
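For readability, the aggs section can also be written with the expanded script object and Kibana Console triple quotes (a sketch of the same logic, assuming the request is run from Kibana Dev Tools):

  "aggs": {
    "txn-count": {
      "terms": {
        "script": {
          "lang": "painless",
          "source": """
            // label each document by hour of day
            if (doc['timestamp'].size() == 0) return '';
            else if (doc['timestamp'].value.getHour() >= 1 && doc['timestamp'].value.getHour() <= 14) return 'Peak Transactions';
            else return 'Off-Peak Transactions'
          """
        }
      }
    }
  }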
However, the result comes back as below:
"aggregations" : {
"txn-count" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "Peak Transactions",
"doc_count" : 164783
}
]
}
}
}
i.e. everything is being grouped into Peak Transactions. What am I doing wrong?
Thanks in advance!
spinscale
(Alexander Reelsen)
March 30, 2022, 7:54am
2
Hey, you are filtering from 1am till 3pm in your bool query with a range filter, so with the exception of documents indexed exactly at 3pm, all of your docs fall into Peak Transactions anyway?
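For illustration, a minimal sketch (same index pattern, filter and script as above, but with the range widened to the whole day of 2022-03-29 — the wider window is just an assumption to make the point visible) would let documents with hours outside 1-14 actually land in the Off-Peak bucket:

GET /logs-prod*/_search?pretty=true
{
  "query": {
    "bool": {
      "filter": [
        { "match_phrase": { "type": "REQ" } },
        {
          "range": {
            "timestamp": {
              "gte": "2022-03-29T00:00:00.000Z",
              "lte": "2022-03-29T23:59:59.999Z"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "txn-count": {
      "terms": {
        "script": "if (doc[\"timestamp\"].size() == 0) return '';else if (doc[\"timestamp\"].value.getHour() >= 1 && doc[\"timestamp\"].value.getHour() <= 14 ) return \"Peak Transactions\"; else return \"Off-Peak Transactions\""
      }
    }
  }
}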
chivas
March 30, 2022, 8:24am
3
@spinscale it does not seem so, please see below:
GET /logs-prod*/_count
{
  "query": {
    "bool": {
      "filter": [
        {
          "match_phrase": {
            "type": "REQ"
          }
        },
        {
          "range": {
            "timestamp": {
              "format": "strict_date_optional_time",
              "gte": "2022-03-29T01:00:00.000Z",
              "lte": "2022-03-29T14:00:00.000Z"
            }
          }
        }
      ]
    }
  }
}
{
  "count" : 161655,
  "_shards" : {
    "total" : 211,
    "successful" : 211,
    "skipped" : 0,
    "failed" : 0
  }
}
Thanks.
spinscale
(Alexander Reelsen)
March 30, 2022, 8:48am
4
Sorry, but I fail to understand what this is supposed to prove compared to what I said above.
chivas
March 30, 2022, 8:57am
5
@spinscale since I am using the condition,
if (doc[\"timestamp\"].value.getHour() >= 1 && doc[\"timestamp\"].value.getHour() <= 14 ) return \"Peak Transactions\"
in the script, shouldn't 161655 documents fall under Peak Transactions?
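As a quick sanity check of the hour boundary (a sketch using the Painless execute API; the two timestamps are made-up examples), 14:59:59 evaluates to hour 14, i.e. Peak, while 15:00:00 evaluates to hour 15, i.e. Off-Peak:

POST /_scripts/painless/_execute
{
  "script": {
    "source": "ZonedDateTime.parse(params.ts).getHour()",
    "params": { "ts": "2022-03-29T14:59:59Z" }
  }
}

POST /_scripts/painless/_execute
{
  "script": {
    "source": "ZonedDateTime.parse(params.ts).getHour()",
    "params": { "ts": "2022-03-29T15:00:00Z" }
  }
}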
Thanks!
spinscale
(Alexander Reelsen)
March 31, 2022, 11:48am
6
Can you share the full aggregation response, and also add the show_term_doc_count_error parameter? See Terms aggregation | Elasticsearch Guide [8.1] | Elastic.
chivas
March 31, 2022, 12:24pm
7
@spinscale sure, here you go. I executed the query below (hope I used the parameter correctly):
GET /logs-prod*/_search?pretty=true
{
  "query": {
    "bool": {
      "filter": [
        {
          "match_phrase": {
            "type": "REQ"
          }
        },
        {
          "range": {
            "timestamp": {
              "gte": "2022-03-29T01:00:00.000Z",
              "lte": "2022-03-29T15:00:00.000Z"
            }
          }
        }
      ]
    }
  },
  "size": 0,
  "aggs": {
    "txn-count": {
      "terms": {
        "show_term_doc_count_error": true,
        "script": "if (doc[\"timestamp\"].size() == 0) return '';else if (doc[\"timestamp\"].value.getHour() >= 1 && doc[\"timestamp\"].value.getHour() <= 14 ) return \"Peak Transactions\"; else return \"Off-Peak Transactions\""
      }
    }
  }
}
Output is below:
{
  "took" : 222,
  "timed_out" : false,
  "_shards" : {
    "total" : 212,
    "successful" : 212,
    "skipped" : 211,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "txn-count" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "Peak Transactions",
          "doc_count" : 164783,
          "doc_count_error_upper_bound" : 0
        }
      ]
    }
  }
}
Thanks.
spinscale
(Alexander Reelsen)
March 31, 2022, 3:46pm
8
Can you enable tracking of total hits (Search your data | Elasticsearch Guide [8.1] | Elastic)? I am not yet sure about that discrepancy... the numbers are supposed to be exact, as only one shard seems to be involved.
chivas
March 31, 2022, 3:51pm
9
@spinscale
GET /logs-prod*/_search?pretty=true
{
  "track_total_hits": true,
  "query": {
    "bool": {
      "filter": [
        {
          "match_phrase": {
            "type": "REQ"
          }
        },
        {
          "range": {
            "timestamp": {
              "gte": "2022-03-29T01:00:00.000Z",
              "lte": "2022-03-29T15:00:00.000Z"
            }
          }
        }
      ]
    }
  },
  "size": 0,
  "aggs": {
    "txn-count": {
      "terms": {
        "show_term_doc_count_error": true,
        "script": "if (doc[\"timestamp\"].size() == 0) return '';else if (doc[\"timestamp\"].value.getHour() >= 1 && doc[\"timestamp\"].value.getHour() <= 14 ) return \"Peak Transactions\"; else return \"Off-Peak Transactions\""
      }
    }
  }
}
Output:
{
  "took" : 14,
  "timed_out" : false,
  "_shards" : {
    "total" : 212,
    "successful" : 212,
    "skipped" : 211,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 164783,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "txn-count" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "Peak Transactions",
          "doc_count" : 164783,
          "doc_count_error_upper_bound" : 0
        }
      ]
    }
  }
}
spinscale
(Alexander Reelsen)
April 4, 2022, 9:51am
10
So, the query returns 164783 hits, and so does the aggregation. That part looks good to me.
Above you stated that 161655 documents are returned. Why isn't that the case here, when you are using the same query against the same indices?
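One way to narrow that down would be to re-run the _count with exactly the same range as the aggregation query and compare the numbers (a sketch; the earlier count used lte 2022-03-29T14:00:00.000Z, whereas the aggregation query uses lte 2022-03-29T15:00:00.000Z):

GET /logs-prod*/_count
{
  "query": {
    "bool": {
      "filter": [
        { "match_phrase": { "type": "REQ" } },
        {
          "range": {
            "timestamp": {
              "gte": "2022-03-29T01:00:00.000Z",
              "lte": "2022-03-29T15:00:00.000Z"
            }
          }
        }
      ]
    }
  }
}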
system
(system)
Closed
May 2, 2022, 9:52am
11
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.