Elasticsearch query time range issue

Hello Experts,

I am not able to retrieve all the status_code values with the query below. It returns fewer values than actually exist.

**My query**

{
  "_source": ["status_code", "referrer", "timestamp"],
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "referrer": ".*"
          }
        },
        {
          "range": {
            "timestamp": {
              "gte": "2021-03-28T06:00:00.000Z",
              "lt": "2021-03-29T13:15:00.000Z"
            }
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "status_code": "200"
          }
        }
      ]
    }
  }
}

I think the range clause below is not giving the expected results.

"range": {
  "timestamp": {
    "gte": "2021-03-28T06:00:00.000Z",
    "lt": "2021-03-29T13:15:00.000Z"
  }
}

Can anyone please help with this?

Regards,
Suresh

Please read about query and filter context in the Elasticsearch documentation.

First, you have the range filter inside the should clause. That is not correct; it should be outside, in the filter context.

Once you have corrected that, let us know if it works.
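To illustrate the point above, here is a minimal Python sketch that builds the request body with the time range moved into the filter context and the status 200 exclusion kept in must_not. The function name is hypothetical, and the field names are taken from the original question. The referrer clause is left out on the assumption that it was meant to match any referrer; a match query does not interpret ".*" as a regular expression.

```python
import json


def build_non_200_query(start, end, time_field="timestamp", status_field="status_code"):
    """Build a search body: time range in filter context, status 200 excluded.

    Hypothetical helper for illustration; field names come from the question.
    """
    return {
        "_source": [status_field, "referrer", time_field],
        "query": {
            "bool": {
                # Filter clauses must match but do not contribute to scoring.
                "filter": [
                    {"range": {time_field: {"gte": start, "lt": end}}}
                ],
                # Same exclusion as in the original query.
                "must_not": [
                    {"match": {status_field: "200"}}
                ],
            }
        },
    }


if __name__ == "__main__":
    body = build_non_200_query("2021-03-28T06:00:00.000Z", "2021-03-29T13:15:00.000Z")
    # Paste the printed JSON into Kibana Dev Tools or pass it to curl with -d.
    print(json.dumps(body, indent=2))
```

The printed JSON can be sent to the _search endpoint of the index pattern in question.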

This query works as expected for me.

GET filebeat-*/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "http.response.status_code": {
              "value": "200"
            }
          }
        }
      ],
      "filter": [
        {
          "range": {
            "@timestamp": {
              "gte": "2021-03-18T02:31:33.121Z",
              "lte": "2021-03-18T02:32:33.121Z"
            }
          }
        }
      ]
    }
  }
}

Please do not format the whole text as code. It is not easy to read.

For example:

This is a text.

This is a code

Instead of

This is a text.

This is a code

Thanks.

Hi Dadoonet,

I ran the query below as you suggested.

I gave a time range from 06:30 AM to 09:30 PM today, but it returned results covering only about 1 minute.

The results start at "@timestamp" : "2021-03-29T03:04:45.809Z" and end at "@timestamp" : "2021-03-29T03:04:49.818Z".

curl -X GET "http://elasticsearch_ip:port/qa-weblogs-*/_search?track_total_hits=true&rest_total_hits_as_int=true&pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "response_code": {
              "value": "200"
            }
          }
        }
      ],
      "filter": [
        {
          "range": {
            "@timestamp": {
              "gte": "2021-03-29T01:00:00.121Z",
              "lte": "2021-03-29T16:00:00.121Z"
            }
          }
        }
      ]
    }
  }
}
'

Please let me know what I need to do.

Please provide a set of sample data: some documents that were returned and one that was not returned.

Hi Stephenb,

Below is the output. The hits total is "total" : 750755, but I am not able to retrieve those records; only 9 records are available. I am also getting only 1 minute of data.

Please help with this.

{
  "took" : 1548,
  "timed_out" : false,
  "_shards" : {
    "total" : 19,
    "successful" : 19,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 750755,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "qa-web-logs-2021.03.28",
        "_type" : "_doc",
        "_id" : "8SCDMDNFDXCSDFNDJFDJF",
        "_score" : 1.0,
        "_source" : {
          "request" : "value1",
          "referrer" : "-",
          "response_code" : 304,
          "timestamp" : "2021-03-28T13:26:57.000Z"
        }
      },
      {
        "_index" : "qa-web-logs-2021.03.28",
        "_type" : "_doc",
        "_id" : "SDSDSDHSDHSHDHS",
        "_score" : 1.0,
        "_source" : {
          "request" : "value2",
          "referrer" : "https://www.google.com/",
          "response_code" : 304,
          "timestamp" : "2021-03-28T13:26:57.000Z"
        }
      },
      {
        "_index" : "qa-web-logs-2021.03.28",
        "_type" : "_doc",
        "_id" : "SDSDJmscnsnsm",
        "_score" : 1.0,
        "_source" : {
          "request" : "value3",
          "referrer" : "-",
          "response_code" : 304,
          "timestamp" : "2021-03-28T13:26:57.000Z"
        }
      },
      {
        "_index" : "qa-web-logs-2021.03.28",
        "_type" : "_doc",
        "_id" : "asdssdnsnsjnADNJSDA",
        "_score" : 1.0,
        "_source" : {
          "request" : "value3",
          "referrer" : "-",
          "response_code" : 304,
          "timestamp" : "2021-03-28T13:26:58.000Z"
        }
      },
      {
        "_index" : "qa-web-logs-2021.03.28",
        "_type" : "_doc",
        "_id" : "sdfnsdjfsdjfsdjf",
        "_score" : 1.0,
        "_source" : {
          "request" : "value4",
          "referrer" : "-",
          "response_code" : 301,
          "timestamp" : "2021-03-28T13:26:58.000Z"
        }
      },
      {
        "_index" : "qa-web-logs-2021.03.28",
        "_type" : "_doc",
        "_id" : "sdnasjdsjadasj",
        "_score" : 1.0,
        "_source" : {
          "request" : "value5",
          "referrer" : "-",
          "response_code" : 304,
          "timestamp" : "2021-03-28T13:26:58.000Z"
        }
      },
      {
        "_index" : "qa-web-logs-2021.03.28",
        "_type" : "_doc",
        "_id" : "ASDSBSNBFNBF",
        "_score" : 1.0,
        "_source" : {
          "request" : "value6",
          "referrer" : "-",
          "response_code" : 304,
          "timestamp" : "2021-03-28T13:26:59.000Z"
        }
      },
      {
        "_index" : "qa-web-logs-2021.03.28",
        "_type" : "_doc",
        "_id" : "sdnsdnsdbvnbvn",
        "_score" : 1.0,
        "_source" : {
          "request" : "value7",
          "referrer" : "-",
          "response_code" : 304,
          "timestamp" : "2021-03-28T13:26:57.000Z"
        }
      },
      {
        "_index" : "qa-web-logs-2021.03.28",
        "_type" : "_doc",
        "_id" : "ASNSNCBSNBSNB",
        "_score" : 1.0,
        "_source" : {
          "request" : "value8",
          "referrer" : "-",
          "response_code" : 304,
          "timestamp" : "2021-03-28T13:27:01.000Z"
        }
      },
      {
        "_index" : "qa-web-logs-2021.03.28",
        "_type" : "_doc",
        "_id" : "SSFSNFBSNFSN",
        "_score" : 1.0,
        "_source" : {
          "request" : "value9",
          "referrer" : "-",
          "response_code" : 304,
          "timestamp" : "2021-03-28T12:03:27.000Z"
        }
      }
    ]
  }
}

So what I need in order to help is the following.

If you take the time to do this, and please take the time to format it (which you just did not), then perhaps we can figure it out.

If you do not take the time to provide what I am asking for, then I / we cannot help.

  1. Provide 3 source documents: 2 that are returned and 1 that you think should be returned but is not.

  2. Provide the exact query, constructed as we instructed (you are still not showing the correct query / filter context).

  3. Provide the results from the above, showing the 2 returned and the 1 not.

I need all this to reproduce your issue.

Provide all these clearly and perhaps we can solve this... Time range filter queries certainly work.

BTW, searches return only 10 results by default, which is what you see above.

If you want more, you need to set size or use paging.

See here

Have you tried to set

size : 100
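As a rough illustration of setting size (and from for paging), here is a small Python sketch that adds both to an existing request body. The helper name is hypothetical. The 10000 limit in the check is the default value of the index.max_result_window setting, which caps from + size.

```python
def with_page(body, size=100, start=0, max_result_window=10000):
    """Return a copy of a search body with "from" and "size" set.

    Hypothetical helper; max_result_window mirrors the default of the
    index.max_result_window index setting.
    """
    if start + size > max_result_window:
        raise ValueError(
            f"from + size must be <= {max_result_window}, got {start + size}"
        )
    page = dict(body)  # shallow copy so the caller's body is not modified
    page["from"] = start
    page["size"] = size
    return page


if __name__ == "__main__":
    query = {"query": {"match_all": {}}}
    print(with_page(query, size=100))            # first 100 hits
    print(with_page(query, size=100, start=100)) # next 100 hits
```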


Hi Stephenb,

Thanks for your suggestion. I had not set the page size. After setting it, I am able to get up to 10000 records.

But here the number of response_code hits is well over 10000 (approximately 2 lakh, i.e. 200,000 hits), so I am getting the error below and am working on it. Please point me to any reference that would help fix this. Thank you so much.

"root_cause" : [
  {
    "type" : "illegal_argument_exception",
    "reason" : "Result window is too large, from + size must be less than or equal to: [10000] but was [25005]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting."
  },

You can use:

  • the size and from parameters to display up to 10000 records (the default limit) to your users. If you want to change this limit, you can change the index.max_result_window setting, but be aware of the consequences (i.e. memory).
  • the search_after feature to do deep pagination.
  • the Scroll API if you want to extract a result set to be consumed by another tool later.
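For the search_after option, the idea can be sketched in Python roughly as follows. This is only an outline under some assumptions: the search argument is a caller-supplied function that sends a request body to the _search endpoint and returns the parsed JSON response, and event_id is a hypothetical unique field used as a tiebreaker in the sort. Each hit of a sorted search carries a sort array, and the sort values of the last hit are passed as search_after in the next request.

```python
def build_page_request(query, sort_fields, page_size=1000, search_after=None):
    """Build one page request for search_after pagination."""
    body = {
        "size": page_size,
        "query": query,
        # A stable sort is required; the last field should be unique per document.
        "sort": [{field: "asc"} for field in sort_fields],
    }
    if search_after is not None:
        body["search_after"] = search_after
    return body


def iter_hits(search, query, sort_fields, page_size=1000):
    """Yield all hits page by page.

    search is a caller-supplied callable (an assumption of this sketch)
    that takes a request body and returns the parsed search response.
    """
    after = None
    while True:
        response = search(build_page_request(query, sort_fields, page_size, after))
        hits = response["hits"]["hits"]
        if not hits:
            return
        yield from hits
        # Continue after the sort values of the last hit on this page.
        after = hits[-1]["sort"]
```

A call might look like iter_hits(my_search, {"match_all": {}}, ["@timestamp", "event_id"]), where my_search wraps whatever HTTP client is in use. Because from stays at 0, the index.max_result_window limit is not hit as long as the page size itself stays within it.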

Hi Dadoonet,

I did that by setting index.max_result_window = 300000. As you said, I am now facing a memory issue.

Regards,
Suresh

Ahhh making progress....

Are you really trying to paginate through 300000 results or are you trying to export the data?

What are you actually trying to accomplish?

Are you trying to export the data?

Exactly. This is what I was trying before getting the data from Elasticsearch:

  1. I created a visualization with the required fields using the Data Table visualization.

  2. It gives Export: Raw and Formatted options at the bottom left.
    I want to download this through an API, but unfortunately "we don't have a publicly available API for exporting visualizations", as stated by majagrubic (Elastic Team Member).

So I am planning to retrieve the same data from the Elasticsearch API and separate out all the fields by writing a Java program.

The team needs a daily error count sent as an email attachment, and this should be automated.

Please let me know whether this is the right approach.

Regards,
Suresh

OK, a couple of things are going on there.

  1. That data table would probably not download 300000 rows either.
    Can you provide a screenshot of that data table?

  2. If you want to know counts, such as the number of errors yesterday or the number of 200s, 300s, 400s etc., then you should write an aggregation, not just a plain search. Do you really need the RAW data?

Or are you looking for the counts of response codes?

What version of Elastic are you using and what License level... so we can provide some options.
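If counts per response code turn out to be enough, an aggregation request could look roughly like the Python sketch below. It uses a terms aggregation with size 0 so that no documents are returned, only buckets. The helper names and the aggregation name by_status are made up for illustration, and the field names are assumed from the sample output earlier in the thread.

```python
def build_status_count_request(start, end, time_field="@timestamp",
                               status_field="response_code", max_buckets=50):
    """Build a request that counts documents per response code in a time range."""
    return {
        "size": 0,  # return no hits, only the aggregation result
        "query": {
            "bool": {
                "filter": [
                    {"range": {time_field: {"gte": start, "lt": end}}}
                ]
            }
        },
        "aggs": {
            "by_status": {
                "terms": {"field": status_field, "size": max_buckets}
            }
        },
    }


def counts_from_response(response):
    """Turn the terms aggregation buckets into a {status: count} dict."""
    buckets = response["aggregations"]["by_status"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}
```

The resulting dictionary is small enough to put directly into a daily email, whatever the number of matching documents.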

Hi Stephenb,

  1. For cyber security reasons I am not able to attach the complete screenshot, but I am attaching the count. I would like to download this through an API or script; if that works, I do not need to use the Elasticsearch API.

  2. We do not want a count. We need the complete data for everything other than status 200, along with fields such as referrer, request and others.

  3. If point 1 works with an API, then I will not work on point 2.

Regards,
Suresh

What version of Elastic are you using and what License level... so we can provide some options. -> 7.9.0 and Basic license.

So that example shows 18,687 docs, which is far different from 300,000 docs...

So you are going to attach 20,000 to 300,000 results to an email... hmm OK.

So this is an export problem ... and thus you will need to use the APIs that @dadoonet referenced above...

You also may consider using bigger data nodes with more RAM and JVM Heap Size.
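For the export case, a scroll loop could be sketched in Python as below. This is only an outline: send is a caller-supplied function (an assumption of this sketch) that performs the HTTP request with the given method, path and JSON body and returns the parsed response. The paths and body fields are the ones used by the Scroll API: an initial _search with a scroll parameter, repeated calls to _search/scroll with the scroll_id, and a final DELETE to clear the scroll.

```python
def scroll_all(send, index, body, keep_alive="1m"):
    """Yield every hit of a search using the Scroll API.

    send(method, path, body) is a caller-supplied callable that talks to
    Elasticsearch and returns the parsed JSON response.
    """
    response = send("POST", f"/{index}/_search?scroll={keep_alive}", body)
    scroll_id = response.get("_scroll_id")
    try:
        while response["hits"]["hits"]:
            yield from response["hits"]["hits"]
            response = send(
                "POST",
                "/_search/scroll",
                {"scroll": keep_alive, "scroll_id": scroll_id},
            )
            # The scroll id may change between calls; always use the latest one.
            scroll_id = response.get("_scroll_id", scroll_id)
    finally:
        # Free the search context on the cluster once finished.
        if scroll_id:
            send("DELETE", "/_search/scroll", {"scroll_id": scroll_id})
```

Each batch holds at most size hits from the request body, so memory use on the client stays bounded if the hits are written to a file as they arrive instead of being collected in a list.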


Hi Stephenb,

I tried using search_after as suggested.
I guess it requires an X-Pack license, as per the doc: Point in time API | Elasticsearch Reference [7.12] | Elastic

To get a PIT ID I executed the command below, but I am getting an error.
curl -X POST "http://ip:port/dev2-applogs-be-*/_pit?keep_alive=1m&pretty"
I am getting the error below.

Regards,
Suresh

Please don't post images of text as they are hard to read, may not display correctly for everyone, and are not searchable.

Instead, paste the text and format it with </> icon or pairs of triple backticks (```), and check the preview window to make sure it's properly formatted before posting it. This makes it more likely that your question will receive a useful answer.

It would be great if you could update your post to solve this.

It requires a basic license, which is the default free built-in license.

Which version are you using? What does this return:

GET /

Hi Dadoonet,
The version is 7.9.0.

Sure, I will post the image content as properly formatted text.

To get a PIT ID, I executed the command below, but I am getting "request body is required".

curl -X GET "http://ip:port/dev2-applogs-be-*/_pit?keep_alive=1m&pretty"

Output:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "parse_exception",
        "reason" : "request body is required"
      }
    ],
    "type" : "parse_exception",
    "reason" : "request body is required"
  },
  "status" : 400
}