About aggregations request with scroll, only the first results can be output?

Pelin_li · January 27, 2018, 12:11pm

Version

Elasticsearch:

"version" : {
    "number" : "5.0.0",
    "build_hash" : "253032b",
    "build_date" : "2016-10-26T05:11:34.737Z",
    "build_snapshot" : false,
    "lucene_version" : "6.2.0"
  }

Java:

openjdk version "1.8.0_102"

OS:

Linux 10-10-166-129 3.13.0-46-generic #79-Ubuntu SMP Tue Mar 10 20:06:50 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Docker (it's an old version):

Docker version 1.12.2, build bb80604

Also, we are using the elasticsearch:5 Docker image.

Description

What I want

I want to retrieve 1 hour of data from Elasticsearch using a range query and scroll.

My steps

First
I used a range query with scroll to fetch the first 10000 documents:

GET logstash-2018.01.22/_search?scroll=1m
{
    "query": {
        "bool": {
            "must": [{
                "range": {
                    "time_iso8601": {
                        "gte": 1516636800000,
                        "lte": 1516658400000,
                        "format": "epoch_millis"
                    }
                }
            }]
        }
    },
    "size": 10000
}

And I got this response:

{
    "_scroll_id": "DnF1ZXJ5VGhlbkZldGNoBQAAAAAAZtkCFmdORS1zNU9RVExxOVZ6VGJKTEtBcFEAAAAAAJdFnxZFeVVudDVaM1RJLW9pUWI4WkpQR3BRAAAAAABm2QMWZ05FLXM1T1FUTHE5VnpUYkpMS0FwUQAAAAAAE7BuFnQ2aFh2aVBQVDlpR3dSc1ppa1Uza2cAAAAAAJdFoBZFeVVudDVaM1RJLW9pUWI4WkpQR3BR",
    "took": 3354,
    "timed_out": false,
    "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
    },
    "hits": {
      "total": 506943,
      "max_score": 1,
      "hits": [
        {
          "_index": "logstash-2018.01.22",
          "_type": "logs",
          "_id": "AWEe4DVE5NOBHBeKKltZ"
        }]
    }
    ................................
    ................................
    ................................
    ................................
    ................................
    ................................
}

Second
Then I used the scroll_id to fetch the next batch:

GET /_search/scroll
{
    "scroll": "1m",
    "scroll_id": "{DnF1ZXJ5VGhlbkZldGNoBQAAAAAAZtkCFmdORS1zNU9RVExxOVZ6VGJKTEtBcFEAAAAAAJdFnxZFeVVudDVaM1RJLW9pUWI4WkpQR3BRAAAAAABm2QMWZ05FLXM1T1FUTHE5VnpUYkpMS0FwUQAAAAAAE7BuFnQ2aFh2aVBQVDlpR3dSc1ppa1Uza2cAAAAAAJdFoBZFeVVudDVaM1RJLW9pUWI4WkpQR3BR}"
}

But I got this error:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Cannot parse scroll id"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Cannot parse scroll id",
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "Illegal base64 character 7b"
    }
  },
  "status": 400
}

I don't know whether I did something wrong. I saw this passage on https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html:

If the request specifies aggregations, only the initial search response will contain the aggregations results.

dadoonet · January 27, 2018, 12:13pm

It's not a bug IMHO.

When you want to extract millions of records, it would be useless to compute the aggregations again and again, since you already got the result once when you started the extraction.

Pelin_li · January 27, 2018, 12:21pm

Thanks for your reply.

I'm sorry, but I don't quite understand. Does it mean that after the first search with the range query, I can't use the scroll_id to continue to the next 10000 documents?

dadoonet · January 27, 2018, 1:20pm

Sorry. I misread and was confused by the title which does not reflect the question.

Anyway this should be:

GET /_search/scroll
{
   "scroll": "1m",
   "scroll_id": "DnF1ZXJ5VGhlbkZldGNoBQAAAAAAZtkCFmdORS1zNU9RVExxOVZ6VGJKTEtBcFEAAAAAAJdFnxZFeVVudDVaM1RJLW9pUWI4WkpQR3BRAAAAAABm2QMWZ05FLXM1T1FUTHE5VnpUYkpMS0FwUQAAAAAAE7BuFnQ2aFh2aVBQVDlpR3dSc1ppa1Uza2cAAAAAAJdFoBZFeVVudDVaM1RJLW9pUWI4WkpQR3BR"
}
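For extracting the whole time range, the two requests above can be chained in a loop. The following is only a minimal Python sketch, not an official client: `send` is a hypothetical callable (method, path, body) that performs the HTTP request and returns the parsed JSON response, and `scroll_hits` is a name made up for this example. It relies on the scroll API returning an empty `hits` array once there is nothing left to fetch.

```python
from typing import Callable, Dict, Iterator

# Hypothetical transport: (method, path, body) -> parsed JSON response.
Send = Callable[[str, str, Dict], Dict]


def scroll_hits(send: Send, index: str, query: Dict,
                size: int = 10000, keep_alive: str = "1m") -> Iterator[Dict]:
    """Yield every hit matching `query`, one scroll page at a time."""
    # The first request is a normal search; ?scroll= opens the scroll context.
    page = send("GET", f"/{index}/_search?scroll={keep_alive}",
                {"query": query, "size": size})
    # An empty hits array means the scroll is exhausted.
    while page["hits"]["hits"]:
        yield from page["hits"]["hits"]
        # Send the _scroll_id value exactly as returned, with no extra braces.
        page = send("GET", "/_search/scroll",
                    {"scroll": keep_alive, "scroll_id": page["_scroll_id"]})
```

With a real transport plugged in as `send`, `for hit in scroll_hits(send, "logstash-2018.01.22", range_query): ...` would walk through all 506943 hits in batches of 10000, where `range_query` stands for the `bool`/`range` query from the first request.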

Pelin_li · January 27, 2018, 1:30pm

Oh, I'm very sorry. My search was wrong: the scroll_id in my /_search/scroll request had an extra { and } around it.

I'm sorry again for my carelessness. And thanks for your patience!
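The error message already hinted at this: in "Illegal base64 character 7b", 7b is the hexadecimal ASCII code of `{`. As a rough illustration, the Python sketch below lists the characters in a string that fall outside the base64 alphabet. The helper name `illegal_base64_chars` is made up for this example, and the sample value is only a shortened prefix of the scroll id above, wrapped in braces as in the failing request.

```python
import string

# Characters accepted by standard or URL-safe base64, plus '=' padding.
BASE64_ALPHABET = set(string.ascii_letters + string.digits + "+/-_=")


def illegal_base64_chars(value: str):
    """Return (position, character, hex code) for every non-base64 character."""
    return [(i, ch, format(ord(ch), "02x"))
            for i, ch in enumerate(value)
            if ch not in BASE64_ALPHABET]


# Shortened sample id wrapped in braces, mimicking the failing request.
wrapped = "{DnF1ZXJ5VGhlbkZldGNo}"
print(illegal_base64_chars(wrapped))             # the braces are reported
print(illegal_base64_chars(wrapped.strip("{}")))  # nothing is reported
```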

system · February 24, 2018, 1:30pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
