Get all results at once in Elasticsearch SQL syntax

legendu · October 25, 2018, 8:09am

I have a simple SQL query in Elasticsearch which I know returns less than 100 rows of results. How can I get all these results at once (i.e., without using scroll)? I tried the limit n clause but it works when n is less than or equal to 10 but doesn't work when n is great than 10.

fkelbert · October 25, 2018, 1:02pm

According to the documentation, you should be able to use LIMIT ALL. Does this work?

legendu · October 25, 2018, 11:32pm

I tried. It didn't work. It still returned 10 rows of result. Perhaps I should file a ticket on this?

nik9000 · October 25, 2018, 11:46pm

If you fo it in the SQL console you'll get everything. I'd you do it in jdbc you'll get everything. If you do it over the http API you need to use the cursor that it returns. It is documented on the SQL API page. I'm on mobile so no link, sorry. Jdbc and the SQL cli all use the cursor mechanism over the http API.

Andrei_Stefan · October 26, 2018, 8:20am

@legendu how do you run the query and, also, please share the query command (the full JSON if it's the REST API, or java code if it's JDBC). Thanks.

legendu · October 28, 2018, 9:08am

@nik9000
Thank you very much for the explanation! I used the http API. Now everything makes sense to me.

But how can I use the scroll API to get all results from a SQL query using http API? The scroll API returns scroll_ids while the SQL query returns a cursor ID. I tried to feed the cursor ID returned by a SQL query into a scroll API but it failed to work.

legendu · October 28, 2018, 9:12am

@Andrei_Stefan Please see my Python code for querying ES using requests below.

import requests
import json

url = 'http://10.204.61.127:9200/_xpack/sql'
headers = {
   'Content-Type': 'application/json',
}
query = {
   'query': '''
      select
          date_start,
          sum(spend) as spend
       from
           some_index
       where
           campaign_id = 790
           or
           campaign_id = 490
       group by
           date_start
   '''
}
response = requests.post(url, headers=headers, data=json.dumps(query))

Andrei_Stefan · October 29, 2018, 8:09am

@legendu the cursor ID you get from the SQL API needs to be feed back in the same SQL API to get the next batch of results. Snippet from our documentation:

You can continue to the next page by sending back the cursor field. In case of text format the cursor is returned as Cursor http header.

POST /_xpack/sql?format=json { "cursor": "sDXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAAEWYUpOYklQMHhRUEtld3RsNnFtYU1hQQ==:BAFmBGRhdGUBZgVsaWtlcwFzB21lc3NhZ2UBZgR1c2Vy9f///w8=" }

legendu · October 29, 2018, 2:03pm

@Andrei_Stefan Thank you, Andrei! I had a try but it still gave me at most 10 rows of results. The second call on the cursor returns no result at all. I'm sure that there are more than 10 rows of results in the index. Any idea what went wrong?

And BTW, do you know the equivalent native syntax of the SQL query above? I'm so to ask for this but I'm very new to ES (and that's the reason I tried the SQL API). I'd like to verify that the equivalent native query returns the right answer.

Andrei_Stefan · October 30, 2018, 7:09am

Take the query above and send it to translate API: SQL Translate API | Elasticsearch Guide [master] | Elastic. It will give the native ES query that you can use to test.

legendu · November 4, 2018, 11:40am

I tried the native query returned by the SQL translate API on the SQL query with limit all. However, it still returns only 10 rows of results. How can I make it return all results (less than 100 rows in total)?

import requests
import json


url = 'http://10.204.61.127:9200/some_index/some_doc/_search'
headers = {
    'Content-Type': 'application/json',
}
query = {
    "size": 0,
    "query": {
        "bool": {
            "should": [
                {
                    "term": {
                        "campaign_id.keyword": {
                            "value": 790,
                            "boost": 1.0
                        }
                    }
                },
                {
                    "term": {
                        "campaign_id.keyword": {
                            "value": 490,
                            "boost": 1.0
                        }
                    }
                }
            ],
            "adjust_pure_negative": True,
            "boost": 1.0
        }
    },
    "_source": False,
    "stored_fields": "_none_",
    "aggregations": {
        "groupby": {
            "composite": {
                "size": 1000,
                "sources": [
                    {
                        "2735": {
                            "terms": {
                                "field": "date_start",
                                "missing_bucket": False,
                                "order": "asc"
                            }
                        }
                    }
                ]
            },
            "aggregations": {
                "2768": {
                    "sum": {
                        "field": "spend"
                    }
                }
            }
        }
    }
}
response = requests.post(url, headers=headers, data=json.dumps(query)).json()

Andrei_Stefan · November 5, 2018, 1:56pm

If you used that ES DSL and you got back 10 records... maybe that is what you have in the index that matches that query. Please, note that the query uses "size": 10 which means the results will contain the aggregations results only, which is expected. Also, the composite aggregation there has "size": 1000 which means the result will contain first 1000 buckets.

So, if you say there are more than 10 rows you need to get back, this query (and its aggregation) should get you back all of it. To me, this means you have 10 buckets to be returned. Or maybe you mixed up number of documents with number of buckets? (you are using a group by in there, that's why you have "size": 0 and a composite aggregation in the query)

system · December 3, 2018, 2:04pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
How to use pagination in Elastic Search SQL? Elasticsearch	5	4912	August 16, 2019
ElasticSearch SQL pagination and format Elasticsearch elastic-stack-sql	3	2748	July 23, 2018
How to get all the outputs in one run using sql api? Elasticsearch	7	456	July 16, 2019
Elasticsearch SQL retrieve more than 1000 Elasticsearch elastic-stack-sql	2	5488	October 30, 2020
How to get data in chunks \| need for pagination Elasticsearch elastic-stack-sql	4	1566	February 14, 2019

Get all results at once in Elasticsearch SQL syntax

Related topics