Elasticsearch with Java: Use scroll API for queries written in JSON format

I have a query in JSON format, taken from Kibana. My Java code runs this same query with RestClient, and that works fine.
Now the data being fetched is huge, so I want to use the scroll API to fetch it in chunks.
The examples I could find for the scroll API all use RestHighLevelClient. The problem is that my query is in JSON format, and passing it in throws the exception "no [query] registered for [aggs]".

I see two possible solutions to my problem:

  1. Connect to ES with RestClient, use the scroll API, pass my query to it as JSON, and get the output.
  2. Connect to ES with RestHighLevelClient, use the scroll API, pass my query to it as JSON, and get the output.

At present, neither approach works. Any help or pointers would be appreciated.

You can't use aggs in a scroll query. What does your query/code look like exactly?

String query = "";
     
    query ="\"\n" +
            "\n" +
            "{\n" +
            "   \"aggs\": {\n" +
            "      \"term1\": {\n" +
            "         \"terms\": {\n" +
            "            \"field\": \"term1.keyword\",\n" +
            "            \"size\": 500000,\n" +
            "            \"order\": {\n" +
            "               \"1\": \"desc\"\n" +
            "            }\n" +
            "         },\n" +
            "         \"aggs\": {\n" +
            "            \"1\": {\n" +
            "               \"cardinality\": {\n" +
            "                  \"field\": \"term2\"\n" +
            "               }\n" +
            "            },\n" +
            "            \"term3\": {\n" +
            "               \"terms\": {\n" +
            "                  \"field\": \"term4\",\n" +
            "                  \"size\": 120,\n" +
            "                  \"order\": {\n" +
            "                     \"1\": \"asc\"\n" +
            "                  }\n" +
            "               },\n" +
            "               \"aggs\": {\n" +
            "                  \"1\": {\n" +
            "                     \"cardinality\": {\n" +
            "                        \"field\": \"term2\"\n" +
            "                     }\n" +
            "                  },\n" +
            "                  \"term5\": {\n" +
            "                     \"terms\": {\n" +
            "                        \"field\": \"term6\",\n" +
            "                        \"size\": 1,\n" +
            "                        \"order\": {\n" +
            "                           \"term5-orderAgg\": \"desc\"\n" +
            "                        }\n" +
            "                     },\n" +
            "                     \"aggs\": {\n" +
            "                        \"1\": {\n" +
            "                           \"cardinality\": {\n" +
            "                              \"field\": \"term2\"\n" +
            "                           }\n" +
            "                        },\n" +
            "                        \"term5-orderAgg\": {\n" +
            "                           \"max\": {\n" +
            "                              \"field\": \"term7\"\n" +
            "                           }\n" +
            "                        }\n" +
            "                     }\n" +
            "                  }\n" +
            "               }\n" +
            "            }\n" +
            "         }\n" +
            "      }\n" +
            "   },\n" +
            "   \"size\": 0,\n" +
            "   \"_source\": {\n" +
            "      \"excludes\": []\n" +
            "   },\n" +
            "   \"stored_fields\": [\n" +
            "      \"*\"\n" +
            "   ],\n" +
            "   \"script_fields\": {},\n" +
            "   \"docvalue_fields\": [\n" +
            "      {\n" +
            "         \"field\": \"@timestamp\",\n" +
            "         \"format\": \"date_time\"\n" +
            "      },\n" +
            "      {\n" +
            "         \"field\": \"Timer_val\",\n" +
            "         \"format\": \"date_time\"\n" +
            "      },\n" +
            "      {\n" +
            "         \"field\": \"Total_time\",\n" +
            "         \"format\": \"date_time\"\n" +
            "      }\n" +
            "   ],\n" +
            "   \"query\": {\n" +
            "      \"bool\": {\n" +
            "         \"must\": [\n" +
            "            {\n" +
            "               \"match_all\": {}\n" +
            "            },\n" +
            "            {\n" +
            "               \"range\": {\n" +
            "                  \"@timestamp\": {\n" +
            "                     \"gte\": 1522235348426,\n" +
            "                     \"lte\": 1553771348426,\n" +
            "                     \"format\": \"epoch_millis\"\n" +
            "                  }\n" +
            "               }\n" +
            "            },\n" +
            "            {\n" +
            "               \"match_phrase\": {\n" +
            "                  \"term8.keyword\": {\n" +
            "                     \"query\": \"term9\"\n" +
            "                  }\n" +
            "               }\n" +
            "            }\n" +
            "         ],\n" +
            "         \"filter\": [],\n" +
            "         \"should\": [],\n" +
            "         \"must_not\": []\n" +
            "      }\n" +
            "   }\n" +
            "}\n" +
            "\"";

So just keep the query part and remove everything else.
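As a rough illustration of that advice, the sketch below builds a search body that contains only the "query" part of the post above plus a page size. The class and method names (ScrollSearchBody, build, queryFromPost) are made up for this example and are not part of any Elasticsearch client. Note that the original body has "size": 0, which returns no hits at all, so a scroll needs a positive size; 1000 in the usage note is an arbitrary assumption.

```java
// Hypothetical helper for illustration only; not an Elasticsearch client class.
final class ScrollSearchBody {
    private ScrollSearchBody() {}

    /**
     * Wraps a query in a search body with a positive page size.
     * Everything else from the Kibana request (aggs, stored_fields,
     * script_fields, docvalue_fields, _source) is deliberately left out.
     */
    static String build(int pageSize, String queryJson) {
        if (pageSize <= 0) {
            // "size": 0 would return no hits, so there would be nothing to scroll.
            throw new IllegalArgumentException("pageSize must be positive");
        }
        return "{\"size\":" + pageSize + ",\"query\":" + queryJson + "}";
    }

    /** The bool query from the post above, copied without changes. */
    static String queryFromPost() {
        return "{\"bool\":{"
             + "\"must\":["
             + "{\"match_all\":{}},"
             + "{\"range\":{\"@timestamp\":{\"gte\":1522235348426,"
             + "\"lte\":1553771348426,\"format\":\"epoch_millis\"}}},"
             + "{\"match_phrase\":{\"term8.keyword\":{\"query\":\"term9\"}}}"
             + "],"
             + "\"filter\":[],\"should\":[],\"must_not\":[]"
             + "}}";
    }
}
```

Calling ScrollSearchBody.build(1000, ScrollSearchBody.queryFromPost()) gives a body that can be sent as the initial scroll search. Scrolling pages through hits rather than aggregation buckets, so the terms and cardinality results from the original request are not part of what this body returns.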

The above query in structured format:

{
   "aggs": {
      "term1": {
         "terms": {
            "field": "term1.keyword",
            "size": 500000,
            "order": {
               "1": "desc"
            }
         },
         "aggs": {
            "1": {
               "cardinality": {
                  "field": "term2"
               }
            },
            "term3": {
               "terms": {
                  "field": "term4",
                  "size": 120,
                  "order": {
                     "1": "asc"
                  }
               },
               "aggs": {
                  "1": {
                     "cardinality": {
                        "field": "term2"
                     }
                  },
                  "term5": {
                     "terms": {
                        "field": "term6",
                        "size": 1,
                        "order": {
                           "term5-orderAgg": "desc"
                        }
                     },
                     "aggs": {
                        "1": {
                           "cardinality": {
                              "field": "term2"
                           }
                        },
                        "term5-orderAgg": {
                           "max": {
                              "field": "term7"
                           }
                        }
                     }
                  }
               }
            }
         }
      }
   },
   "size": 0,
   "_source": {
      "excludes": []
   },
   "stored_fields": [
      "*"
   ],
   "script_fields": {},
   "docvalue_fields": [
      {
         "field": "@timestamp",
         "format": "date_time"
      },
      {
         "field": "Timer_val",
         "format": "date_time"
      },
      {
         "field": "Total_time",
         "format": "date_time"
      }
   ],
   "query": {
      "bool": {
         "must": [
            {
               "match_all": {}
            },
            {
               "range": {
                  "@timestamp": {
                     "gte": 1522235348426,
                     "lte": 1553771348426,
                     "format": "epoch_millis"
                  }
               }
            },
            {
               "match_phrase": {
                  "term8.keyword": {
                     "query": "term9"
                  }
               }
            }
         ],
         "filter": [],
         "should": [],
         "must_not": []
      }
   }
}

Just keep the query part and remove everything else.
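Once the body contains only "size" and "query", the scroll itself is two REST calls: an initial POST to /<index>/_search?scroll=1m, then repeated POSTs to /_search/scroll with the returned _scroll_id until a page comes back with no hits. The sketch below is a minimal illustration that uses the JDK's java.net.http client so it has no dependencies; the same paths and JSON bodies can be sent through the low-level RestClient instead. The class name RestScroll, the base URL, the index name, and the 1m keep-alive are all assumptions, and the regex-based parsing is a shortcut that a real JSON parser should replace.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.function.Consumer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not an Elasticsearch client class.
final class RestScroll {
    private static final Pattern SCROLL_ID =
            Pattern.compile("\"_scroll_id\"\\s*:\\s*\"([^\"]+)\"");
    // Crude check for an empty hits array; assumes no document field is named "hits".
    private static final Pattern EMPTY_HITS =
            Pattern.compile("\"hits\"\\s*:\\s*\\[\\s*\\]");

    private RestScroll() {}

    /** Extracts the _scroll_id from a raw search response. */
    static String scrollId(String responseJson) {
        Matcher m = SCROLL_ID.matcher(responseJson);
        if (!m.find()) throw new IllegalStateException("no _scroll_id in response");
        return m.group(1);
    }

    /** True when the response carries an empty hits array (end of the scroll). */
    static boolean hasNoHits(String responseJson) {
        return EMPTY_HITS.matcher(responseJson).find();
    }

    /** Body for POST /_search/scroll. */
    static String nextPageBody(String keepAlive, String scrollId) {
        return "{\"scroll\":\"" + keepAlive + "\",\"scroll_id\":\"" + scrollId + "\"}";
    }

    /** Runs the scroll and hands each raw page of JSON to onPage. */
    static void scrollAll(String baseUrl, String index, String searchBody,
                          Consumer<String> onPage) throws IOException, InterruptedException {
        HttpClient http = HttpClient.newHttpClient();
        String page = send(http, "POST", baseUrl + "/" + index + "/_search?scroll=1m", searchBody);
        String id = scrollId(page);
        while (!hasNoHits(page)) {
            onPage.accept(page);
            page = send(http, "POST", baseUrl + "/_search/scroll", nextPageBody("1m", id));
            id = scrollId(page);
        }
        // Release the search context on the server once the scroll is finished.
        send(http, "DELETE", baseUrl + "/_search/scroll", "{\"scroll_id\":\"" + id + "\"}");
    }

    private static String send(HttpClient http, String method, String url, String body)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/json")
                .method(method, HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() >= 300) {
            throw new IOException("HTTP " + response.statusCode() + ": " + response.body());
        }
        return response.body();
    }
}
```

A call might look like RestScroll.scrollAll("http://localhost:9200", "my-index", body, page -> System.out.println(page.length())), where the URL and index name are placeholders and body is the reduced search body.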
