I have an index ('jenkins_job_logs') that stores console logs from Jenkins jobs. The index fields include the actual console data plus a unique ID for the Jenkins job ('task_id'). There are 30k+ documents in the index, so I cannot extract more than 10k in one request
(and I am not going to modify the value of index.max_result_window).
So I am using scrolling instead - but am hitting an issue.
Here is the initial query
elk>cat testquery2.json
{
"_source": {
"includes" : [ "task_id" ],
"excludes" : [ "console_data", "testsuite", "os_name", "host_vm_version", "guest_vm_version" ]
},
"size": 5000,
"query": {
"match_all" : {}
}
}
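For anyone driving this from a script rather than a file on disk, the same request body can be built programmatically. A sketch in Python (the dict mirrors testquery2.json above exactly; nothing here is specific to my setup beyond the field names):

```python
import json

# Same body as testquery2.json, built as a Python dict (sketch).
query = {
    "_source": {
        "includes": ["task_id"],
        "excludes": ["console_data", "testsuite", "os_name",
                     "host_vm_version", "guest_vm_version"],
    },
    "size": 5000,                # page size per scroll batch
    "query": {"match_all": {}},
}

# Serialize for use as the request body (what -d @testquery2.json sends).
body = json.dumps(query)
print(body)
```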
When I run the command below I get 5000 hits back, plus a scroll ID (I have shortened the ID here, as it is ~3600 chars long):
elk>curl -XGET 'http://localhost:9200/jenkins_job_logs-2020.*/_search?scroll=1m&pretty' -H "Content-Type: application/json" -d @testquery2.json
{
"_scroll_id" : "DnF1ZXJ5VGhlbk..<3622 characters long>..ZldGNoVwAAAAAA=="
"took" : 17087,
"timed_out" : false,
"_shards" : {
"total" : 87,
"successful" : 87,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 32314,
"max_score" : 1.0,
"hits" : [
{
"_index" : "jenkins_job_logs-2020.07.28",
"_type" : "doc",
"_id" : "1419",
"_score" : 1.0,
"_source" : {
"task_id" : 187873
}
},
{
"_index" : "jenkins_job_logs-2020.07.28",
"_type" : "doc",
"_id" : "273",
"_score" : 1.0,
"_source" : {
"task_id" : 186542
}
},
...
I then plug the scroll ID above into a new scroll request (as per the docs) and get the error below:
elk>curl -XPOST 'http://localhost:9200/_search/scroll?pretty' -H "Content-Type: application/json" -d '{"scroll" : "1m", "scroll_id" : "DnF1ZXJ5VGhlbk..<3622 characters long>..ZldGNoVwAAAAAA=="}'
{
"error" : {
"root_cause" : [
{
"type" : "search_context_missing_exception",
"reason" : "No search context found for id [1910847]"
},
{
"type" : "search_context_missing_exception",
"reason" : "No search context found for id [1910856]"
},
...
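For reference, the loop I am trying to implement is the standard scroll pattern: run the initial search, then keep feeding the most recent _scroll_id back in, with each call landing inside the keep-alive window. A minimal sketch of that control flow (initial_search and continue_scroll are hypothetical stand-ins for the two curl calls above; each returns the parsed JSON response):

```python
# Sketch of the scroll loop. `initial_search` and `continue_scroll` are
# hypothetical stand-ins for the two HTTP calls shown above; each returns
# the parsed JSON response body.
def scroll_all(initial_search, continue_scroll):
    resp = initial_search()          # GET .../_search?scroll=1m
    hits = []
    while resp["hits"]["hits"]:      # an empty page means we are done
        hits.extend(resp["hits"]["hits"])
        # Always pass back the *latest* scroll ID -- it can change between
        # pages, and each call must arrive before the keep-alive expires.
        resp = continue_scroll(resp["_scroll_id"])
    return hits
```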