I'm using the Java API to index and search in elasticsearch. I noticed that
when the search results are extremely big (for instance, more than 200,000
hits), the scroll hangs after it has iterated through about 50,000 records.
For instance, I have the following code to produce a search response:
and then I do the following to scroll through the results:
while (true) {
    // Fetch the next batch of hits for this scroll id
    scrollResp = client.prepareSearchScroll(scrollResp.getScrollId())
            .setScroll(new TimeValue(1000))
            .execute().actionGet();
    boolean hitsRead = false;
    for (SearchHit hit : scrollResp.getHits()) {
        hitsRead = true;
        def sourceMap = hit.getSource();
        sourceMap._id = hit.getId();
        resultSet.result.add(sourceMap);
    }
    // An empty batch means the scroll is exhausted
    if (!hitsRead) {
        break;
    }
}
The prepareSearchScroll() call tends to hang, or sometimes the for loop hangs
while reading the next record. Can't elasticsearch handle such a big number of
hits, or should the search be restricted to produce fewer hits?
Scan is aimed at scrolling through large amounts of data. Do you see any failures
in the logs on the cluster nodes? Iterating through the hits
will definitely not hang, since they are all already present in the search
response; maybe your client does not have enough memory allocated to it to do
the scrolling? Try using a smaller size (like 100) and see if it helps.
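
Roughly something like the sketch below: a scan-type search with a small page
size, followed by the same kind of scroll loop. This is an untested sketch, not
your exact code: the index name "myindex", the match_all query, and the 60-second
keep-alive are placeholders to adjust to your setup, and it assumes the usual
Java client imports (SearchResponse, SearchType, QueryBuilders, TimeValue,
SearchHit) plus an existing client instance.

// Sketch only: placeholder index, query, and keep-alive values
SearchResponse scrollResp = client.prepareSearch("myindex")      // placeholder index name
        .setSearchType(SearchType.SCAN)                          // scan: no scoring/sorting, just bulk retrieval
        .setScroll(new TimeValue(60000))                         // keep the scroll context alive for 60s (assumption)
        .setQuery(QueryBuilders.matchAllQuery())                 // placeholder query
        .setSize(100)                                            // 100 hits per shard per scroll round-trip
        .execute().actionGet();

while (true) {
    scrollResp = client.prepareSearchScroll(scrollResp.getScrollId())
            .setScroll(new TimeValue(60000))                     // renew the keep-alive on every round-trip
            .execute().actionGet();
    if (scrollResp.getHits().getHits().length == 0) {
        break;                                                   // no more hits: the scroll is exhausted
    }
    for (SearchHit hit : scrollResp.getHits()) {
        // process hit.getSource() / hit.getId() here
    }
}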