Scrolling performance

jprante · January 19, 2014, 8:29pm

The absolute time taken depends on the cluster resources, of course. On my
laptop, for 1000 docs of about 1 KB each on average, the 'took' field of a
scroll response usually shows ~200-500 ms. Processing the response hits takes
additional time on top of that.
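To see how much of the time is spent on each side, you can compare the server-reported 'took' value with the wall-clock time for fetching and processing one batch. This is only a rough sketch: `fetch` and `process` are hypothetical callables standing in for your own scroll request and hit handling, and the response is assumed to be the decoded JSON as a dict.

```python
import time


def timed_batch(fetch, process):
    """Fetch one scroll batch, process its hits, and report both timings.

    fetch   -- hypothetical callable returning the decoded scroll response (dict)
    process -- hypothetical callable invoked once per hit
    Returns (took_ms, total_ms): the server-side 'took' value and the
    client-side wall-clock time for the request plus hit processing.
    """
    start = time.perf_counter()
    response = fetch()
    for hit in response["hits"]["hits"]:
        process(hit)
    total_ms = (time.perf_counter() - start) * 1000.0
    return response["took"], total_ms
```

If `total_ms` is much larger than `took_ms`, the time is going into the network transfer or into your own processing rather than into the search itself.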

I am not sure the number of shards is relevant. There are more important
factors: number of shards per node, shard size, buffers and heap memory,
network compression, network speed, node workload...

If you are interested in a Java scan/scroll example, you can peek into the
knapsack plugin source

https://github.com/jprante/elasticsearch-knapsack/blob/master/src/main/java/org/xbib/elasticsearch/action/RestExportAction.java#L310

Critical for a scalable scan/scroll is a reasonable timeout. In the
knapsack plugin, I use a default of 30 seconds.

The ES docs use a timeout of 10 minutes

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-search-type.html

which does not seem very helpful, as it will put pressure on your heap in
almost all cases of long-running scan/scroll...
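A short timeout works because the keep-alive is renewed with every scroll request, so it only has to cover the time needed to process one batch, not the whole export. Below is a minimal sketch of such a loop, not the knapsack plugin's code. It assumes the REST request shapes from the scan/scroll docs of that time (`search_type=scan` on the first request, then `/_search/scroll` with the scroll id as the body). `send(path, params, body)` is a hypothetical transport callable that performs the HTTP request and returns the decoded JSON, and the `match_all` query and the defaults are just placeholders.

```python
def scan_scroll(send, index, keep_alive="30s", size=1000):
    """Yield all hits of a scan/scroll, renewing the keep-alive on each request.

    send -- hypothetical callable (path, params, body) -> decoded JSON dict
    """
    query = '{"query": {"match_all": {}}}'  # placeholder query
    # The initial scan request returns a scroll id but no hits.
    response = send(
        "/%s/_search" % index,
        {"search_type": "scan", "scroll": keep_alive, "size": size},
        query,
    )
    scroll_id = response["_scroll_id"]
    while True:
        # Each scroll request passes keep_alive again, which resets the timer.
        response = send("/_search/scroll", {"scroll": keep_alive}, scroll_id)
        hits = response["hits"]["hits"]
        if not hits:
            break  # an empty batch means the scroll is exhausted
        # Always continue with the scroll id from the latest response.
        scroll_id = response["_scroll_id"]
        for hit in hits:
            yield hit
```

With a loop like this, a search context that is abandoned halfway expires after 30 seconds instead of holding resources on the nodes for 10 minutes.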

Jörg

