The absolute time taken depends on the cluster resources, of course. On my
laptop, for 1000 docs of ~1 KB each on average, the 'took' field of a
scroll response usually shows ~200-500 ms. Processing the response hits
takes additional time on top of that.
I am not sure the total number of shards is relevant. There are more
important factors: number of shards per node, shard size, buffers and heap
memory, network compression, network speed, node workload...
If you are interested in a Java scan/scroll example, you can peek into the
knapsack plugin source
Critical for a scalable scan/scroll is a reasonable timeout. In the
knapsack plugin, I use a default of 30 seconds.
In the ES docs, a timeout of 10 minutes is used, which does not seem very
helpful, as it will put pressure on your heap in almost all cases of
long-running scan/scroll...
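To illustrate the short-timeout idea, here is a minimal sketch of a scroll
loop over the REST API. It is not the knapsack plugin code. The `send`
callable is a hypothetical transport that you would supply yourself (it takes
method, path, query params and JSON body, and returns the parsed response),
and `scroll_hits` is a name I made up. The sketch assumes a plain scroll
sorted by `_doc`; with `search_type=scan` the first response carries no hits,
so the loop would have to fetch one page before checking for emptiness.
The timeout only needs to cover the processing of one page, because every
scroll request renews it.

```python
from typing import Any, Callable, Dict, Iterator

# Hypothetical transport: send(method, path, params, body) -> parsed JSON dict.
Send = Callable[[str, str, Dict[str, str], Dict[str, Any]], Dict[str, Any]]


def scroll_hits(send: Send, index: str, page_size: int = 1000,
                keep_alive: str = "30s") -> Iterator[Dict[str, Any]]:
    """Yield all hits of `index`, renewing the scroll timeout on every page."""
    # The initial search opens the scroll context with a short keep-alive.
    resp = send("POST", "/%s/_search" % index, {"scroll": keep_alive},
                {"size": page_size, "query": {"match_all": {}},
                 "sort": ["_doc"]})
    scroll_id = resp.get("_scroll_id")
    try:
        while True:
            hits = resp["hits"]["hits"]
            if not hits:
                break  # an empty page means the scroll is exhausted
            for hit in hits:
                yield hit
            # Each scroll request resets the timeout to keep_alive again.
            resp = send("POST", "/_search/scroll", {},
                        {"scroll": keep_alive, "scroll_id": scroll_id})
            scroll_id = resp.get("_scroll_id", scroll_id)
    finally:
        # Release the scroll context early instead of waiting for the timeout.
        if scroll_id:
            send("DELETE", "/_search/scroll", {}, {"scroll_id": scroll_id})
```

If processing a single page can take longer than 30 seconds, a smaller
`page_size` is probably a better fix than a longer `keep_alive`.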
Jörg