No, I did not say that. Or at least I did not mean that. Sorry if it was unclear.
I said: don’t use large sizes:
Never use size:10000000 or from:10000000.
You should read this: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-scroll.html#scroll-scan
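As a rough illustration of the scan/scroll loop described on that page, here is a minimal Python sketch. It assumes the 1.x-era REST API discussed in this thread (search_type=scan on the first request, then repeated calls to /_search/scroll). The `send` callable is hypothetical: it stands for whatever HTTP helper you use, taking a method, path, query parameters, and body, and returning the decoded JSON response. The index name and batch size in the usage note are placeholders too.

```python
import json


def scan_scroll(send, index, query, size=1000, keep_alive="1m"):
    """Yield every hit matching `query`, fetched in batches.

    `send(method, path, params, body)` is a hypothetical helper that
    performs the HTTP request and returns the parsed JSON response.
    """
    # Open the scan. This first response carries a scroll id but no hits.
    resp = send(
        "GET",
        f"/{index}/_search",
        {"search_type": "scan", "scroll": keep_alive},
        json.dumps({"query": query, "size": size}),
    )
    scroll_id = resp["_scroll_id"]

    while True:
        # Each scroll call returns the next batch plus the id for the next call.
        resp = send("GET", "/_search/scroll", {"scroll": keep_alive}, scroll_id)
        hits = resp["hits"]["hits"]
        if not hits:
            break  # an empty batch means the scroll is exhausted
        for hit in hits:
            yield hit
        scroll_id = resp["_scroll_id"]
```

Usage would look like `for hit in scan_scroll(send, "accounts", {"match_all": {}}, size=1000): ...`, so the caller never has to ask for one huge page.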
--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr | @scrutmydocs https://twitter.com/scrutmydocs
On 10 Dec 2014 at 21:16, Ron Sher ron.sher@gmail.com wrote:
So you're saying there's no impact on elasticsearch if I issue a large size?
If that's the case, then why shouldn't I just use a size of 1M if I want to make sure I get everything?
On Wednesday, December 10, 2014 8:22:47 PM UTC+2, David Pilato wrote:
Scan/scroll is the best option to extract a huge amount of data.
Never use size:10000000 or from:10000000.
It's not realtime because you basically scroll over a given set of segments, and changes that arrive in new segments won't be taken into account during the scroll.
Which is good because you won't get inconsistent results.
About size, I would try and test. It depends on the size of your docs, I believe.
Try with 10000 and see how it goes when you increase it. You may discover that getting 10*10000 docs is the same as 1*100000.
Best
David
On 10 Dec 2014 at 19:09, Ron Sher <ron....@gmail.com> wrote:
Hi,
I was wondering about best practices for getting all data that matches some filters.
The options as I see them are:
1. Use a very big size that will return all accounts, i.e. use some value like 1m to make sure I get everything back (even if I need just a few hundred or a few dozen documents). This is the quickest way, development-wise.
2. Use paging with size and from. This requires looping over the results, and performance gets worse as we advance to later pages. Also, we need to use preference if we want consistent results across pages, and it's not clear what the recommended size for each page is.
3. Use scan/scroll. This gives consistent paging but also has several drawbacks: if I use search_type=scan, the results can't be sorted; scan/scroll is (maybe) less performant than paging (the documentation says it's not for realtime use); and again it's not clear which size is recommended.
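To make the cost of deep paging with from and size concrete, here is a small back-of-the-envelope sketch in Python. It relies on the way a distributed from/size search works: every shard has to produce its own top from + size entries, and the coordinating node merges all of them before discarding everything except the requested page. The function name and the shard count of 5 are assumptions chosen for illustration, not values from this thread.

```python
def entries_to_merge(from_, size, shards):
    """Entries the coordinating node must merge for one from/size page."""
    # Any shard could hold documents that belong on the requested page,
    # so each one returns its own top (from + size) entries.
    per_shard = from_ + size
    return per_shard * shards


# Assumed setup: 5 shards, 10 documents per page.
for page_start in (0, 1000, 100000, 10000000):
    print(page_start, entries_to_merge(page_start, 10, shards=5))
```

Under these assumptions the first page merges 50 entries, while from:10000000 merges 50000050 to return the same 10 documents. That growth is why later pages get slower and why very large from or size values are best avoided.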
So you see - many options and not clear which path to take.
What do you think?
Thanks,
Ron
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearc...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/764a37c5-1fec-48c4-9c66-7835d8141713%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.