Fetching 50,000 documents, not sorted


(Ziv) #1

Hey all,

Sometimes we need to do client-side joins, because we do not want to denormalise all the data due to capacity and storage constraints.

From what I can see, a query that simply matches all documents and returns all of them fetches around 50,000 documents per second.
This is all done on a single index with a single shard.
The machine has an SSD with 80k read IOPS and an 8-core CPU.

This is using the Python API.

With MySQL for example, we can fetch 1M rows per second.
Is there any way to tweak the settings to improve Elasticsearch's fetch throughput?
It doesn't seem like a network issue, as it doesn't use much bandwidth.


(Mark Walkom) #2

Why not use scan and scroll instead?
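
For reference, a minimal sketch of what a scroll-based fetch could look like with the Python client. It assumes `client` is an `elasticsearch.Elasticsearch` instance; the function name `scroll_all`, the page size and the keep-alive value are made up for illustration, and the exact keyword arguments differ between client versions. Sorting by `_doc` is the replacement for the old scan search type on newer Elasticsearch releases; on older ones you would pass `search_type="scan"` instead.

```python
from typing import Any, Dict, Iterator


def scroll_all(client: Any, index: str, page_size: int = 5000,
               keep_alive: str = "2m") -> Iterator[Dict[str, Any]]:
    """Yield every hit in `index`, one scroll page at a time (hypothetical helper)."""
    # Sorting by _doc returns documents in index order, so no scoring
    # or sorting work is done on the server side.
    page = client.search(
        index=index,
        scroll=keep_alive,
        size=page_size,
        body={"query": {"match_all": {}}, "sort": ["_doc"]},
    )
    scroll_id = page["_scroll_id"]
    try:
        while page["hits"]["hits"]:
            for hit in page["hits"]["hits"]:
                yield hit
            # Ask for the next page and keep the scroll context alive.
            page = client.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = page["_scroll_id"]
    finally:
        # Release the server-side scroll context when done or on error.
        client.clear_scroll(scroll_id=scroll_id)
```

Usage would be something like `for hit in scroll_all(client, "my-index"): ...`, where `"my-index"` is a placeholder. The `elasticsearch.helpers.scan` helper wraps the same loop if you would rather not write it by hand.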


(Ziv) #3

This seemed a bit faster than the regular fetch, but not by much.
I got around 80k documents per second.


(Christian Dahlqvist) #4

Have you tried using a larger number of shards?
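
A rough sketch of how that could be tried from the Python client, assuming `client` is an `elasticsearch.Elasticsearch` instance. The helper names and the shard count are hypothetical, and whether more shards actually raise fetch throughput for this workload is something to measure rather than assume. Note that `number_of_shards` is fixed when an index is created, so the existing data would have to be reindexed into the new index.

```python
from typing import Any, Dict


def index_body(number_of_shards: int, number_of_replicas: int = 0) -> Dict[str, Any]:
    """Build index settings with the given primary shard count (hypothetical helper)."""
    if number_of_shards < 1:
        raise ValueError("number_of_shards must be at least 1")
    return {
        "settings": {
            "number_of_shards": number_of_shards,
            "number_of_replicas": number_of_replicas,
        }
    }


def create_sharded_index(client: Any, name: str, number_of_shards: int) -> Any:
    """Create a new, empty index with several primary shards."""
    return client.indices.create(index=name, body=index_body(number_of_shards))
```

For example, `create_sharded_index(client, "docs-8-shards", 8)` would create an empty eight-shard index (the index name is a placeholder) that the same match-all fetch can then be benchmarked against.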

