Fetching 50,000 documents, not sorted

ziv2081 · October 2, 2015, 2:23pm

Hey all,

Sometimes we need to do client side joins as we do not want to denormalise all the data due to capacity and storage issues.

From what I can tell, a query that simply matches all documents and returns all of them fetches around 50,000 documents per second.
This is all done on a single index with a single shard.
The node has an SSD with 80k read IOPS and an 8-core CPU.

This is using the Python API.

With MySQL for example, we can fetch 1M rows per second.
Is there any way to tweak the settings to improve Elasticsearch's fetch throughput?
It doesn't seem like a network issue as it doesn't use much bandwidth.

warkolm · October 4, 2015, 2:11am

Why not use scan and scroll instead?
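
For reference, here is a minimal sketch of what a scroll-based fetch could look like from Python. It assumes an `elasticsearch-py` style client exposing `search`, `scroll` and `clear_scroll`. The function name `scroll_all`, the index name `docs`, the page size and the keep-alive value are made up for illustration, and exact parameter names can differ between client versions. Sorting by `_doc` asks Elasticsearch to skip scoring and return hits in index order, which fits the "not sorted" case; older releases used a dedicated scan search type for the same purpose.

```python
from typing import Any, Dict, Iterator


def scroll_all(client: Any, index: str, page_size: int = 5000,
               keep_alive: str = "2m") -> Iterator[Dict[str, Any]]:
    """Yield every hit in `index` by walking a scroll cursor page by page."""
    # First request opens the scroll context and returns the first page.
    page = client.search(
        index=index,
        body={"query": {"match_all": {}}, "sort": ["_doc"]},
        scroll=keep_alive,
        size=page_size,
    )
    scroll_id = page.get("_scroll_id")
    try:
        # An empty page means the cursor is exhausted.
        while page["hits"]["hits"]:
            for hit in page["hits"]["hits"]:
                yield hit
            page = client.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = page.get("_scroll_id", scroll_id)
    finally:
        # Release the server-side scroll context once we are done.
        if scroll_id:
            client.clear_scroll(scroll_id=scroll_id)


# Hypothetical usage against a local node:
#   from elasticsearch import Elasticsearch
#   es = Elasticsearch()
#   for hit in scroll_all(es, "docs"):
#       handle(hit["_source"])
```

The `elasticsearch.helpers.scan` helper in the Python client wraps the same loop, so in practice it may be simpler to call that instead of hand-rolling it.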

ziv2081 · October 7, 2015, 8:48am

This was a bit faster than the regular fetch, but not by much.
I got around 80k documents per second.

Christian_Dahlqvist · October 7, 2015, 9:07am

Have you tried using a larger number of shards?
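
The number of primary shards is normally chosen when an index is created, so trying this generally means creating a new index and reindexing into it. A minimal sketch of the settings body, assuming the standard `number_of_shards` and `number_of_replicas` index settings; the helper name `index_body`, the index name `docs_v2` and the shard count of 8 are purely illustrative, not a recommendation:

```python
from typing import Any, Dict


def index_body(shards: int, replicas: int = 0) -> Dict[str, Any]:
    """Build a create-index body with the given primary shard count."""
    if shards < 1:
        raise ValueError("an index needs at least one primary shard")
    return {
        "settings": {
            "number_of_shards": shards,      # primaries, fixed at creation
            "number_of_replicas": replicas,  # can be changed later
        }
    }


# Hypothetical usage with an elasticsearch-py style client:
#   es.indices.create(index="docs_v2", body=index_body(8))
```

With several shards, a single search fans out to all of them, so the work of collecting documents can be spread across more of the 8 cores. Whether that helps here would have to be measured.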
