Superslow simple query


(Ted Smith) #1

Hi:

I am using termsFilter with a list of 50,000 integers.
It takes about 3 minutes to get a response.
The results are not sorted in any way; I just want to get the list of docs.

In another scenario, I simply want to read all docs back without any sorting
or intermediate processing. I tried using scrolling, but it takes forever
to get a response when there are more than 100,000 rows.

I'd expect the db to return results quickly, since it already has the raw data
and does not need to arrange/pack/sort it. Is there any way to have a streaming mode,
or any way to write a plugin to enable a streaming mode?
(I am not interested in all the stats data in the response; I just want the values
streamed back right away.)

Any help would be appreciated
Thanks


(Mark Harwood) #2

50,000 random terms and doc fetches is at least 50,000 random disk seeks.
If you have spinning media, let's assume a random disk seek takes 5 ms.

50,000 * 5 ms = 250 s, or roughly 4 minutes.

RAM will help with OS file system caching.
SSDs will help with faster disk seeks when you have a cache miss.
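
As a quick check of the estimate above, here is a minimal sketch of the arithmetic. The function name is made up for illustration, and the 5 ms seek time is the same assumption as in the post, not a measured value.

```python
def seek_bound_seconds(num_fetches, seek_ms):
    """Lower bound on total time if every fetch costs one random disk seek."""
    return num_fetches * seek_ms / 1000.0


# Assumed figures from the post: 50,000 fetches at 5 ms per seek.
total = seek_bound_seconds(50_000, 5)
print(total, "seconds, about", round(total / 60, 1), "minutes")
```

Anything served from the OS file system cache skips the seek entirely, so the real figure depends heavily on how much of the index fits in RAM.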


(Ted Smith) #3

Thanks for the info. Is there any way (even by writing a plugin) to stream each doc back as soon
as it is read? It is OK if the disk reads take this much time, but I'd like each doc to be
returned immediately while the next one is being read. Right now, the client just sits there
waiting for 4 minutes before getting the first element, even though ES should have it within
the first 4 ms and could send it back.


(Mark Harwood) #4

For bulk reads see https://www.elastic.co/guide/en/elasticsearch/guide/current/scan-scroll.html
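
A rough sketch of what reading with scroll can look like from a client, so that the first docs arrive after one page has been read rather than after the whole result set. This is only an illustration under some assumptions: the request shapes follow the scroll API of recent Elasticsearch versions (the older scan search type described in the linked guide differs slightly), and `http_post`, `scroll_docs`, the page size, and the index name are hypothetical names and values, not anything from the original thread.

```python
import json
import urllib.request


def http_post(base_url, path, body):
    """POST a JSON body to the cluster and decode the JSON response."""
    req = urllib.request.Request(
        base_url + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def scroll_docs(post, index, page_size=1000, keep_alive="1m"):
    """Yield hits one at a time; the next page is fetched only when needed.

    `post` is any callable taking (path, body) and returning the decoded
    JSON response, which keeps the paging logic independent of the transport.
    """
    # Opening request: sorting by _doc asks for index order, the cheapest order.
    page = post(
        "/%s/_search?scroll=%s" % (index, keep_alive),
        {"size": page_size, "sort": ["_doc"], "query": {"match_all": {}}},
    )
    # An empty page of hits signals the end of the scroll.
    while page["hits"]["hits"]:
        for hit in page["hits"]["hits"]:
            yield hit
        page = post(
            "/_search/scroll",
            {"scroll": keep_alive, "scroll_id": page["_scroll_id"]},
        )


# Usage (assumes a node on localhost:9200 and an index named "my_index"):
#   from functools import partial
#   post = partial(http_post, "http://localhost:9200")
#   for hit in scroll_docs(post, "my_index"):
#       print(hit["_id"])
```

Because `scroll_docs` is a generator, the caller can start processing the first page while later pages are still unread. A smaller page size gives a faster first result at the cost of more round trips.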

