Efficient query with a big input filter set?


#1

Hello Elastic experts,

I have a simple index with 100M books, and each book has a unique book ID (a string). I also have a list of 1M book IDs, all of which belong to the 100M indexed books. My query is simple: I just want to retrieve the information stored in the Elasticsearch cluster for those 1M books, i.e. filter the 100M books by the 1M book IDs.

My concern is that if I pass all 1M book IDs in a single query, the request JSON will be huge. Will that cause any problems? On the other hand, sending 1M queries with a single book ID each may add too much per-query overhead. Any good solutions are appreciated.

thanks in advance,
Lin


#2

Hi Elasticsearch experts,

If anyone has any good advice, that would be great.

regards,
Lin


(Mark Harwood) #3

You propose two options: one query with a million terms, or a million queries with a single term each. The preferable option is somewhere between these two extremes, e.g. a thousand queries each with a thousand terms.
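A minimal sketch of that middle ground, assuming the IDs are indexed in a keyword field; the field name `book_id` and the helper names `chunked` and `build_terms_query` are made up for illustration. It only builds the request bodies (using the standard `terms` query inside a `bool` filter); sending them to the cluster is left to whatever client you use.

```python
from typing import Dict, Iterable, Iterator, List


def chunked(ids: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` IDs."""
    batch: List[str] = []
    for book_id in ids:
        batch.append(book_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # leftover IDs that did not fill a whole batch
        yield batch


def build_terms_query(batch: List[str], field: str = "book_id") -> Dict:
    """Build a search body matching any ID in `batch`.

    `field` is an assumed field name; use the one from your mapping.
    `size` is set to the batch length because a search returns only
    10 hits by default.
    """
    return {
        "size": len(batch),
        "query": {"bool": {"filter": [{"terms": {field: batch}}]}},
    }


if __name__ == "__main__":
    all_ids = [f"book-{i}" for i in range(2500)]  # stand-in for the 1M IDs
    bodies = [build_terms_query(b) for b in chunked(all_ids, 1000)]
    print(len(bodies), "request bodies built")
```

The batch size of 1000 just mirrors the example above; it is worth measuring a few sizes against your own cluster before settling on one.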


#4

Thanks Mark,

I would think the former always has better performance?

BTW, do you know if there is a limit on how big the request JSON can be?

regards,
Lin

