Filter with millions of records

(Deepak Chaudhary) #1

Can someone explain how a term filter, or any other filter, works when the filter itself contains 100 K or even millions of values? How will it perform, and how does it work?

What is the maximum query length in the request body for the REST API of Elasticsearch?

(Christian Dahlqvist) #3

My guess is that it will be very slow and perform badly, but I would recommend benchmarking to find out what the limit is for your setup.

(Deepak Chaudhary) #4

We have term conditions where we pass 100 K to 200 K values. Performance is very bad, and most of the time the query does not work at all.

How can we change our approach to make it work?

(Christian Dahlqvist) #5

That does not surprise me. What is your use case? Why are you sending so many terms? What are you trying to achieve? Which version of Elasticsearch are you on?

I suspect you will need to change how you work with Elasticsearch, as I cannot see any tuning that will make the current approach perform well.

(Puroo Jain) #6

Hi Christian,

Currently we are using 2.4.0, but we are also moving to 6.2.4.
Basically, we show charts and tables that need data from many indices. We first fetch the matching IDs from one index, then pass those IDs in a query against another index to get the result.
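
For illustration, the two-step pattern described here could be sketched roughly as below. This is only a sketch under assumptions: the field name `parent_id`, the helper names, and the batch size of 10,000 are all hypothetical and not taken from the poster's setup. It only builds request bodies (a `terms` clause inside a `bool` filter) in batches. Sending them and merging the responses, including any aggregations needed for the charts, is left to the caller.

```python
from typing import Dict, Iterator, List


def chunk_ids(ids: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of ids, each at most batch_size long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(ids), batch_size):
        yield ids[start:start + batch_size]


def build_terms_query(field: str, ids: List[str]) -> Dict:
    """Build a request body that filters on one batch of IDs.

    The terms clause sits in bool/filter, so it is not scored.
    """
    return {"query": {"bool": {"filter": [{"terms": {field: ids}}]}}}


def build_batched_queries(
    field: str, ids: List[str], batch_size: int = 10_000  # assumed batch size
) -> List[Dict]:
    """Return one request body per batch of IDs from the first index."""
    return [build_terms_query(field, batch) for batch in chunk_ids(ids, batch_size)]


# Hypothetical usage: IDs fetched from the first index, queried in the second.
ids_from_first_index = [str(i) for i in range(25)]
bodies = build_batched_queries("parent_id", ids_from_first_index, batch_size=10)
print(len(bodies))
```

Batching keeps each request small, but the total work still grows with the number of IDs, so it does not remove the underlying cost of joining at search time.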

(David Pilato) #8

So you are doing "joins" basically at search time...
There is no miracle to expect here.

If you can do it, do the join at index time and you'll pay the price only once.
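
As a rough sketch of what a join at index time could look like: copy the fields you would otherwise filter on from the first index onto each document of the second index before indexing it. The names here (`parent_id`, `status`, the `parent_` prefix, and both helper functions) are assumptions made for illustration, not anything taken from this thread's data model.

```python
from typing import Dict, Iterable, List, Sequence


def denormalize(
    children: Iterable[Dict],
    parents_by_id: Dict[str, Dict],
    parent_key: str = "parent_id",            # hypothetical linking field
    copy_fields: Sequence[str] = ("status",),  # hypothetical fields to copy
) -> List[Dict]:
    """Return copies of the child documents enriched with parent fields.

    Each copied field is stored under a "parent_" prefix so it cannot
    clash with the child's own fields. Children whose parent is unknown
    are returned unchanged.
    """
    enriched = []
    for child in children:
        doc = dict(child)  # shallow copy; the input is not mutated
        parent = parents_by_id.get(doc.get(parent_key))
        if parent is not None:
            for field in copy_fields:
                if field in parent:
                    doc["parent_" + field] = parent[field]
        enriched.append(doc)
    return enriched


def build_direct_query(status: str) -> Dict:
    """Filter on the copied field instead of sending a list of IDs."""
    return {"query": {"bool": {"filter": [{"term": {"parent_status": status}}]}}}
```

With documents indexed this way, the second query filters on `parent_status` directly, so no ID list has to be sent at search time. The trade-off is that the copied fields must be re-indexed whenever the source record changes.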

My 2 cents

(Puroo Jain) #9

We are working with a runtime approach: the data changes every time it is loaded, so how would we save it at index time?

(Puroo Jain) #11

David, do you have any suggestions for making this perform better?

(Deepak Chaudhary) #12

@dadoonet @Christian_Dahlqvist Would it be possible for you to do a short session with us? We could show you what we are doing, and you could suggest a solution based on that.

(Deepak Chaudhary) #13

David, we are not doing a join. We pull some IDs from one index on the basis of some condition, then pass the retrieved IDs to another index in a terms filter.

(Christian Dahlqvist) #14

I do not do private sessions. Please post the relevant information here so the rest of the community can benefit.

If you are not comfortable sharing details in public, Elastic offers support subscriptions as well as consulting.

(David Pilato) #15

This is exactly what a join is IMO.

(Christian Dahlqvist) #16

If you can explain your data model and how you are updating and querying it, we might be able to provide some suggestions. It would also be useful if you could provide some guidance on data volumes and cardinality between entities modelled.

(Puroo Jain) #18

Yes Please

(Christian Dahlqvist) #19

What do your documents look like? How are you storing and linking the IDs?

(Puroo Jain) #21

The charts work on the basis of IDs.

The main problem is that if we fetch details for 4 to 5 lakh (400,000 to 500,000) records without any terms filter, the query takes too much time.
If we also use terms filters for the IDs, it still takes a long time.
Please suggest something :neutral_face: