Filter with millions of records

Can someone explain how a term filter, or any other filter, works with 100 K or millions of values in the filter? How will it perform, and how does it work?

Anyone?

My guess is that it will be very slow and perform badly, but I would recommend benchmarking to find out what the limit is for your setup.

We have terms conditions where we pass 100 K or 200 K values; performance is very bad and most of the time it does not work.

How can we change our approach to make it work?

That does not surprise me. What is your use case? Why are you sending so many terms? What are you trying to achieve? Which version of Elasticsearch are you on?

I suspect you will need to change how you work with Elasticsearch, as I cannot see any tuning that will make the current approach perform well.

Hi Christian,

Currently we are using 2.4.0, but we are also moving to 6.2.4.
Basically, we show charts and tables that need data from many indices, so we first fetch IDs from one index and then pass them in a query to get results from another index.
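
For readers following along, the pattern described here can be sketched roughly as below. This is only an illustration, not the poster's actual code: the field name `customer_id`, the batch size, and the helper names are assumptions. It builds `terms` filter bodies in batches; batching spreads the cost over several requests but does not remove it.

```python
from typing import Dict, Iterator, List


def chunked(ids: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `ids`, each at most `size` long."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def terms_filter_bodies(ids: List[str], field: str, size: int) -> List[Dict]:
    """Build one search body per batch of IDs, each using a `terms` filter."""
    return [
        {"query": {"bool": {"filter": {"terms": {field: batch}}}}}
        for batch in chunked(ids, size)
    ]


# Hypothetical IDs standing in for the result of the first query.
ids = ["id-%d" % i for i in range(200000)]

# `customer_id` and the batch size of 10000 are assumptions for illustration.
bodies = terms_filter_bodies(ids, field="customer_id", size=10000)
print(len(bodies))  # 20
```

Each body would still have to be sent as a separate search request and the results merged on the client, which is part of why this approach scales poorly.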

So you are doing "joins" basically at search time...
There is no miracle to expect here.

If you can do it, do the join at index time and you'll pay the price only once.

My 2 cents
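
To illustrate the index-time join suggested above, here is a minimal sketch in Python. The document shapes and the field names (`customer_id`, `region`, `customer_region`) are hypothetical, since the thread does not show the real data model. The idea is to copy the attributes you would otherwise filter on into each document before indexing it, so that a single query can filter directly without a long ID list.

```python
from typing import Dict, List


def denormalize(orders: List[Dict], customers: List[Dict]) -> List[Dict]:
    """Copy the customer's region onto each order before it is indexed."""
    by_id = {c["customer_id"]: c for c in customers}
    enriched = []
    for order in orders:
        customer = by_id.get(order["customer_id"], {})
        doc = dict(order)  # shallow copy so the input is not modified
        doc["customer_region"] = customer.get("region")
        enriched.append(doc)
    return enriched


# Hypothetical source data from two separate indices.
customers = [
    {"customer_id": "c1", "region": "EU"},
    {"customer_id": "c2", "region": "US"},
]
orders = [
    {"order_id": "o1", "customer_id": "c1", "amount": 10},
    {"order_id": "o2", "customer_id": "c2", "amount": 25},
    {"order_id": "o3", "customer_id": "c1", "amount": 7},
]

docs = denormalize(orders, customers)

# With the region stored on every order, one small filter replaces
# the two-step lookup and the large list of IDs.
query = {"query": {"bool": {"filter": {"term": {"customer_region": "EU"}}}}}
```

The trade-off is that when a copied attribute changes, the affected documents have to be reindexed, so this fits best when the condition depends on attributes that change rarely.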


We are working with a runtime approach. The data changes every time you load it again, so how would we save it at indexing time?

abcd

David, any suggestions for an approach that would not perform badly?

@dadoonet @Christian_Dahlqvist Would it be possible for you to do a short session with us, so we can show you what we are doing and you can suggest a solution based on that?

David, we are not doing a join. We pull some IDs from one index on the basis of some condition, then pass the retrieved IDs to another index in a terms query.

I do not do private sessions. Please post the relevant information here so the rest of the community can benefit.

If you are not comfortable sharing details in public, Elastic offers support subscriptions as well as consulting.

This is exactly what a join is IMO.


If you can explain your data model and how you are updating and querying it, we might be able to provide some suggestions. It would also be useful if you could provide some guidance on data volumes and cardinality between entities modelled.

abcd

Yes Please

What do your documents look like? How are you storing and linking the IDs?

abcd

The charts work on the basis of IDs.

The main problem is that if we fetch details for 4 to 5 lakh (400,000 to 500,000) records without any terms filter, the query takes too much time.
If we also use a terms filter for the IDs, it also takes a long time.
Please suggest something :neutral_face: