I have been trying to use Elasticsearch for some simple filtering: for example, for an index with 100K records, return all records matching a criterion (using as
When the number of matching records goes above 1K, CPU usage goes completely bonkers and the search becomes very expensive. I do not return any field data; only the ID is enough for me, yet it is still expensive.
I cannot use pagination, since I need all of the matching data in order to process and re-order it further.
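
To make the request shape concrete, here is a minimal sketch of an ID-only filter request of the kind described. It is an illustration, not the exact query: the field name `status`, the value `active`, the 5000 cap, and the helper `build_id_only_filter` are all hypothetical. It only builds the JSON body and does not talk to a cluster.

```python
import json


def build_id_only_filter(field, value, max_hits):
    """Build a search body that filters on one field and returns only hit IDs.

    `field`, `value` and `max_hits` are placeholders for illustration.
    """
    return {
        # Filter context: no relevance scoring, just match / no match.
        "query": {"bool": {"filter": [{"term": {field: value}}]}},
        # Do not return the document source; each hit still carries its _id.
        "_source": False,
        # Ask for up to max_hits results in one response (no pagination).
        "size": max_hits,
    }


# Hypothetical field, value and cap, chosen only for the example.
body = build_id_only_filter("status", "active", 5000)
print(json.dumps(body, indent=2))
```

Note that `size` is bounded by the index setting `index.max_result_window`, which defaults to 10000, so a single unpaginated request cannot return more hits than that without changing the setting.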
Doing something similar in SQL, for example, is trivial. I know the two are not directly comparable, but I would like to understand whether I am doing something completely wrong or whether this is essentially a limitation of Lucene-based indices.
Document size: ~ 4-20KB
Cluster: 3x beefy machines with 8 cores and 56GB RAM and striped SSDs.