Problem statement:
We have millions of ids stored in a separate id store. Our application generates these ids, and they are the same as the ES doc ids. For certain use cases, we want to include or exclude these ids in a search query. I looked into the "ids" query but am not sure about its limitations. Also, is there a plugin that lets Elasticsearch interact with another data store directly and fetch the ids during query execution?
Large lists of ids will always be a challenge.
You may be interested in a script and some benchmarking here that show speed-ups under certain conditions.
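
For reference, the include/exclude case from the question can be written as a `bool` query that wraps two "ids" queries. This is a minimal sketch, not the script mentioned above: the helper name `build_ids_filter_query` is made up for illustration, the ids are assumed to be already fetched from the external store by the application, and it does nothing about the cost of sending very large lists.

```python
import json


def build_ids_filter_query(include_ids=None, exclude_ids=None):
    """Build a search body that includes and/or excludes documents by _id.

    Hypothetical helper: the ids are assumed to come from the external
    id store and to match the Elasticsearch document ids.
    """
    bool_clause = {}
    if include_ids:
        # "filter" restricts results to these ids without affecting scoring.
        bool_clause["filter"] = [{"ids": {"values": list(include_ids)}}]
    if exclude_ids:
        # "must_not" drops any document whose _id is in this list.
        bool_clause["must_not"] = [{"ids": {"values": list(exclude_ids)}}]
    if not bool_clause:
        # Nothing to include or exclude: match every document.
        return {"query": {"match_all": {}}}
    return {"query": {"bool": bool_clause}}


if __name__ == "__main__":
    body = build_ids_filter_query(include_ids=["1", "2", "3"], exclude_ids=["2"])
    print(json.dumps(body, indent=2))
```

The resulting body can be sent to the `_search` endpoint with whichever client you use. With millions of ids the request itself becomes the bottleneck, which is the challenge described above.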