I am planning to use Elasticsearch for further filtering and boosting of recommendations generated by a system.
So if I pass a list of ids in a Terms query against an index whose documents contain those ids, in order to select metadata from Elasticsearch (with no extra parameters), would the results come back in the original order of the ids passed in the Terms query?
Obviously in the next stage I would like to use other queries to filter (and possibly boost) items.
Almost certainly not, for many reasons. One clear one to me: as updates stream through a distributed, replicated system, some shards might, for whatever reason, get documents committed faster than others, and replicas within a shard might also behave differently. Elasticsearch shouldn't really be thought of as a distributed, consistent transactional system. So when you stream results back, they will likely be out of order depending on which replica serves the search. Further under the hood, on disk, segments are being merged and collated. There may be read-time optimizations that cause one segment on one replica to be favored over another segment at search time, streaming its contents back "first."
Why do you need data in id order? Why not add a timestamp and sort on that timestamp?
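
If you do need the recommender's order back, one option is to restore it on the client after the search returns. Below is a minimal sketch of that idea in Python, not a definitive implementation. It assumes each document stores its id in a field called item_id (a hypothetical name; substitute your own) and that hits arrive as dicts shaped like the entries of hits.hits in a search response. The helper names build_terms_query and reorder_hits are made up for illustration.

```python
def build_terms_query(ids, field="item_id"):
    """Build a search body that selects documents whose `field` is in `ids`.

    `field` is an assumed field name. `size` is set so that every requested
    id can come back in a single response.
    """
    return {"size": len(ids), "query": {"terms": {field: list(ids)}}}


def reorder_hits(ids, hits, field="item_id"):
    """Return `hits` rearranged to follow the order of `ids`.

    Ids with no matching hit (for example, filtered out by a later query)
    are skipped rather than raising an error.
    """
    by_id = {hit["_source"][field]: hit for hit in hits}
    return [by_id[i] for i in ids if i in by_id]


# Example: the recommender ranked "c" first, but the hits came back in
# some other order, and "d" was not found at all.
recommended = ["c", "a", "d", "b"]
hits = [
    {"_id": "1", "_source": {"item_id": "a", "title": "Item A"}},
    {"_id": "2", "_source": {"item_id": "b", "title": "Item B"}},
    {"_id": "3", "_source": {"item_id": "c", "title": "Item C"}},
]
body = build_terms_query(recommended)
ordered = reorder_hits(recommended, hits)
ordered_ids = [hit["_source"]["item_id"] for hit in ordered]
```

This keeps Elasticsearch responsible only for filtering and fetching metadata. Once you start boosting, you would presumably want to sort by score instead, so the reordering step only makes sense while the recommender's ranking is the one you want to keep.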