I have a question about how sorting during queries works in Elasticsearch.
I have an index with a custom date format field, on which the sort is applied.
When querying the index for a given keyword, results are returned in the given
sort order.
However, I've observed that some documents are not present in the result set. I
would have expected these results to be part of the result set as it would be
in relational systems using the SQL ORDER BY statement. I've verified that
these missing documents are covered by the query using the explain API.
According to the documentation, score computation is not performed when
sorting on fields.
Maybe someone can provide more information on how sorting is done?
I am using Elasticsearch 1.0.0RC1 on Debian wheezy with openjdk7-jdk.
By default, the field cache size is unbounded and its entries do not expire.
For sorting, this means that each sort field is examined and all values of
that field are loaded, so that the sort can take place in memory. This is
exactly what Lucene does.
With the default field cache settings, sorting works correctly
(unless the field values exceed the available memory).
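
As a rough illustration of the point above, here is a toy Python model of
sorting on loaded field values. It is only a sketch: `field_cache`, `matched`
and `sort_hits` are made-up names, and this is not Elasticsearch or Lucene
code. It shows that sorting reorders the matched documents but does not drop
any of them.

```python
# Toy stand-in for a field cache: document id -> value of the sort field
# (hypothetical epoch-second timestamps, loaded for all documents).
field_cache = {1: 1390000000, 2: 1390003600, 3: 1389990000, 4: 1390007200}

# Document ids that matched the query (assumed for this example).
matched = {1, 3, 4}


def sort_hits(matched_ids, cache, reverse=False):
    """Order the matched ids by their cached field value, in memory."""
    return sorted(matched_ids, key=lambda doc_id: cache[doc_id], reverse=reverse)


hits = sort_hits(matched, field_cache, reverse=True)
print(hits)  # [4, 1, 3]
```

Every matched id is still present after sorting; only the order depends on
the field values.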
Maybe you can set up an example of your sort as a demo, so that the error
can be reproduced?
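
A demo could start from a request body like the one sketched below. The field
names `title` and `created` and the search term are placeholders, not taken
from the original index; only the general `query` / `sort` / `size` layout of
the search request is assumed.

```python
import json

# Hypothetical search request: match a keyword, sort on a date field.
search_body = {
    "query": {"match": {"title": "keyword"}},  # placeholder field and term
    "sort": [{"created": {"order": "desc"}}],  # placeholder date field
    "size": 10,                                # only the first 10 hits are returned
}

# Print the JSON that would be sent as the body of a _search request.
print(json.dumps(search_body, indent=2))
```

Posting the mapping of the date field and a few sample documents along with
such a request would make the behaviour reproducible.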
It turned out that this behaviour was caused by me, since documents contained
the wrong timestamp on the sorted field. After fixing this, the results were as
expected.
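
For readers who hit the same symptom, the sketch below shows one plausible way
a wrong timestamp can make a matching document look missing: it sorts far away
from where it is expected and falls outside the returned page. The date format,
the document values and the page size are all assumptions made for
illustration; the thread does not say what the wrong timestamps actually were.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"  # hypothetical custom date format

docs = [
    {"id": "a", "ts": "2014-01-20 10:00:00"},
    {"id": "b", "ts": "2014-01-21 09:30:00"},
    {"id": "c", "ts": "1970-01-01 00:00:00"},  # assumed wrong timestamp
    {"id": "d", "ts": "2014-01-19 08:15:00"},
]


def first_page(documents, size):
    """Sort newest first on the timestamp and return the ids of the first page."""
    ordered = sorted(
        documents,
        key=lambda d: datetime.strptime(d["ts"], FMT),
        reverse=True,
    )
    return [d["id"] for d in ordered[:size]]


# With the wrong timestamp, "c" sorts last and is cut off by the page size.
page = first_page(docs, 3)

# After correcting the timestamp, "c" appears where it was expected.
fixed = [dict(d, ts="2014-01-22 12:00:00") if d["id"] == "c" else d for d in docs]
page_fixed = first_page(fixed, 3)

print(page)        # ['b', 'a', 'd']
print(page_fixed)  # ['c', 'b', 'a']
```

The document still matches the query in both cases, which is consistent with
the explain API reporting a match while the document is absent from the page.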