I need to query docs and have the results sorted by a date field.
Before using Elasticsearch, I cast my date (a long of millis since epoch) to a float and then sorted on that. I did the cast because I read that Lucene could not sort on a long. There was a loss of precision, obviously, and if the docs were too close in time the ordering might be off, which we tolerated. When we switched to Elasticsearch, we did the same thing, running a script on the timestamp saved as a float; again it worked, with some tolerable loss of precision.
Now I want to improve the precision. In a different thread you advised me to save the date as a long and then do a script on that. I find it generally works, but it also lacks precision when the dates get too close (generally under 3 minutes or so apart). This is what I did (sortField referenced a long):
response = indexClient.search(
    Requests.searchRequest(getIndexName())
        .types(documentTypes)
        .searchType(SearchType.QUERY_THEN_FETCH)
        .source(SearchSourceBuilder.searchSource()
            .query(QueryBuilders.customScoreQuery(QueryBuilders.queryString(query))
                .script("doc['" + sortField + "'].value"))
            .fields(fields)
            .from(offset)
            .size(max)
            .explain(true)))
    .actionGet();
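To illustrate the kind of loss I mean, here is a minimal standalone sketch, independent of Elasticsearch. The class and method names are made up for illustration. I am assuming that the script's value ends up as a 32-bit float (Lucene scores are floats), which may or may not be the whole story. A float has a 24-bit significand, so for present-day millisecond timestamps adjacent float values are 2^17 = 131072 ms (about 2.2 minutes) apart, which would be consistent with the "under 3 minutes" threshold I am seeing.

```java
public class FloatTimestampPrecision {

    // Hypothetical helper: mimics forcing a long timestamp into a 32-bit float.
    static float toFloatScore(long epochMillis) {
        return (float) epochMillis;
    }

    public static void main(String[] args) {
        long base = 1300000000000L;              // an arbitrary timestamp, millis since epoch
        long oneSecondLater = base + 1000L;
        long fourMinutesLater = base + 240000L;

        // Gap between adjacent float values at this magnitude, in milliseconds.
        System.out.println("float spacing (ms): " + Math.ulp(toFloatScore(base)));

        // Timestamps closer together than the spacing can collapse to the same float.
        System.out.println("1 s apart, same float?   "
                + (toFloatScore(base) == toFloatScore(oneSecondLater)));
        System.out.println("4 min apart, same float? "
                + (toFloatScore(base) == toFloatScore(fourMinutesLater)));
    }
}
```

If that assumption holds, two docs whose timestamps fall into the same float bucket get identical scores, and their relative order is no longer determined by the date.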
I also tried storing the field as a string and doing a sort on that. It worked and the precision was better, but I could not get the sort order param to work: I get the same results whether I use SortOrder.ASC or SortOrder.DESC. This is what I did (sortField referenced a string):
response = indexClient.search(
    Requests.searchRequest(getIndexName())
        .types(documentTypes)
        .searchType(SearchType.DFS_QUERY_THEN_FETCH)
        .source(SearchSourceBuilder.searchSource()
            .query(QueryBuilders.queryString(query))
            .fields(fields)
            .from(offset)
            .size(max)
            .explain(true)
            .sort(sortField, SortOrder.DESC)))
    .actionGet();
My questions:
- Is loss of precision expected when doing a script on a long? Is there anything I can do to improve precision?
- If I wind up doing a sort on a string, how can I get the sort order param to work?
- I understand a sort is slower than a script; how much worse is this expected to be?
BTW #1: I know a script will change the score and a sort will not. I'm not too worried about that.
BTW #2: I need to use the from/size params for pagination; I'm not sure whether that affects this decision.
Thanks.