How to do custom sorting


(Iti) #1

Hi,
I want to do custom sorting for some of the fields. I am having some issues with it.

In my Elasticsearch plugin, I have registered a CustomSortFieldBuilder, which builds a CustomSortFieldComparatorSource. The CustomSortFieldComparatorSource is in charge of instantiating the CustomSortFieldComparator, which does the sorting we need.

The elasticsearch plugin registers the builder this way:

@Override
public List<NamedWriteableRegistry.Entry> getNamedWriteables() {
    return Collections.singletonList(
        new NamedWriteableRegistry.Entry(SortBuilder.class, CustomSortFieldBuilder.NAME, CustomSortFieldBuilder::new)
    );
}

In the client code, my requests then use the CustomSortFieldBuilder:

new SearchRequest(TEST_OBJECT_INDEX).source().query(new MatchAllQueryBuilder()).sort(new CustomSortFieldBuilder("name.keyword"));

Requests are being sent fine. The problem occurs when the results in the response object are merged and sorted from multiple nodes.

During marshalling and unmarshalling of the response object, we are losing information about the comparator source.

org.elasticsearch.common.lucene.Lucene.writeTopDocs(StreamOutput out, TopDocs topDocs)

if (sortField.getComparatorSource() != null) {
    IndexFieldData.XFieldComparatorSource comparatorSource = (IndexFieldData.XFieldComparatorSource) sortField.getComparatorSource();
    writeSortType(out, comparatorSource.reducedType());
    writeMissingValue(out, comparatorSource.missingValue(sortField.getReverse()));
}

Notice that the stream never records which comparator source to use when the response object is marshalled; only the reduced sort type and the missing value are written. So, when the response object is unmarshalled, Elasticsearch no longer knows to use the custom comparator, and the documents are merged and sorted using one of the default comparators.
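
To show the symptom in isolation, here is a toy model in plain Java. It is not Elasticsearch code: the class name MergeSketch and the sample values are made up, and String.CASE_INSENSITIVE_ORDER merely stands in for the custom comparator. Each "shard" list is already sorted with the custom order; merging them with a different (default) order gives a result that is sorted under neither.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy model only: two pre-sorted "shard" results merged on a "coordinating node".
final class MergeSketch {

    // Standard two-way merge; it is only correct if both inputs are sorted by `order`.
    static List<String> merge(List<String> left, List<String> right, Comparator<String> order) {
        List<String> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < left.size() && j < right.size()) {
            if (order.compare(left.get(i), right.get(j)) <= 0) {
                out.add(left.get(i++));
            } else {
                out.add(right.get(j++));
            }
        }
        while (i < left.size()) out.add(left.get(i++));
        while (j < right.size()) out.add(right.get(j++));
        return out;
    }

    public static void main(String[] args) {
        // Each shard sorted its hits with the "custom" (case-insensitive) order.
        List<String> shardA = Arrays.asList("apple", "Cherry");
        List<String> shardB = Arrays.asList("Banana", "date");

        // Merging with the same order the shards used keeps the result sorted.
        System.out.println(merge(shardA, shardB, String.CASE_INSENSITIVE_ORDER));
        // prints [apple, Banana, Cherry, date]

        // Merging with a default order (plain String comparison) does not.
        System.out.println(merge(shardA, shardB, Comparator.<String>naturalOrder()));
        // prints [Banana, apple, Cherry, date]
    }
}
```

If the reading of writeTopDocs above is right, the second call is roughly what happens to my results once the comparator source is dropped.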

Any idea what I am missing here?


(Nik Everett) #2

From what I can see, sorts aren't a "real" extension point. getNamedWriteables is useful if your plugin needs to make totally new kinds of NamedWriteable, but it isn't a general-purpose extension point. In particular, when you extend something like sorts, you need to make sure both that the NamedWriteable is registered and that the parsing works. Right now the parsing of sorts is entirely static, starting at SortBuilder.fromXContent, so you can't extend it.

I'm in the process of working on a cleanup that should make it easier to support custom sorts. But it won't make it until at least 5.2, maybe 5.3 depending on how much work I have to do before I can get it in. And that doesn't count any work to make sorts actually extensible. So it might make sense to talk about why you need a custom sort. Maybe it makes sense to use script based sorting instead?


(Iti) #3

Hi Nik,
Thanks for responding.
We are looking to sort in natural order. We were using Lucene before switching over to Elasticsearch 5.0. In Lucene, we were simply providing our own comparator to Lucene to do the natural sorting. I wanted to apply the same concept in Elasticsearch.
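
For context, this is roughly the kind of comparator I mean. It is a minimal sketch, assuming "natural order" means that runs of ASCII digits compare by numeric value (so file2 sorts before file10) and everything else compares character by character. The class name NaturalOrderComparator is only illustrative, and this is plain Java, not something that plugs into Elasticsearch as-is.

```java
import java.util.Comparator;

// Sketch of a natural-order comparator: digit runs compare numerically,
// all other characters compare by their char value.
final class NaturalOrderComparator implements Comparator<String> {

    @Override
    public int compare(String a, String b) {
        int i = 0, j = 0;
        while (i < a.length() && j < b.length()) {
            char ca = a.charAt(i), cb = b.charAt(j);
            if (isDigit(ca) && isDigit(cb)) {
                // Consume the whole digit run on both sides and compare the numbers.
                int startA = i, startB = j;
                while (i < a.length() && isDigit(a.charAt(i))) i++;
                while (j < b.length() && isDigit(b.charAt(j))) j++;
                int cmp = compareDigitRuns(a.substring(startA, i), b.substring(startB, j));
                if (cmp != 0) return cmp;
            } else {
                if (ca != cb) return Character.compare(ca, cb);
                i++;
                j++;
            }
        }
        // The shorter string wins; fall back to plain comparison so that
        // "a01" and "a1" still get a stable relative order.
        int rest = Integer.compare(a.length() - i, b.length() - j);
        return rest != 0 ? rest : a.compareTo(b);
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Compares two digit strings by value without parsing, so long runs cannot overflow.
    private static int compareDigitRuns(String x, String y) {
        String sx = stripLeadingZeros(x), sy = stripLeadingZeros(y);
        if (sx.length() != sy.length()) return Integer.compare(sx.length(), sy.length());
        return sx.compareTo(sy);
    }

    private static String stripLeadingZeros(String s) {
        int k = 0;
        while (k < s.length() - 1 && s.charAt(k) == '0') k++;
        return s.substring(k);
    }
}
```

Sorting the values file10, file2, file1 with it gives file1, file2, file10, whereas a plain string sort gives file1, file10, file2.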

I could use script based sorting. However, could you please tell me the performance impact of script sorting? Another way is to index another field just for sorting purposes.
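
To make the extra-field idea concrete, here is a minimal sketch of how such a sort key could be computed on the client before indexing. Everything in it is an assumption on my side: the class name NaturalSortKey, the fixed pad width of 10, and the idea of storing the result in a separate keyword field. It lowercases the value and left-pads every run of ASCII digits with zeros, so that an ordinary lexicographic sort on the key behaves like a case-insensitive natural sort.

```java
import java.util.Locale;

// Sketch: derive a sortable key so a plain string sort yields
// case-insensitive natural order. Computed client-side before indexing.
final class NaturalSortKey {

    // Assumed maximum digit-run length; longer runs are left unpadded
    // and would therefore sort incorrectly.
    static final int WIDTH = 10;

    static String of(String value) {
        String lower = value.toLowerCase(Locale.ROOT);
        StringBuilder key = new StringBuilder();
        int i = 0;
        while (i < lower.length()) {
            char c = lower.charAt(i);
            if (c >= '0' && c <= '9') {
                // Collect the digit run and left-pad it with zeros to WIDTH.
                int start = i;
                while (i < lower.length() && lower.charAt(i) >= '0' && lower.charAt(i) <= '9') i++;
                String digits = lower.substring(start, i);
                for (int pad = digits.length(); pad < WIDTH; pad++) key.append('0');
                key.append(digits);
            } else {
                key.append(c);
                i++;
            }
        }
        return key.toString();
    }
}
```

For example, NaturalSortKey.of("File2") yields file0000000002 and NaturalSortKey.of("file10") yields file0000000010, so the first sorts before the second with a default sort. The cost moves to index time and extra storage instead of query time.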

Another case is case-insensitive sorting. What is the performance impact of running the following script sort?

GET /my_index/user/_search
{
  "sort" : {
    "_script" : {
      "type" : "string",
      "script" : {
        "lang": "groovy",
        "inline": "doc['name.keyword'].toString().toLowerCase()",
        "params" : {
          "factor" : 1.1
        }
      },
      "order" : "asc"
    }
  }
}

