Querying ElasticSearch to order empty strings last

I am using Django, Haystack, and ElasticSearch. I want to order my search
results so that results where the ordered field value is empty ("") come
after results where it is not empty. I cannot find an API in Haystack that
can do this. The request sent to ElasticSearch looks like:

{
   "sort":[
      {
         "version":{
            "order":"asc"
         }
      }
   ],
   "query":{
      ...
   }
}

Is there a way to rewrite this ElasticSearch query so that results with an
empty string for "version" will come after results where "version" exists?

I have implemented this in Python as:

sorted(sqs, key=lambda x: getattr(x, 'version') == '')

--

Hello Kevin,

Here's something that should work, but it's going to be slow:

"sort": {
    "_script": {
        "script": "if (_source.version == \"\") { \"zzz\" } else {

_source.version }",
"type": "string",
"order": "asc"
}
}

If you want it fast, I think you should look at indexing your versions as
numbers. Then you can use "missing" to put docs without a "version" field
either at the beginning or at the end:

In case your versioning format can't be easily indexed as numbers (because
of multiple dots), you can index parts in different fields. And then you
can sort on all of them by specifying an array there.

Best regards,
Radu

http://sematext.com/ -- Elasticsearch -- Solr -- Lucene

On Sat, Dec 15, 2012 at 10:43 PM, Kevin Tran hekevintran@gmail.com wrote:

I am using Django, Haystack, and Elasticsearch. I want to order my search
results so that results where the ordered field value is empty ("") come
after results where it is not empty. I cannot find an API in Haystack that
can do this. The request sent to Elasticsearch looks like:

{
   "sort":[
      {
         "version":{
            "order":"asc"
         }
      }
   ],
   "query":{
      ...
   }
}

Is there a way to rewrite this Elasticsearch query so that results with an
empty string for "version" will come after results where "version" exists?

I have implemented this in Python as:

sorted(sqs, key=lambda x: getattr(x, 'version') == '')

--

--

Hi Kevin,

Here is a faster version of the query:

{
"query": {
"custom_filters_score" : {
"query" : {
"constant_score": {
"query": {
.... your original query ....

                }
            }
        },
        "filters" : [
            {
                "filter" : { "missing" : { "field" :  "version"} },
                "boost" : "2"
            }
        ]
    }        
},
"sort": [
    {
        "_score": {"order":"asc"}
    },
    {
        "version": {"order":"asc"}
    }
]

}

It basically assigns _score of 1.0 to all records with non-empty version
and _score of 2.0 to all records with empty version. Then it sorts by
_score in ascending order and then by version in ascending order. As a
result, all records with empty version are pushed to the bottom of the
list.

Igor

On Monday, December 17, 2012 4:08:10 AM UTC-8, Radu Gheorghe wrote:

Hello Kevin,

Here's something that should work, but it's going to be slow:

"sort": {
    "_script": {
        "script": "if (_source.version == \"\") { \"zzz\" } else { 

_source.version }",
"type": "string",
"order": "asc"
}
}

If you want it fast, I think you should look at indexing your versions as
numbers. Then you can use "missing" to put docs without a "version" field
either at the beginning or at the end:
Elasticsearch Platform — Find real-time answers at scale | Elastic

In case your versioning format can't be easily indexed as numbers (because
of multiple dots), you can index parts in different fields. And then you
can sort on all of them by specifying an array there.

Best regards,
Radu

http://sematext.com/ -- Elasticsearch -- Solr -- Lucene

On Sat, Dec 15, 2012 at 10:43 PM, Kevin Tran <hekev...@gmail.com<javascript:>

wrote:

I am using Django, Haystack, and Elasticsearch. I want to order my search
results so that results where the ordered field value is empty ("") come
after results where it is not empty. I cannot find an API in Haystack that
can do this. The request sent to Elasticsearch looks like:

{
   "sort":[
      {
         "version":{
            "order":"asc"
         }
      }
   ],
   "query":{
      ...
   }
}

Is there a way to rewrite this Elasticsearch query so that results with
an empty string for "version" will come after results where "version"
exists?

I have implemented this in Python as:

sorted(sqs, key=lambda x: getattr(x, 'version') == '')

--

--