How to script sort on a not_analyzed field containing an array in ES 0.20.6

My attempts to use a script to sort on a field containing an array of
not_analyzed strings do not behave as I'd expect. Elasticsearch appears to
sort on whichever value in the array sorts first, rather than on the array
values in their original order. Is that expected behavior?

Here is the _mapping for my title field. I don't think the multi_field
aspect is relevant, but I'm leaving it in for completeness.

            "title" : {
              "type" : "multi_field",
              "fields" : {
                "title" : {
                  "type" : "string"
                },
                "raw" : {
                  "type" : "string",
                  "index" : "not_analyzed",
                  "omit_norms" : true,
                  "index_options" : "docs",
                  "include_in_all" : false
                }
              }
            }

My sort script, which just serializes all the array values into one string:

'script' => "s='';foreach(val : doc['title.raw'].values) {s += val + ' '} s",

Example source data:

{
    "_id": "1",
    "title": ["Y", "B"]
},
{
    "_id": "2",
    "title": ["Z", "A"]
}

With that source data and that script sort, I would expect ES to return doc
1 first because Y comes before Z. However, it returns doc 2 first because A
comes before B. (I'm guessing ES is storing the title array values already
sorted in ascending order.)
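
To make the difference concrete, here is a small Python sketch that mimics
the sort script outside of Elasticsearch. It is only a model of my guess
above (that the values reach the script already sorted), not a description
of ES internals. The function names and the presorted flag are made up for
this illustration.

```python
# Model of the sort script: concatenate each value followed by a space.
# `presorted` is a hypothetical switch that simulates the guess that the
# script sees the array values in ascending order instead of source order.
def script_sort_key(values, presorted):
    seen = sorted(values) if presorted else list(values)
    return "".join(v + " " for v in seen)


# The two example documents, keyed by _id.
docs = {"1": ["Y", "B"], "2": ["Z", "A"]}


def order(presorted):
    # Sort document ids ascending by the key the script would produce.
    return sorted(docs, key=lambda doc_id: script_sort_key(docs[doc_id], presorted))


expected_order = order(presorted=False)  # keys are "Y B " and "Z A "
observed_order = order(presorted=True)   # keys are "B Y " and "A Z "
```

With source order preserved, doc 1 sorts first. With the values sorted
before concatenation, doc 2 sorts first, which matches what I am seeing.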

Is there any way to make this approach work? If not, I'm thinking about
using the river script (the river is how ES gets all its data in our
architecture) to populate a new field with the first element of the title
array.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

I would use a new field and place the burden of constructing a sortable
string at index time. Your index will grow bigger, but query time
performance will be improved.
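
A minimal sketch of that idea in Python, assuming the documents can be
enriched before they are indexed. The field name title_sort and the helper
add_sort_field are hypothetical. The new field would also need a
not_analyzed mapping so that it sorts as a single term, and a river script
would have to express the same logic in its own scripting language.

```python
# Build a sortable string at index time from the title values in their
# original order. "title_sort" is a hypothetical field name.
def add_sort_field(doc, source_field="title", sort_field="title_sort"):
    values = doc.get(source_field) or []
    if isinstance(values, str):
        # Tolerate a single string as well as an array of strings.
        values = [values]
    enriched = dict(doc)  # leave the caller's document untouched
    enriched[sort_field] = " ".join(values)
    return enriched


docs = [
    {"_id": "1", "title": ["Y", "B"]},
    {"_id": "2", "title": ["Z", "A"]},
]
enriched = [add_sort_field(d) for d in docs]

# Sorting on the precomputed field gives the order asked for in the question.
ordered_ids = [d["_id"] for d in sorted(enriched, key=lambda d: d["title_sort"])]
```

If only the first title matters, storing values[0] instead of the joined
string would keep the extra field smaller.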

--
Ivan

On Wed, Jul 3, 2013 at 5:23 PM, Brian Gadoury bgadoury@endpoint.com wrote:

