Sorting strings that contain numbers

Hi guys. When sorting on a field that's a string, strings that contain
numbers aren't sorted properly.

For example, with these documents:
{ name: "Bob: 3 points" }
{ name: "Bob: 10 points" }
{ name: "Bob: 2 points" }

When ES sorts on the "name" field, the documents are returned in this order:
{ name: "Bob: 10 points" }
{ name: "Bob: 2 points" }
{ name: "Bob: 3 points" }
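
That order is what plain character-by-character string comparison gives: "1" sorts before "2", so "10" lands ahead of "2". A minimal Python sketch that reproduces it outside ES, assuming the field is compared as a raw, unanalyzed string:

```python
names = ["Bob: 3 points", "Bob: 10 points", "Bob: 2 points"]

# Plain string comparison works one character at a time, so the first digit
# ("1" < "2" < "3") decides the order before the "0" in "10" is considered.
lexicographic = sorted(names)
print(lexicographic)
```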

How can we get ES to return the documents in the following order?
{ name: "Bob: 2 points" }
{ name: "Bob: 3 points" }
{ name: "Bob: 10 points" }

Thanks,
Nick

--

IMHO you should index docs with 02 instead of 2.
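
A rough sketch of that zero-padding idea, applied to the string before it is indexed. The helper name `pad_numbers` and the `width` of 6 are assumptions made for illustration, not anything ES provides:

```python
import re


def pad_numbers(text, width=6):
    """Left-pad every run of digits in `text` with zeros up to `width` characters."""
    return re.sub(r"[0-9]+", lambda m: m.group().zfill(width), text)


names = ["Bob: 3 points", "Bob: 10 points", "Bob: 2 points"]

# Sorting on the padded form gives numeric order as long as no number
# has more than `width` digits.
by_padded_key = sorted(names, key=pad_numbers)
print(by_padded_key)
```

The padded value would probably go into a separate sort field so the displayed name stays unchanged, and the width has to be chosen up front.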

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 28 Oct 2012 at 17:48, Nick Hoffman nick@deadorange.com wrote:

Hi guys. When sorting on a field that's a string, strings that contain numbers aren't sorted properly.

--

Hi David. Prefixing numbers with zeros won't work because that assumes that
there's a constant number of digits in the number.

On Sunday, 28 October 2012 15:01:37 UTC-4, David Pilato wrote:

IMHO you should index docs with 02 instead of 2.

--

Related
http://elasticsearch-users.115913.n3.nabble.com/Sorting-a-string-field-numerically-td4024557.html#a4024561

On Sunday, October 28, 2012 12:10:52 PM UTC-7, Nick Hoffman wrote:

Hi David. Prefixing numbers with zeros won't work because that assumes
that there's a constant number of digits in the number.

--

Could you split the data into multiple fields? So have a name field "Bob,
Anne, etc" which is a string and a points field "3, 10, 2" which is a
number. Then sort both fields together, name coming first?
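
A sketch of what that could look like, assuming hypothetical `name` and `points` fields, with `name` indexed unanalyzed so it can be sorted on as a whole string. The `search_body` dict is the sort clause of a search request; the local `sorted` call only shows the ordering it is meant to produce:

```python
# Hypothetical documents with the number split out into its own numeric field.
docs = [
    {"name": "Bob", "points": 3},
    {"name": "Bob", "points": 10},
    {"name": "Anne", "points": 7},
    {"name": "Bob", "points": 2},
]

# Search request body: sort by name first, then by points.
search_body = {
    "query": {"match_all": {}},
    "sort": [
        {"name": {"order": "asc"}},
        {"points": {"order": "asc"}},
    ],
}

# The same ordering reproduced locally to show the intended result.
expected = sorted(docs, key=lambda d: (d["name"], d["points"]))
print([(d["name"], d["points"]) for d in expected])
```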

--

I'd definitely do that if I could, Chris. The strings that I'm indexing are
names of objects that can't be split, unfortunately. E.g.

Megatron (UN-04)
The Amazing Spider-Man #44
G2 Optimus Prime

This is why the sorting has to happen within ES.

On Sunday, 28 October 2012 22:40:02 UTC-4, Chris Male wrote:

Could you split the data into multiple fields? So have a name field "Bob,
Anne, etc" which is a string and a points field "3, 10, 2" which is a
number. Then sort both fields together, name coming first?

--

Hi Nick,

thanks for your inspiration. It's a great idea. I just hacked together a
plugin that can perform the desired sort by using a natural sort key in a
Lucene token filter.
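
For readers who want the general idea of a natural sort key, here is a small Python sketch. It is not the plugin's actual implementation; the encoding (prefixing each number with its digit count) and the name `natural_sort_key` are assumptions chosen for illustration, and the "#9" title is made up to have something to compare against "#44":

```python
import re


def natural_sort_key(text):
    """Build a string key whose plain lexicographic order matches natural order.

    Each run of digits is rewritten as <digit count, 2 wide><digits>, so a
    number with more digits always compares greater than a shorter one.
    """
    def encode(match):
        digits = match.group().lstrip("0") or "0"
        return f"{len(digits):02d}{digits}"

    return re.sub(r"[0-9]+", encode, text)


names = [
    "Bob: 3 points",
    "Bob: 10 points",
    "Bob: 2 points",
    "The Amazing Spider-Man #44",
    "The Amazing Spider-Man #9",
]

naturally_sorted = sorted(names, key=natural_sort_key)
print(naturally_sorted)
```

Because the key is itself a plain string, it can be stored as a term and sorted with ordinary string comparison, with no fixed padding width to pick (this sketch handles numbers of up to 99 digits).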

The README is a little short but I hope it helps.

See also the test file:
https://github.com/jprante/elasticsearch-analysis-naturalsort/blob/master/src/test/java/org/elasticsearch/index/analysis/naturalsort/NaturalSortKeyTests.java

As a bonus, a collator key sort is included (since the natural sort key
extends the collator key, you have to add a "locale" parameter to the token
filter if you want locale-sensitive sort)

Cheers,

Jörg

On Monday, October 29, 2012 3:54:01 AM UTC+1, Nick Hoffman wrote:

I'd definitely do that if I could, Chris. The strings that I'm indexing
are names of objects that can't be split, unfortunately.

--

Jörg,

Will your plugin work with ES version 1.3.4?

--

Yes, there is a version for ES 1.3.4

Jörg

On Fri, Jan 23, 2015 at 7:13 PM, bbehling brian.behling@gmail.com wrote:

Jörg,

Will your plugin work with ES version 1.3.4?

--
View this message in context:
http://elasticsearch-users.115913.n3.nabble.com/Sorting-strings-that-contain-numbers-tp4024553p4069501.html
Sent from the Elasticsearch Users mailing list archive at Nabble.com.

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/1422036835206-4069501.post%40n3.nabble.com
.
For more options, visit https://groups.google.com/d/optout.

--

Hi,
Are you planning to release a version for ES 5.0?