Sorting and range filtering semantic versions

I am trying to work out an indexing scheme that lets me run range filters
on semantic versions (http://semver.org/). Values look like these:

"1.0.2.5", "1.10.2.5", "2.3.434.1"

I know that I can add a separate field with the numbers padded out, but I
was hoping to have a single field where I could do things like this:

"version:>1.0" "version:1.0.2.5" "version:1.0" "version:[1.0 TO 2.0]"

I have created some pattern capture filters to allow querying partial
version numbers. I even created some pattern replacement filters to pad the
values out so that they could be lexicographically sorted, but those
filters only control the tokens that are indexed and not the value that is
used for sorting and range filters.
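
For illustration, the padding and partial-version ideas described above might look roughly like this outside Elasticsearch. This is only a Python sketch, not the actual filter configuration: the function names `pad_version` and `prefix_tokens` and the fixed width of 5 digits are assumptions made for the example.

```python
from typing import List


def pad_version(version: str, width: int = 5) -> str:
    """Zero-pad every numeric component so string order matches numeric order.

    The width of 5 is an assumption; components longer than that are rejected
    because they would break the ordering.
    """
    parts = version.split(".")
    for part in parts:
        if not part.isdigit():
            raise ValueError(f"non-numeric component: {part!r}")
        if len(part) > width:
            raise ValueError(f"component {part!r} exceeds width {width}")
    return ".".join(part.zfill(width) for part in parts)


def prefix_tokens(version: str) -> List[str]:
    """Emit every leading prefix so that a partial version can match."""
    parts = version.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


versions = ["1.10.2.5", "2.3.434.1", "1.0.2.5"]

# Sorting by the padded form gives numeric ordering of the components.
ordered = sorted(versions, key=pad_version)

# A range like [1.0 TO 2.0] becomes a plain string comparison on padded values.
low, high = pad_version("1.0"), pad_version("2.0")
matches = [v for v in versions if low <= pad_version(v) <= high]

print(ordered)
print(matches)
print(prefix_tokens("1.0.2.5"))
```

The padded string is what would need to be used for sorting and range comparison, while the prefix tokens only serve term matching on partial versions.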

Is there a way to customize the value that is used for sorting and range
filters? It seems to just use the original value, and I don't appear to
have any control over it.

Any help would be greatly appreciated!

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/6a3535da-76d8-4dff-b2e6-114ea83cd639%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Range and sort are two different challenges.

There is one solution for sort using a custom analyzer. You can create
binary sort keys for natural sort.

Use

and try this example

PUT /test/
{
  "index": {
    "analysis": {
      "analyzer": {
        "naturalsort": {
          "tokenizer": "keyword",
          "filter": "naturalsort"
        }
      }
    }
  }
}

POST /test/docs/_mapping
{
  "properties": {
    "version": {
      "type": "string",
      "analyzer": "naturalsort"
    }
  }
}

PUT /test/docs/1
{
  "version": "1.0.2.5"
}

PUT /test/docs/2
{
  "version": "1.10.2.5"
}

PUT /test/docs/3
{
  "version": "2.3.434.1"
}

GET /test/_refresh

POST /test/docs/_search
{
  "query": {
    "match_all": {}
  },
  "sort": {
    "version": { "order": "desc" }
  }
}

Result is "2.3.434.1", "1.10.2.5", "1.0.2.5"
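
To show why a binary sort key yields this order, here is a rough Python sketch of one possible encoding. It is not the analyzer's actual implementation: the name `natural_sort_key` and the scheme of writing each digit run as a length byte followed by its digits are assumptions chosen for illustration.

```python
import re

# Split the input into alternating runs of ASCII digits and non-digits.
_RUNS = re.compile(r"[0-9]+|[^0-9]+")


def natural_sort_key(text: str) -> bytes:
    """Build a byte string whose bytewise order follows natural order.

    Each digit run is written as one length byte plus its digits without
    leading zeros, so a longer (larger) number always compares greater.
    """
    key = bytearray()
    for run in _RUNS.findall(text):
        if "0" <= run[0] <= "9":
            digits = run.lstrip("0") or "0"
            if len(digits) > 255:
                raise ValueError("digit run too long for a one-byte length")
            key.append(len(digits))
            key.extend(digits.encode("ascii"))
        else:
            key.extend(run.encode("utf-8"))
    return bytes(key)


versions = ["1.0.2.5", "1.10.2.5", "2.3.434.1"]
descending = sorted(versions, key=natural_sort_key, reverse=True)
print(descending)
```

With the three documents indexed above, sorting on these keys in descending order reproduces the order returned by the search.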

A term range query does not work with a custom analyzer alone. The ES field
mapper would have to be extended with a new field type that introduces a
term comparator for the special sort keys.
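
As a sketch of what such a comparator would have to compute, the following Python compares versions component by component and applies it to a range check. The names `compare_versions` and `in_range` are hypothetical and not part of any Elasticsearch or Lucene API, and treating missing trailing components as 0 (so "1.0" equals "1.0.0.0") is an assumption.

```python
from itertools import zip_longest


def compare_versions(a: str, b: str) -> int:
    """Return -1, 0 or 1 by comparing dotted versions numerically.

    Assumption: missing trailing components count as 0.
    """
    left = [int(part) for part in a.split(".")]
    right = [int(part) for part in b.split(".")]
    for x, y in zip_longest(left, right, fillvalue=0):
        if x != y:
            return -1 if x < y else 1
    return 0


def in_range(version: str, lower: str, upper: str) -> bool:
    """Inclusive range check, comparable to version:[lower TO upper]."""
    return (compare_versions(lower, version) <= 0
            and compare_versions(version, upper) <= 0)


for v in ["1.0.2.5", "1.10.2.5", "2.3.434.1"]:
    print(v, in_range(v, "1.0", "2.0"))
```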

Jörg

On Mon, Jan 26, 2015 at 8:47 AM, Eric Smith eric@codesmithtools.com wrote:


Did you ever find a good solution for this? I am trying to solve the same
problem (just sorting, not range filtering).

On Monday, January 26, 2015 at 2:47:30 AM UTC-5, Eric Smith wrote:
