I have a property called "version" that will have semantic version strings
like "1.1.0", "10.0.1", and "2.0.0". There will always be three integers
separated by periods and no other characters. I need to be able to sort my
docs by version and do bool comparisons. For example, using the values
above I need the ascending order to be "1.1.0", "2.0.0", "10.0.1".
My approach so far, documented in this gist https://gist.github.com/jpotts/375e3c8894f93648995d, is to use a groovy
script to normalize the string at index time using a transform. The script
pads each element in the string with spaces to a length of eight and stores
the result in a field (version.sortable). So "1.1.0" would become "
1 1 0" while "10.0.1" would become " 10 0 1".
As I put in the gist, this works, but I suspect there is a better way to
handle this. One drawback of my current approach is that queries have to
use the normalized term. I'd much rather use the natural format and have
the normalization be invisible to the front-end.
I suspect I could handle this with a custom analyzer but I'm not exactly
sure where to start with that.
I have a property called "version" that will have semantic version strings
like "1.1.0", "10.0.1", and "2.0.0". There will always be three integers
separated by periods and no other characters. I need to be able to sort my
docs by version and do bool comparisons. For example, using the values
above I need the ascending order to be "1.1.0", "2.0.0", "10.0.1".
My approach so far, documented in this gist https://gist.github.com/jpotts/375e3c8894f93648995d, is to use a groovy
script to normalize the string at index time using a transform. The script
pads each element in the string with spaces to a length of eight and stores
the result in a field (version.sortable). So "1.1.0" would become "
1 1 0" while "10.0.1" would become " 10 0 1".
As I put in the gist, this works, but I suspect there is a better way to
handle this. One drawback of my current approach is that queries have to
use the normalized term. I'd much rather use the natural format and have
the normalization be invisible to the front-end.
I suspect I could handle this with a custom analyzer but I'm not exactly
sure where to start with that.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.