Convert strings with different data units (MB, GB, TB) to bytes

ICU4J has a MeasureFormat class, http://icu-project.org/apiref/icu4j/com/ibm/icu/text/MeasureFormat.html, with a parse method.

It works with units such as MEGABYTE (http://icu-project.org/apiref/icu4j/com/ibm/icu/util/MeasureUnit.html#MEGABYTE) and GIGABYTE (http://icu-project.org/apiref/icu4j/com/ibm/icu/util/MeasureUnit.html#GIGABYTE), and many more.
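
To make the intended conversion concrete, here is a minimal sketch in plain Java that does not use ICU at all. The class name `DataSizeParser`, the method `toBytes`, and the accepted input syntax are all hypothetical and chosen only for illustration. It assumes decimal (SI) multiples, so 1 MB = 1,000,000 bytes; if you want 1 MB = 1,048,576 bytes, the table would need 1024-based values instead.

```java
import java.math.BigDecimal;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ICU or Elasticsearch.
final class DataSizeParser {

    // Assumption: decimal (SI) multipliers, e.g. 1 MB = 1,000,000 bytes.
    private static final Map<String, Long> MULTIPLIERS = Map.of(
            "B", 1L,
            "KB", 1_000L,
            "MB", 1_000_000L,
            "GB", 1_000_000_000L,
            "TB", 1_000_000_000_000L);

    // A non-negative decimal number, optional whitespace, then a unit made of letters.
    private static final Pattern SIZE =
            Pattern.compile("^\\s*(\\d+(?:\\.\\d+)?)\\s*([A-Za-z]+)\\s*$");

    private DataSizeParser() {
    }

    // Converts text such as "1.5 GB" to a byte count.
    // Throws IllegalArgumentException for unparseable input or unknown units,
    // and ArithmeticException if the result is not a whole number of bytes
    // or does not fit into a long.
    static long toBytes(String text) {
        Matcher m = SIZE.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("Cannot parse data size: " + text);
        }
        String unit = m.group(2).toUpperCase(Locale.ROOT);
        Long multiplier = MULTIPLIERS.get(unit);
        if (multiplier == null) {
            throw new IllegalArgumentException("Unknown unit: " + m.group(2));
        }
        return new BigDecimal(m.group(1))
                .multiply(BigDecimal.valueOf(multiplier))
                .longValueExact();
    }
}
```

With this sketch, `DataSizeParser.toBytes("1.5 GB")` returns 1500000000 and `DataSizeParser.toBytes("2 tb")` returns 2000000000000. Unlike an ICU-based approach, it is not locale-aware and only knows the five units in the table.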

I can add this as an analyzer / token filter to my ICU plugin at https://github.com/jprante/elasticsearch-icu.
