ICU has a MeasureFormat class ( http://icu-project.org/apiref/icu4j/com/ibm/icu/text/MeasureFormat.html )
with a parse method.
This can work with units such as MeasureUnit.MEGABYTE ( http://icu-project.org/apiref/icu4j/com/ibm/icu/util/MeasureUnit.html#MEGABYTE ) and MeasureUnit.GIGABYTE ( http://icu-project.org/apiref/icu4j/com/ibm/icu/util/MeasureUnit.html#GIGABYTE ), and many more.
I can add this as an analyzer / token filter to my ICU plugin at https://github.com/jprante/elasticsearch-icu
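
To make the idea concrete, here is a rough sketch of the kind of normalization such a token filter would perform: turning a token like "10 MB" into a plain byte count that can be indexed as a number. This is plain Java and deliberately does not call ICU. The class name ByteSizeParser, the method toBytes, the list of unit abbreviations, and the decimal (1000-based) multipliers are all assumptions made for illustration, not something taken from ICU or from the plugin.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of ICU or of the elasticsearch-icu plugin.
final class ByteSizeParser {

    // Assumption: decimal (SI) multipliers. A real filter might want 1024-based ones instead.
    private static final Map<String, Long> FACTORS = Map.of(
            "B", 1L,
            "KB", 1_000L,
            "MB", 1_000_000L,
            "GB", 1_000_000_000L,
            "TB", 1_000_000_000_000L);

    // A non-negative decimal number, optional whitespace, then a unit abbreviation.
    private static final Pattern SIZE =
            Pattern.compile("\\s*(\\d+(?:\\.\\d+)?)\\s*([A-Za-z]+)\\s*");

    private ByteSizeParser() {
    }

    // Converts text such as "10 MB" or "1.5gb" to a number of bytes.
    static long toBytes(String text) {
        Matcher m = SIZE.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a size: " + text);
        }
        Long factor = FACTORS.get(m.group(2).toUpperCase(Locale.ROOT));
        if (factor == null) {
            throw new IllegalArgumentException("unknown unit: " + m.group(2));
        }
        // BigDecimal avoids floating-point error for inputs like "1.5 GB";
        // any fractional byte left over is rounded half up.
        return new BigDecimal(m.group(1))
                .multiply(BigDecimal.valueOf(factor))
                .setScale(0, RoundingMode.HALF_UP)
                .longValueExact();
    }
}
```

With these assumptions, ByteSizeParser.toBytes("10 MB") gives 10000000 and ByteSizeParser.toBytes("1.5gb") gives 1500000000. An ICU-based version would replace the hand-written unit table with the locale-aware unit names that ICU already knows.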