In the short term, you could add a set processor after the lang_ident inference processor that changes the predicted language field to en when the source field is an empty string. Give the set processor an if condition so that it overrides the prediction only for empty strings.
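A minimal sketch of that suggestion, not a tested production pipeline. It builds the pipeline body as a Python dict and prints it as JSON. The field names `contents` and `_ml.lang_ident` are assumptions made for illustration, so substitute whatever your documents and inference processor actually use. The `if` value is a Painless condition that is true only when the source field is an empty string.

```python
import json

# Hypothetical names, not taken from the thread: adjust to your own mapping.
SOURCE_FIELD = "contents"        # field holding the text to classify (assumption)
TARGET_FIELD = "_ml.lang_ident"  # where the inference result is written (assumption)

pipeline = {
    "description": "Language identification with an override for empty strings",
    "processors": [
        {
            # Built-in language identification model; it reads its input
            # from a field named "text", hence the field_map.
            "inference": {
                "model_id": "lang_ident_model_1",
                "target_field": TARGET_FIELD,
                "field_map": {SOURCE_FIELD: "text"},
                "inference_config": {"classification": {"num_top_classes": 1}},
            }
        },
        {
            # Runs after inference and overwrites the prediction with "en",
            # but only when the source field is an empty string.
            "set": {
                "if": f"ctx.{SOURCE_FIELD} == ''",
                "field": f"{TARGET_FIELD}.predicted_value",
                "value": "en",
            }
        },
    ],
}

print(json.dumps(pipeline, indent=2))
```

The printed JSON can be used as the body of a `PUT _ingest/pipeline/<name>` request, and the simulate pipeline API is a convenient way to check how it behaves for empty and non-empty documents before indexing anything.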
It's nice that you may add this as a feature. I'm not sure whether this is the best or most idiomatic way, but I managed to accomplish it with this code, following David's instructions: