Yup. It's calling my tokenizer. But now it's revealed that my tokenizer is in fact crap!
Caused by: java.lang.IndexOutOfBoundsException
	at org.apache.lucene.analysis.tokenattributes.CharTermAttributeImpl.append(CharTermAttributeImpl.java:131)
	at com.cameraforensics.elasticsearch.plugins.UrlTokenizer.incrementToken(UrlTokenizer.java:30)
Probably because - as there are no docs - I'm doing it wrong. Here's the method the stack trace points at:
    @Override
    public boolean incrementToken() throws IOException {
        if (position >= tokens.size()) {
            return false;
        } else {
            termAtt.setEmpty().append(tokens.get(position), position, position);
            position++;
            return true;
        }
    }
tokens is a list of all permutations of the index segmentation (as per this thread: Performance of doc_values field vs analysed field).
I'm not really sure what the two int values should be on CharTermAttribute#append, so I'm guessing - incorrectly.
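For reference, CharTermAttribute extends java.lang.Appendable, and Appendable's append(CharSequence, int, int) takes the two ints as start and end character offsets into the sequence being appended (end exclusive), not as token positions. Below is a minimal sketch of that contract. It uses StringBuilder as a stand-in so it runs without Lucene on the classpath. The class and method names are made up for illustration, and "example.com" is just a placeholder token.

```java
// Stand-in sketch: StringBuilder implements java.lang.Appendable, the same
// interface CharTermAttribute extends, so it is used here to illustrate the
// append(csq, start, end) contract. This is not Lucene code.
class AppendRangeSketch {

    // Appends the characters csq[start, end) to an empty buffer and returns the result.
    static String appendRange(CharSequence csq, int start, int end) {
        StringBuilder sb = new StringBuilder();
        sb.append(csq, start, end);
        return sb.toString();
    }

    // Returns true if the given range is rejected with IndexOutOfBoundsException.
    static boolean rejects(CharSequence csq, int start, int end) {
        try {
            appendRange(csq, start, end);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String token = "example.com"; // placeholder token
        System.out.println(appendRange(token, 0, token.length())); // appends the whole token
        System.out.println(appendRange(token, 3, 3).isEmpty());    // start == end appends nothing
        System.out.println(rejects(token, 20, 20));                // offsets past the end are rejected
    }
}
```

If that reading is right, passing position for both ints appends an empty range while position is within the token's length, and throws once position is greater than it, which would match the IndexOutOfBoundsException above. Something like termAtt.setEmpty().append(tokens.get(position)), using the single-argument append, looks like the simpler call to try.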
Anyway, thanks for all of your help. I'll keep hacking!