Adding a custom token filter in Elasticsearch for retrieving skip-grams

I searched online and found that the shingle token filter that ships
with Elasticsearch can produce bigrams, trigrams, etc.

I want to extract skip-grams from my documents at index time, along with
single words, bigrams, and trigrams.
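For context, here is a minimal sketch in Python of what I mean by skip-grams (k-skip-n-grams): n tokens kept in order, with up to k tokens allowed to be skipped in between. The function name and implementation are just for illustration, not from any Elasticsearch API:

```python
from itertools import combinations

def skip_grams(tokens, n, k):
    """Return all k-skip-n-grams of a token list.

    A k-skip-n-gram picks n tokens in their original order, allowing
    up to k tokens in total to be skipped between the chosen positions.
    """
    grams = []
    for start in range(len(tokens)):
        # choose the remaining n-1 positions from the next n-1+k tokens
        window = range(start + 1, min(len(tokens), start + n + k))
        for rest in combinations(window, n - 1):
            positions = (start,) + rest
            # total number of skipped tokens is the span length minus n
            if positions[-1] - positions[0] + 1 - n <= k:
                grams.append(" ".join(tokens[p] for p in positions))
    return grams

# 1-skip-bigrams of "the quick brown fox":
# ['the quick', 'the brown', 'quick brown', 'quick fox', 'brown fox']
```

With k=0 this reduces to ordinary n-grams, which is what the shingle filter already produces; the extra "skipped" combinations are what I am missing.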

Further searching suggested that I might have to write a custom plugin
for such a token filter, but I could not find proper documentation on
writing one. Can anyone point me to the right resources for this task?


You received this message because you are subscribed to the Google Groups "elasticsearch" group.