I have a field called hdr_subject, and I would like to create a new field, hdr_subject_ngram, containing an array of all word N-grams of hdr_subject up to a certain size (e.g. 4).
For example, if
"hdr_subject" : "Discount: RX Viagra Pills"
then I would like
"hdr_subject_ngram" : [ "discount", "rx", "viagra", "pills", "discount rx", "rx viagra", "viagra pills", "discount rx viagra", "rx viagra pills", "discount rx viagra pills" ]
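To make the desired transformation concrete, here is a minimal client-side sketch in Python of what I mean. It is not an Elasticsearch feature: the function name `word_ngrams`, the `max_n` parameter, and the tokenization rule (lowercase, split on anything that is not a letter or digit) are my own assumptions for illustration.

```python
import re


def word_ngrams(text, max_n=4):
    """Return all word n-grams of `text` with 1..max_n words, shortest first."""
    # Assumption: lowercase the text and treat each run of letters/digits as a word.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    ngrams = []
    for n in range(1, max_n + 1):
        # Slide a window of n words across the token list.
        for start in range(len(tokens) - n + 1):
            ngrams.append(" ".join(tokens[start:start + n]))
    return ngrams


doc = {"hdr_subject": "Discount: RX Viagra Pills"}
doc["hdr_subject_ngram"] = word_ngrams(doc["hdr_subject"], max_n=4)
print(doc["hdr_subject_ngram"])
```

Ideally I would like Elasticsearch itself to produce the equivalent of `doc["hdr_subject_ngram"]` at index time, rather than having to compute it before sending the document.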
This N-gram field will be indexed as a keyword, and used as an influencer for a Machine Learning job.
So far, I have found https://stackoverflow.com/questions/27387231 helpful, but I'm not sure how to create a new field from the analyzer's output. Do you have any pointers?