I am currently using the Pattern Capture Group Token Filter inside an Elasticsearch plugin. Currently, the plugin is working absolutely fine and is emitting the tokens also. But, the issue is that the pattern which I am using inside the Pattern Capture Group Token Filter, it is returning only the first match and not continuing the matching after it.
My Pattern Capture Token Filter definition inside the TokenFilterFactory file looks like the below one:
PatternCaptureGroupTokenFilter(tokenStream, true, Pattern.compile("(?<![\\p{Alnum}])(\\p{Alnum}+\\p{Punct}\\p{Alnum}+)"))
The pattern that I want to test is "(?<![\p{Alnum}])(\p{Alnum}+\p{Punct}\p{Alnum}+)".I want to test it on the string: ":port&hid$query-input" And, I am expecting the following tokens as output. port&hid, hid$query, query-input