Issue with multiple instances of the same token

I am getting the same token multiple times after analysis. I am using several instances of the pattern token filter, each with a different regex, on the same input string. In some cases the duplicates are identical in every respect: same token, same start offset, same end offset. In other cases the same token appears with different start and end offsets.

The duplicates with different offsets are correct, since the same token occurs at multiple locations in my input string. The issue is that I want only one token for a given start and end offset, not multiple copies of the same token with identical offsets. Occurrences of the same token at different start and end offsets are fine and should be kept.

I don't want to use the "unique" token filter, because as far as I can tell it would also remove the occurrences of the token at different offsets, not just the exact duplicates.
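To make the desired behavior concrete, here is a minimal Python sketch of the deduplication I am after. It is only an illustration that works on a list of token dictionaries shaped like the output of the `_analyze` API (`token`, `start_offset`, `end_offset`); the function name `dedupe_by_offsets` and the sample data are made up for this example, and it is not an Elasticsearch filter.

```python
def dedupe_by_offsets(tokens):
    """Keep the first token for each (token, start_offset, end_offset) triple.

    Tokens with the same text but different offsets are all kept.
    """
    seen = set()
    result = []
    for tok in tokens:
        key = (tok["token"], tok["start_offset"], tok["end_offset"])
        if key in seen:
            continue  # exact duplicate: same text and same offsets
        seen.add(key)
        result.append(tok)
    return result


# Hypothetical sample data, not real analyzer output.
tokens = [
    {"token": "foo", "start_offset": 0, "end_offset": 3},
    {"token": "foo", "start_offset": 0, "end_offset": 3},   # exact duplicate
    {"token": "bar", "start_offset": 4, "end_offset": 7},
    {"token": "foo", "start_offset": 8, "end_offset": 11},  # same text, new offsets
]

deduped = dedupe_by_offsets(tokens)
for tok in deduped:
    print(tok["token"], tok["start_offset"], tok["end_offset"])
```

With this sample input, the second `foo` at offsets 0 to 3 is dropped, while the `foo` at offsets 8 to 11 is kept. I am looking for a way to get the same result inside the analysis chain.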
