I currently have an analyzer that uses the regular expression \\W|_. This splits on non-word characters and underscores. How will it handle special characters from other languages?
Furthermore, is there a way to configure the regular expression in the analyzer to split every x characters? If the splitting criterion is not met, I'd like to split after x characters regardless of which character is there. Is this feasible?
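To make the second requirement concrete, the desired behaviour can be sketched in plain Python. This is only an illustration of the intended output, not the analyzer's own API: the function name `split_with_max_length` and the parameter `max_len` (the "x" above) are hypothetical, and the pattern is written with a single backslash because it sits in a raw string.

```python
import re


def split_with_max_length(text, max_len, pattern=r"\W|_"):
    """Split on the pattern, then cut any over-long token into chunks."""
    tokens = []
    for piece in re.split(pattern, text):
        if not piece:
            continue  # adjacent delimiters produce empty strings; skip them
        # Cut the piece every max_len characters, whatever the character is.
        for start in range(0, len(piece), max_len):
            tokens.append(piece[start:start + max_len])
    return tokens


print(split_with_max_length("foo_bar-abcdefgh", 3))
# ['foo', 'bar', 'abc', 'def', 'gh']
```

One caveat on the first question: in Python 3, `\W` on a `str` pattern is Unicode-aware, so a letter such as `é` counts as a word character and does not cause a split. Other regex engines may default to ASCII-only character classes, in which case accented or non-Latin letters would be treated as delimiters, so the analyzer's behaviour depends on which engine and flags it uses.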