I currently have an analyzer that uses the regular expression
\\W|_ to split on non-word characters and underscores. How will this handle special characters from other languages, such as accented or non-Latin letters?
Furthermore, is there a way to configure the analyzer's regular expression to also split every x characters? That is, if the pattern has not matched within x characters, I'd like to split after x characters regardless of which character it is. Is this feasible?
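
To make the second part concrete, here is a rough Python sketch of the behaviour I'm after. It is only an illustration, not the analyzer configuration itself: split_tokens and max_len are names I made up for this example, and max_len plays the role of x. Note that Python 3's \W is Unicode-aware by default, which may not match how the analyzer's regex engine treats non-ASCII letters.

```python
import re


def split_tokens(text, max_len):
    """Split on non-word characters and underscores, then cut any
    remaining token longer than max_len into max_len-sized pieces."""
    tokens = []
    for piece in re.split(r"\W|_", text):
        if not piece:
            continue  # skip empty strings produced by adjacent delimiters
        # Step through the piece in max_len-sized windows; the last
        # window may be shorter than max_len.
        for start in range(0, len(piece), max_len):
            tokens.append(piece[start:start + max_len])
    return tokens


print(split_tokens("foo_bar-abcdefghij", 4))
# → ['foo', 'bar', 'abcd', 'efgh', 'ij']
```

So "foo" and "bar" come from the normal delimiter split, while "abcdefghij" contains no delimiter and is cut after every 4 characters.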