Regex pattern_replace

Hi friends, I have a problem: I don't understand why my filter behaves like this.

#My filter pattern
'my_pattern_replace' => [
    "type" => "pattern_replace",
    "pattern" => "([0-9.,-]+)\s?(car\b|cars\b|cars\w+)",
    "replacement" => "$1car"
]


#Test against:

i have 2 cars and 2cars

#String replacement result from analyze:

Hello and welcome to the forums!

Your example uses pattern_replace as a token filter, so it operates on individual tokens rather than on the string as a whole. By the time the string "i have 2 cars and 2cars" reaches the filter, the tokenizer has already turned it into a stream of tokens: ['i', 'have', '2', 'cars', 'and', '2cars']. The regex is applied to each of those tokens separately. Neither 2 nor cars matches on its own, because the pattern needs a number followed by car or cars inside a single token, but the 2cars token does match.
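To make the difference concrete, here is a rough Python simulation, not Elasticsearch itself. It assumes a very crude tokenizer (lowercase, split on whitespace) and uses Python's re module, which treats this particular pattern the same way Java's regex engine does. The function names are made up for illustration.

```python
import re

# Same pattern as in the question. "$1car" is written "\g<1>car" in Python.
PATTERN = re.compile(r"([0-9.,-]+)\s?(car\b|cars\b|cars\w+)")
REPLACEMENT = r"\g<1>car"

def tokenize(text):
    # Crude stand-in for a real tokenizer: lowercase and split on whitespace.
    return text.lower().split()

def as_token_filter(text):
    # Token filter behaviour: tokenize first, then apply the regex to each token.
    return [PATTERN.sub(REPLACEMENT, token) for token in tokenize(text)]

def as_char_filter(text):
    # Char filter behaviour: apply the regex to the whole string, then tokenize.
    return tokenize(PATTERN.sub(REPLACEMENT, text))

text = "i have 2 cars and 2cars"
print(as_token_filter(text))  # ['i', 'have', '2', 'cars', 'and', '2car']
print(as_char_filter(text))   # ['i', 'have', '2car', 'and', '2car']
```

In the token filter case only the 2cars token is rewritten, because "2" and "cars" are already separate tokens when the regex sees them. When the same regex runs on the whole string first, "2 cars" is still one piece of text and gets rewritten too.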

If you wanted 2 cars to be its own token, that's something that would be controlled by a "tokenizer" rather than a "token filter." For more details, see anatomy of an analyzer.
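As a sketch of how the pieces of an analyzer fit together, here are index settings written as a Python dict that could be serialized to JSON. The names my_char_filter and my_analyzer are hypothetical, and the choice of the standard tokenizer with lowercase and snowball filters is only an assumption for illustration. Whatever the configuration, an analyzer always runs its char filters first, then the tokenizer, then the token filters in the order listed.

```python
import json

# Hypothetical analysis settings; names and filter choices are illustrative only.
settings = {
    "analysis": {
        "char_filter": {
            "my_char_filter": {
                # pattern_replace also exists as a char filter, which sees the raw string
                "type": "pattern_replace",
                "pattern": r"([0-9.,-]+)\s?(car\b|cars\b|cars\w+)",
                "replacement": "$1car",
            }
        },
        "analyzer": {
            "my_analyzer": {
                "type": "custom",
                "char_filter": ["my_char_filter"],  # step 1: runs on the whole string
                "tokenizer": "standard",            # step 2: splits the string into tokens
                "filter": ["lowercase", "snowball"] # step 3: runs on each token, in order
            }
        },
    }
}

# json.dumps takes care of escaping the backslashes in the regex.
print(json.dumps(settings, indent=2))
```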


I found this

It works.

But now I have another problem: snowball runs after the char_filter, so my regular expression does not always work. Are there any solutions?

There is a lot that you can do with custom analyzers, but I'm not sure if I understand your use case fully. What is the problem you are trying to solve with this regular expression?

