Keywords with spaces and root word stemming

Hi, I have a data set with keywords such as:

R1 - Hiking
R2 - Mountain climbing
R3 - Hike mountain
R4 - Climbing rocks
R5 - Climbing

What I'm trying to do is this: when a user searches for something like "climb", ES should match only the record whose entire keyword is "climb" (R5). The basic idea is that I should be able to break the words inside each keyword down to their root forms and then compare the whole keyword. So if a user searches for "mountain climb", "mountain climbing", or "climb mountains", ES should return only the R2 record.

I can use the English analyzer with stemming, but then the queries above also return unwanted results. For example, if a user searches for "climb" or "climbing", ES normally returns R4 and R5, but I want it to return only R5.
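To make the problem concrete, here is a minimal Python sketch of token-level matching after stemming. It is a toy model, not Elasticsearch itself: `toy_stem` is a hypothetical stand-in for a real English stemmer, and `token_match` only approximates a match query that hits when any stemmed token is shared.

```python
# Toy model of token-level matching after stemming (not Elasticsearch itself).
RECORDS = {
    "R1": "Hiking",
    "R2": "Mountain climbing",
    "R3": "Hike mountain",
    "R4": "Climbing rocks",
    "R5": "Climbing",
}


def toy_stem(word):
    """Hypothetical stand-in for a real English stemmer: strips one suffix."""
    for suffix in ("ing", "s", "e"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def analyze(text):
    """Lowercase, split on whitespace, and stem each token."""
    return [toy_stem(w) for w in text.lower().split()]


def token_match(query):
    """Return the records that share at least one stemmed token with the query."""
    query_tokens = set(analyze(query))
    return sorted(
        rid for rid, keyword in RECORDS.items()
        if query_tokens & set(analyze(keyword))
    )


print(token_match("climb"))
```

In this toy model every record containing a "climb" token matches, so R2 shows up alongside R4 and R5. That is the unwanted behaviour: the match happens per token, not on the keyword as a whole.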

Does anyone know how to accomplish this?


Maybe a match phrase query would be worth exploring.

Hi, thanks for the suggestion, but I couldn't find a way to solve my issue with a match phrase query.
I could use keyword fields, but then I can't use stemming to compare root words. For example, with keywords, when a user searches for "climbing", ES returns R5, but if the user searches for "climb", it won't return R5. In the same way, the query "climbing rocks" returns R4, but "climb rocks" won't return any results.

I want to fix that issue and I'm kind of stuck at the moment.

Any suggestions?

What you're describing is not available out of the box in Elasticsearch. As far as I understand your problem, the analysis chain first needs to tokenize, then stem, and finally join all the tokens back together into a single token. This seems to be a rare use case, for which you would need to write your own TokenFilter. After a quick search I found a Concatenate Plugin which might do what you want; however, it doesn't seem to be very actively supported at the moment.
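The tokenize, stem, concatenate chain described above can be sketched in plain Python. This is only an illustration of the idea, not the plugin's implementation or an Elasticsearch config: `toy_stem` is a hypothetical stand-in for a real stemmer, and the optional `sort_tokens` flag is an assumption added here to cover order-independent queries such as "climb mountains", which plain concatenation would not match.

```python
# Sketch of the tokenize -> stem -> concatenate idea (not Elasticsearch itself).
RECORDS = {
    "R1": "Hiking",
    "R2": "Mountain climbing",
    "R3": "Hike mountain",
    "R4": "Climbing rocks",
    "R5": "Climbing",
}


def toy_stem(word):
    """Hypothetical stand-in for a real English stemmer: strips one suffix."""
    for suffix in ("ing", "s", "e"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(text, sort_tokens=False):
    """Tokenize, stem, then join the stems back into one single token."""
    tokens = [toy_stem(w) for w in text.lower().split()]
    if sort_tokens:
        # Assumption, not part of the suggestion above: ignore word order.
        tokens.sort()
    return " ".join(tokens)


def exact_match(query, sort_tokens=False):
    """Return the records whose whole normalized keyword equals the query's."""
    key = normalize(query, sort_tokens)
    return [
        rid for rid, keyword in RECORDS.items()
        if normalize(keyword, sort_tokens) == key
    ]


print(exact_match("climb"))                              # ['R5']
print(exact_match("mountain climb"))                     # ['R2']
print(exact_match("climb rocks"))                        # ['R4']
print(exact_match("climb mountains", sort_tokens=True))  # ['R2']
```

Because the comparison is made on the single concatenated token, "climb" no longer matches "Climbing rocks" or "Mountain climbing", while stemming still lets "climb" and "climbing" meet.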

Thanks for the reply and the suggestion. A concatenate token filter is a good idea. Let me do some research on that.

Excellent suggestion... this worked like a charm.

Thanks a lot