English stemmer bug

When I analyze the word "movies" with the english stemmer, it gives me "movi". Other words such as "questions" (question) or "states" (state) work correctly. Below is the request I use.

{
  "tokenizer": "standard",
  "text": "movies",
  "filter": [
    "lowercase",
    {
      "type": "stemmer",
      "name": "english"
    }
  ]
}

If I use the minimal_english stemmer, it returns "movy".

I am using Elasticsearch 6.3.0.
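
For illustration, here is a rough Python sketch of plural-stripping rules in the style of an "S-stemmer". It is an assumption that minimal_english behaves like this (the function name `minimal_stem` is made up for this example and is not an Elasticsearch API), but the sketch does reproduce the outputs reported above.

```python
def minimal_stem(word: str) -> str:
    """Strip common English plural endings (S-stemmer style sketch)."""
    n = len(word)
    # Only words of 3+ characters ending in "s" are candidates.
    if n < 3 or not word.endswith("s"):
        return word
    # Leave "-us" and "-ss" alone (e.g. "status", "glass").
    if word[-2] in ("u", "s"):
        return word
    if word[-2] == "e":
        # "-ies" becomes "-y", unless preceded by "a" or "e".
        if n > 3 and word[-3] == "i" and word[-4] not in ("a", "e"):
            return word[:-3] + "y"
        # Otherwise leave "-ies", "-aes", "-oes" and "-ees" untouched.
        if word[-3] in ("i", "a", "o", "e"):
            return word
    # Default: drop the trailing "s".
    return word[:-1]


for w in ("movies", "states", "questions", "movie"):
    print(w, "->", minimal_stem(w))
# movies -> movy
# states -> state
# questions -> question
# movie -> movie
```

Under these rules "movie" is left unchanged while "movies" becomes "movy", so the singular and plural forms do not share a token.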

This is not a bug IMO.

Try "lazy": I think you will get "lazi".

Why do you think it's a problem?

I think it should return "movie" instead of "movi" or "movy", just as "states" returns "state" and "questions" returns "question".

I don't understand the "lazy" to "lazi" example.

Why do you think so?
I mean, what is the problem you want to solve?

This may be a Lucene/Elasticsearch thing then, because stemming "lazy" to "lazi" doesn't make a lot of sense to me (as an English speaker). Same with "movi".

But what's the problem?

As long as "lazy", "laziness" and all other forms are translated to the same root, that should be OK, no?
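
To make that point concrete, here is a minimal Python sketch of a tiny inverted index where the same analysis is applied at index time and at search time. Everything in it is hypothetical: `toy_stem` is a deliberately crude stand-in (not the Porter algorithm and not what Elasticsearch runs), and the sample documents are invented.

```python
from collections import defaultdict


def toy_stem(token: str) -> str:
    """Hypothetical, deliberately crude stemmer (NOT the Porter algorithm)."""
    if token.endswith("ies"):
        token = token[:-2]            # movies -> movi
    elif token.endswith("s") and not token.endswith("ss"):
        token = token[:-1]            # states -> state
    if token.endswith("e"):
        token = token[:-1]            # movie -> movi
    elif token.endswith("y"):
        token = token[:-1] + "i"      # lazy -> lazi
    return token


def analyze(text: str) -> list:
    """Lowercase, split on whitespace, then stem each token."""
    return [toy_stem(t) for t in text.lower().split()]


def build_index(docs: dict) -> dict:
    """Map each stemmed token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index


def search(index: dict, query: str) -> set:
    """Analyze the query like the documents; return ids matching all terms."""
    terms = analyze(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {1: "new movies this week", 2: "lazy sunday"}
index = build_index(docs)

print(search(index, "movie"))         # analyzed query: {1}
print(search(index, "movies"))        # analyzed query: {1}
print(index.get("movie", set()))      # raw, unanalyzed lookup: set()
```

The stored token looks odd ("movi"), but it never has to be a real word. It only has to be the same token for every form, and the query has to go through the same analysis as the indexed text. A lookup that skips analysis, like the last line, misses.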

True, it's just a weird representation.

The main problem is this: say I index the string "jumanji movies on google play movies" in a field that uses the english stemmer. When I search that field with "movie", it doesn't match, but searching with "movi" does. If the field uses the minimal_english stemmer, searching with "movy" matches instead. Neither field matches "movie", because the stemmer turns the "movies" token in that string into "movi" or "movy".
