Possessive_english stemmer ignoring protected keywords in Elasticsearch

Sarah_Buchinger · April 15, 2024, 8:12am

Hello!

I would love to get your help with the following.

I want to use the possessive_english stemmer with an exception list, to prevent the removal of the trailing "'s" from specific words.
However, the possessive_english stemmer appears to ignore the keyword flag set by the keyword_marker filter.

For example, the word "french's", even when marked as a keyword, gets stemmed to "french":

POST /_analyze
{
  "text": "french's",
  "tokenizer": "standard",
  "filter": [
    {
      "keywords": ["french's"],
      "type": "keyword_marker"
    },
    {
      "type": "stemmer",
      "language": "possessive_english"
    }
  ]
}

Result:

{
  "tokens": [
    {
      "token": "french",
      "start_offset": 0,
      "end_offset": 8,
      "type": "<ALPHANUM>",
      "position": 0
    }
  ]
}

In comparison, the word "standing", when marked as a keyword, is not stemmed by the english stemmer:

POST /_analyze
{
  "text": "standing",
  "tokenizer": "standard",
  "filter": [
    {
      "keywords": ["standing"],
      "type": "keyword_marker"
    },
    {
      "type": "stemmer",
      "language": "english"
    }
  ]
}

Result:

{
  "tokens": [
    {
      "token": "standing",
      "start_offset": 0,
      "end_offset": 8,
      "type": "<ALPHANUM>",
      "position": 0
    }
  ]
}
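
One possible workaround, given here only as an untested sketch: wrap the possessive stemmer in a condition token filter so that it only runs on tokens not flagged as keywords. The body below would be sent when creating an index. All filter and analyzer names are made up for illustration, and the script assumes that the condition filter's Painless context exposes token.isKeyword(), which should be checked against the documentation for the Elasticsearch version in use.

```json
{
  "settings": {
    "analysis": {
      "filter": {
        "protect_possessives": {
          "type": "keyword_marker",
          "keywords": ["french's"]
        },
        "possessive_stemmer": {
          "type": "stemmer",
          "language": "possessive_english"
        },
        "possessive_unless_keyword": {
          "type": "condition",
          "filter": ["possessive_stemmer"],
          "script": {
            "source": "!token.isKeyword()"
          }
        }
      },
      "analyzer": {
        "possessive_with_exceptions": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["protect_possessives", "possessive_unless_keyword"]
        }
      }
    }
  }
}
```

Calling the _analyze API on that index with the possessive_with_exceptions analyzer and the text "french's" would show whether the token is left intact. The filters are defined in index settings rather than inline because the condition filter refers to its wrapped filters by name.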

Any insights on how to make the possessive_english stemmer respect an exception list?

Thank you
