Hi Team,
Below are my settings for the custom analyzer:
"analyzer": {
"my_analyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"possessive_stemmer",
"lowercase",
"english_stop",
"eng_keywords",
"stemmer"
]
}
},
"filter": {
"english_stop": {
"type": "stop",
"stopwords": ["have","should","i","a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that", "the", "their", "then", "there", "these", "they", "this", "to", "was", "will", "with","my"]
},
"stemmer": {
"type": "stemmer",
"language": "light_english"
},
"possessive_stemmer": {
"type": "stemmer",
"language": "possessive_english"
},
"eng_keywords": {
"type": "keyword_marker",
"keywords": [
"windows"
]
}
}
}
}
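For reference, the analyzer is meant to be attached to a text field in the mapping along these lines (the field name below is just a placeholder, not my actual field):

PUT newoneindex/_mapping
{
  "properties": {
    "my_text_field": {
      "type": "text",
      "analyzer": "my_analyzer"
    }
  }
}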
I have a doubt regarding the stemmer. When I use the _analyze API to understand how it works, the word "jumping" is reduced to "jump", but "working" and "running" are not reduced to "work" and "run". Is this because of the light_english stemmer?
Here are the results:
POST newoneindex/_analyze
{
  "analyzer": "my_analyzer",
  "text": "working jumping running"
}
{
  "tokens": [
    {
      "token": "working",
      "start_offset": 0,
      "end_offset": 7,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "jump",
      "start_offset": 8,
      "end_offset": 15,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "running",
      "start_offset": 16,
      "end_offset": 23,
      "type": "<ALPHANUM>",
      "position": 2
    }
  ]
}
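To narrow it down, I'm thinking of running just the stemmer on its own with the _analyze API, something along these lines (a rough sketch, I haven't verified this output yet):

POST newoneindex/_analyze
{
  "tokenizer": "standard",
  "filter": [
    {
      "type": "stemmer",
      "language": "light_english"
    }
  ],
  "text": "working jumping running"
}

If this call in isolation also leaves "working" and "running" untouched, then the behaviour would come from the light_english stemmer itself rather than from the other filters in the chain. Is that a correct way to check it?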