Hi Team,
Below are my settings for the custom analyzer:
"analyzer": {
"my_analyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"possessive_stemmer",
"lowercase",
"english_stop",
"eng_keywords",
"stemmer"
]
}
},
"filter": {
"english_stop": {
"type": "stop",
"stopwords": ["have","should","i","a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that", "the", "their", "then", "there", "these", "they", "this", "to", "was", "will", "with","my"]
},
"stemmer": {
"type": "stemmer",
"language": "light_english"
},
"possessive_stemmer": {
"type": "stemmer",
"language": "possessive_english"
},
"eng_keywords": {
"type": "keyword_marker",
"keywords": [
"windows"
]
}
}
}
}
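For reference, the analyzer is meant to be attached to a text field in the mapping along these lines (the field name below is just a placeholder, not my actual field):

PUT newoneindex/_mapping
{
  "properties": {
    "my_text_field": {
      "type": "text",
      "analyzer": "my_analyzer"
    }
  }
}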
I have a doubt regarding the stemmer. When I use the _analyze API to understand how it works, the word "jumping" is reduced to "jump", but "working" and "running" are not reduced to "work" and "run". Is this because of the light_english stemmer?
Here are the results:
POST newoneindex/_analyze
{
  "analyzer": "my_analyzer",
  "text": "working jumping running"
}
{
  "tokens": [
    {
      "token": "working",
      "start_offset": 0,
      "end_offset": 7,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "jump",
      "start_offset": 8,
      "end_offset": 15,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "running",
      "start_offset": 16,
      "end_offset": 23,
      "type": "<ALPHANUM>",
      "position": 2
    }
  ]
}
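To narrow it down, I'm thinking of running just the stemmer on its own with the _analyze API, something along these lines (a rough sketch, I haven't verified this output yet):

POST newoneindex/_analyze
{
  "tokenizer": "standard",
  "filter": [
    {
      "type": "stemmer",
      "language": "light_english"
    }
  ],
  "text": "working jumping running"
}

If this call in isolation also leaves "working" and "running" untouched, then the behaviour would come from the light_english stemmer itself rather than from the other filters in the chain. Is that a correct way to check it?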