Hello,
I want to create my own analyzer, and I don't understand the difference between "filter": "stop" and the "stopwords" option, which seem to overlap. Do they do the same thing?
Examples from doc:
What is the difference between the two?
Thank you,
Yoann
Hi @Yoann_Buzenet,
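In Elasticsearch, "filter": "stop" and "stopwords" are both used to remove common, low-value words such as "the", "and", and "a" during text analysis, but they sit at different levels. "filter": "stop" puts the stop token filter into an analyzer's filter chain. Referenced by name like this, it removes the default English stop word list; you can also define your own filter of type stop and reuse it in several analyzers. "stopwords" is a parameter that holds the list of words to remove. It is accepted by the stop token filter itself and by some built-in analyzers (for example standard and stop), so you can set it there without building a filter chain by hand. An analyzer of type custom has no "stopwords" parameter of its own, so in that case you add a stop filter instead. Think of "filter": "stop" as a reusable coffee filter you can use with many cups, while "stopwords" is the list of what that filter should catch.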
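For contrast, here is a minimal sketch of the "stopwords" option set directly on a built-in analyzer, with no separate filter defined. The index name my_index_builtin and the analyzer name my_standard_analyzer are made up for illustration.

```json
PUT /my_index_builtin
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_standard_analyzer": {
          "type": "standard",
          "stopwords": ["the", "and", "a"]
        }
      }
    }
  }
}
```

This is shorter, but the word list belongs to that one analyzer only, and it only works because the standard analyzer accepts a "stopwords" parameter.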
Simple Example:
PUT /my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "my_stop_filter"
          ]
        }
      },
      "filter": {
        "my_stop_filter": {
          "type": "stop",
          "stopwords": ["the", "and", "a"]
        }
      }
    }
  }
}
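In this example, we create an analyzer called my_analyzer that uses a filter named my_stop_filter to remove the words "the", "and", and "a" from any text it analyzes. The lowercase filter runs first, so capitalized forms such as "The" are removed as well (the stop filter is case-sensitive by default). Note that the analyzer only takes effect on fields whose mapping references it.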
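To check what the analyzer actually does, you can run the _analyze API against the index. This is a minimal sketch, assuming the index was created as my_index with the settings above; the sample text is made up for illustration.

```json
POST /my_index/_analyze
{
  "analyzer": "my_analyzer",
  "text": "The quick fox and a dog"
}
```

With those settings, the response should contain only the tokens quick, fox, and dog, since "The", "and", and "a" are lowercased and then dropped by my_stop_filter.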