When using the whitespace tokenizer, the stop words filter doesn't work. Here are the curl commands to reproduce the issue:
PUT /my_index
{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "analysis": {
        "analyzer": {
          "fulltext": {
            "type": "custom",
            "tokenizer": "whitespace",
            "filter": ["english_stop"]
          }
        },
        "filter": {
          "english_stop": {
            "type": "stop",
            "stopwords": "_english_"
          }
        }
      }
    }
  }
}
GET my_index/_analyze?analyzer=fulltext&text="the drug"
I need to keep the whitespace tokenizer because I'm also using the word_delimiter filter, which turns terms like "wi-fi" into "wifi"; if I switch to the standard tokenizer, I lose that behavior.
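For context, this is a sketch of the full analyzer I'm aiming for, with both filters chained after the whitespace tokenizer. The filter name `word_delim` and the `catenate_words` setting are my assumptions about how to get the "wi-fi" → "wifi" behavior, not a tested config:

```json
PUT /my_index
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "fulltext": {
            "type": "custom",
            "tokenizer": "whitespace",
            "filter": ["word_delim", "english_stop"]
          }
        },
        "filter": {
          "english_stop": {
            "type": "stop",
            "stopwords": "_english_"
          },
          "word_delim": {
            "type": "word_delimiter",
            "catenate_words": true
          }
        }
      }
    }
  }
}
```

My understanding is that `catenate_words: true` makes the word_delimiter filter emit the joined token ("wifi") alongside the split parts, which is why the whitespace tokenizer matters: the standard tokenizer would already have split "wi-fi" into separate tokens before the filter runs.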