Analyser doesn't remove English stopwords


#1

Hello guys,

I'm trying to reindex to add an English stopwords filter, but when I run a simple terms aggregation, words like "the" or "a" are still at the top of the list. What might be the problem?

This is my mapping:

PUT myIndex
    {
        "order": 0,
        "settings": {
            "index": {
                "analysis": {
                    "analyzer": {
                        "my_analyzer": {
                            "filter": [
                                "lowercase",
                                "trim",
                                "stemmer_filter",
                                "stopwords_filter",
                                "reverse"
                            ],
                            "type": "custom",
                            "tokenizer": "standard"
                        }
                    },
                    "filter": {
                      "stopwords_filter": {
                        "type": "stop",
                        "stopwords": "_english_"
                      },
                      "stemmer_filter": {
                        "type": "stemmer",
                        "name": "english"
                      }
                    }
                },
                "number_of_shards": "1",
                "number_of_replicas": "0"
            }
        },
        "mappings": {
            "keywords": {
                "_all": {
                    "enabled": false
                },
                "properties": {
                    "words": {
                        "fielddata": true,
                        "store": true,
                        "eager_global_ordinals": true,
                        "type": "text",
                        "fields": {
                            "reverse": {
                                "search_analyzer": "my_analyzer",
                                "analyzer": "my_analyzer",
                                "type": "text"
                            },
                            "raw": {
                              "type": "keyword"
                            }
                        }
                    }
                }
            }
        },
        "aliases": {}
    }

(Loren Siebert) #2

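You'd have to aggregate on the words.reverse field, since that's the only one that uses your analyzer. The top-level words field has no analyzer set, so it falls back to the default standard analyzer, which does not remove stopwords.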
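
A minimal sketch of such an aggregation, using the index and field names from the mapping above. The aggregation name top_terms and the size of 20 are arbitrary choices, not something from the original post. Note that the posted mapping sets fielddata only on words, so this assumes fielddata has also been enabled on words.reverse. The returned terms will be stemmed and reversed, because the analyzer ends with the reverse filter.

```
GET myIndex/_search
{
  "size": 0,
  "aggs": {
    "top_terms": {
      "terms": {
        "field": "words.reverse",
        "size": 20
      }
    }
  }
}
```

To check what the analyzer actually emits for that field, the _analyze API can be pointed at it. With the stop filter in place, "the" should be absent from the returned tokens:

```
GET myIndex/_analyze
{
  "field": "words.reverse",
  "text": "the quick brown fox"
}
```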


#3

Works now, thank you!

