Filtering out stop words at search time only


#1

Problem:
I'm using a simple match query against a field whose analyser filters out stop words. But if the search text itself contains a stop word, the query returns 0 hits, since it's unable to match stop words which have already been filtered out.

This is my analyser:

                "stopwords_analyzer": {
                    "filter": [
                        "lowercase",
                        "trim",
                        "stopwords_filter"
                    ],
                    "type": "custom",
                    "tokenizer": "standard"
                }

It works great in all use cases except when trying to match stop words, like:

{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            {"match": {"word.stopwords": "the great pyramid"}}
          ]
        }
      }
    }
  }
}

It matches results for "great pyramid", but returns 0 results for "the great pyramid".

Unsuccessful solution:
Tried adding another analyser without the stopwords filter as an index_analyser, and using the existing stopwords_analyser as the search_analyser.

                "normal_analyzer": {
                    "filter": [
                        "lowercase",
                        "trim"
                    ],
                    "type": "custom",
                    "tokenizer": "standard"
                }

From my understanding, stop words should be available after indexing, but should be filtered out when searching with queries. Yet, now it returns all results, including stop words.

Why doesn't this work? Is there another way to search for stop words, while excluding them from results?
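
For reference, a minimal sketch of the index-time/search-time split described above. Several things here are assumptions rather than the actual setup: the index name my_index is hypothetical, the word field with a stopwords sub-field is inferred from the "word.stopwords" path in the query, stopwords_filter is assumed to be a stop filter using the built-in _english_ list, and the mapping is written in the 6.x style with a _doc type (on 7.x and later that level is dropped). In the mapping itself the two parameters are called analyzer (index time) and search_analyzer (search time).

```
# Hypothetical index name; field layout inferred from "word.stopwords".
# stopwords_filter is assumed to use the built-in _english_ list.
PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "normal_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "trim"]
        },
        "stopwords_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "trim", "stopwords_filter"]
        }
      },
      "filter": {
        "stopwords_filter": {
          "type": "stop",
          "stopwords": "_english_"
        }
      }
    }
  },
  "mappings": {
    "_doc": {
      "properties": {
        "word": {
          "type": "text",
          "fields": {
            "stopwords": {
              "type": "text",
              "analyzer": "normal_analyzer",
              "search_analyzer": "stopwords_analyzer"
            }
          }
        }
      }
    }
  }
}
```

With a mapping like this, stop words stay in the index but are stripped from the query text, so a match for "the great pyramid" searches only for great and pyramid. Since match combines terms with the or operator by default, any document containing either term counts as a hit.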


(Mark Harwood) #2

Not sure I follow the problem, but a more complete reconstruction would help.
This is working for me:

PUT test
{
  "settings": {
	"analysis": {
	  "analyzer": {
		"stopwords_analyzer": {
		  "filter": [
			"lowercase",
			"trim",
			"stopwords_filter"
		  ],
		  "type": "custom",
		  "tokenizer": "standard"
		}
	  },
	  "filter": {
		"stopwords_filter": {
		  "type": "stop",
		  "stopwords": [
			"the",
			"great"
		  ]
		}
	  }
	}
  },
  "mappings": {
	"_doc": {
	  "properties": {
		"word": {
		  "type": "text",
		  "analyzer": "stopwords_analyzer"
		}
	  }
	}
  }
}
POST test/_doc
{
  "word":"the great pyramid"
}
POST test/_search
{
  "query": {
	"match": {
	  "word": "the great pyramid"
	}
  }
}
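
To check what the analyzer actually does to the text, the _analyze API can be run against the same index. This is a minimal sketch that assumes the test index above has been created; with "the" and "great" configured as stop words, the response should list only the token pyramid, which is why the indexed document and the query text still line up.

```
# Inspect the tokens stopwords_analyzer emits for the sample text.
POST test/_analyze
{
  "analyzer": "stopwords_analyzer",
  "text": "the great pyramid"
}
```

Running the same request with a different analyzer name is a quick way to compare index-time and search-time token streams when the two analyzers differ.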
