Filter stop words in simple_query_string

Hi,
I have an index named news_headlines with the following mapping:

{
  "mappings": {
    "properties": {
      "authors": {
        "type": "text"
      },
      "category": {
        "type": "keyword"
      },
      "date": {
        "type": "date",
        "format": "iso8601"
      },
      "headline": {
        "type": "text"
      },
      "link": {
        "type": "keyword"
      },
      "short_description": {
        "type": "text"
      }
    }
  }
}

For example, consider the query below:

GET news_headlines/_search
{
  "query": {
    "simple_query_string": {
      "query": "\"shape of you\" a song of Ed ",
      "fields": ["headline^2", "short_description"]
    }
  }
}

How can I apply stop word filtering to the query? I want to filter out words like "a" and "of", but only the ones outside the double quotes.

The response should be the same as for this query:

GET news_headlines/_search
{
  "query": {
    "simple_query_string": {
      "query": "\"shape of you\" song Ed",
      "fields": ["headline^2", "short_description"]
    }
  }
}

Note that the "of" in "shape of you" is still there!

Thanks in advance.

Hi @jahedi

Do you want to apply a stop words filter on the query? Why wouldn't using an analyzer with the stopwords filter work?

How can I use an analyzer in this case? Can you help me?
I tried to add an analyzer named custom_english to my index like this:

"analysis": {
  "analyzer": {
    "custom_english": {
      "type": "standard",
      "stopwords": "_english_"
    }
  }
}

So now the complete settings of this index are:

{
  "news_headlines": {
    "settings": {
      "index": {
        "routing": {
          "allocation": {
            "include": {
              "_tier_preference": "data_content"
            }
          }
        },
        "number_of_shards": "1",
        "provided_name": "news_headlines",
        "creation_date": "1720937104190",
        "analysis": {
          "analyzer": {
            "custom_english": {
              "type": "standard",
              "stopwords": "_english_"
            }
          }
        },
        "number_of_replicas": "1",
        "uuid": "_V_MthiiR8-s_Ksor1WqXQ",
        "version": {
          "created": "8503000"
        }
      }
    }
  }
}

After adding the analyzer I tried to query like this:

GET news_headlines/_search
{
  "query": {
    "simple_query_string" : {
        "query": "\"shape of you\" a song of Ed ",
        "fields": ["headline^2", "short_description"],
        "analyzer": "custom_english"
    }
  }
} 

But it didn't work: the "of" inside "shape of you" was removed as well, which is not what I want.

The analyzer needs to be applied at index time. This means it has to be set in the mappings section when you create the index, on each field where you want to use it.

It would look something like the example below. You define an analyzer that uses the English stop word list and assign it to the field, so the data is indexed without English stop words.
When you run the query, the same analyzer is applied to the query text by default, so stop words are removed from your search terms as well.
For example, if you search for just "of", you will not get any results, because that stop word was removed at indexing time.

PUT idx_test
{
  "settings": {
    "analysis": {
      "analyzer": {
        "custom_english": {
          "type": "standard",
          "stopwords": "_english_"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "headline": {
        "type": "text",
        "analyzer": "custom_english"
      }
    }
  }
}
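
To check what the analyzer does before indexing anything, you can run some sample text through the _analyze API. This is only a minimal sketch: it assumes the idx_test index above has already been created, and the sample text is taken from the query in the question.

```
POST idx_test/_analyze
{
  "analyzer": "custom_english",
  "text": "a song of Ed"
}
```

The response should list only the tokens song and ed, with "a" and "of" dropped. Note that the analyzer of an existing field cannot be changed, so documents already in the old index would need to be reindexed into the new one for this to take effect.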

Nice. Thanks a lot. It worked!