How do I ensure 'stop' words are not picked up in my searches


(william crawley) #1

Hi all,

I have created the following index:

    put newsindex
    {
      "settings": {
        "number_of_shards": 3,
        "number_of_replicas": 2,
        "analysis": {
          "filter": {
            "my_stop": {
              "type": "stop",
              "stopwords": "_english_"
            }
          }
        }
      },
      "mappings": {
        "news": {
          "properties": {
            "newsid": {
              "type": "integer"
            },
            "newstype": {
              "type": "text"
            },
            "bodytext": {
              "type": "text"
            },
            "caption": {
              "type": "text"
            },
            "headline": {
              "type": "text"
            },
            "approved": {
              "type": "text"
            },
            "author": {
              "type": "text"
            },
            "contact": {
              "type": "text"
            },
            "datecreated": {
              "type": "date",
              "format": "date_time"
            },
            "datesubmitted": {
              "type": "date",
              "format": "date_time"
            },
            "lastmodifieddate": {
              "type": "date",
              "format": "date_time"
            }
          }
        }
      }
    }

Now, when I perform a query using only stop words such as

'is', 'it', 'the'

on their own, nothing is returned, as expected. However, if I combine a stop word with a non-stop word, documents containing only the stop word are returned along with those containing the non-stop word. For example, if I query for 'is finished', I get back documents containing 'is finished', 'finished', or just 'is'. How do I stop documents that contain only 'is' from being returned?


(David Pilato) #2

Could you provide a full recreation script as described in

It will help to better understand what you are doing.
Please try to keep the example as simple as possible.


(william crawley) #3

I've amended my question to show how the index was created. Using Kibana, it's in fact worse than I thought. In my app I am building up a wildcard search, but as a simple test in Kibana I ran the following, and thousands of hits were returned when I expected zero.

    get newsindex/_search
    {
      "query": {
        "match": {
          "bodytext": "and"
        }
      }
    }

(David Pilato) #4

Could you try with GET and not get?

A full example would help


(william crawley) #5

Hi dadoonet,

I tried with GET and it made no difference. I have also posted this on Stack Overflow. While trying to get to the root of the problem, I introduced search analyzers, with some strange results.
Question on Stack Overflow


(Adrien Grand) #6

The bodytext field does not specify an analyzer, so it uses the default analyzer, which does not remove stop words. Defining the my_stop filter in the index settings is not enough on its own: you need to set an analyzer that removes stop words on the bodytext field.
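
To illustrate, here is a minimal sketch of one way the filter could be wired to the field. The analyzer name `my_stop_analyzer` is a made-up name for illustration, only `bodytext` is shown (the other fields would need the same `analyzer` setting if they should drop stop words too), and the `news` mapping type assumes the same older Elasticsearch version as the original request.

```json
PUT newsindex
{
  "settings": {
    "analysis": {
      "filter": {
        "my_stop": {
          "type": "stop",
          "stopwords": "_english_"
        }
      },
      "analyzer": {
        "my_stop_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "my_stop"]
        }
      }
    }
  },
  "mappings": {
    "news": {
      "properties": {
        "bodytext": {
          "type": "text",
          "analyzer": "my_stop_analyzer"
        }
      }
    }
  }
}

GET newsindex/_analyze
{
  "analyzer": "my_stop_analyzer",
  "text": "is finished"
}
```

The analyzer of an existing field cannot be changed in place, so this would mean creating the index again and reindexing the documents. If the filter is applied, the `_analyze` request should list only the token `finished`.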


(william crawley) #7

Hi jpountz, can you please take a look at my Stack Overflow link? My question there is more comprehensive. I have assigned the analyzer in the mapping of the property. When I do that, even fewer documents are returned than expected.

