Simple query string: minus (-) doesn't work if WHITESPACE not specified in flags

Hi everyone. As mentioned in the title, I'm facing a problem with simple_query_string when WHITESPACE is not specified in flags.
I'm using docker.elastic.co/elasticsearch/elasticsearch:7.4.1.
I have a mapping like the one below:

PUT test 
{
  "settings": {
    "analysis": {
      "analyzer": {
        "english": {
          "type": "custom",
          "char_filter": "html_strip",
          "tokenizer": "icu_tokenizer",
          "filter": [
            "english_possessive_stemmer",
            "lowercase",
            "icu_folding",
            "english_stop",
            "english_stemmer"
          ]
        }
      },
      "filter": {
        "english_stop": {
          "type": "stop",
          "stopwords": "_english_",
          "remove_trailing": false
        },
        "english_stemmer": {
          "type": "stemmer",
          "language": "english"
        },
        "english_possessive_stemmer": {
          "type": "stemmer",
          "language": "possessive_english"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "translations": {
        "properties": {
          "en": {
            "type": "text",
            "term_vector": "with_positions_offsets",
            "analyzer": "english"
          }
        }
      },
      "elastic_translations": {
        "properties": {
          "en": {
            "type": "text",
            "analyzer": "english"
          }
        }
      },
      "elastic_case_title": {
        "properties": {
          "en": {
            "type": "text",
            "analyzer": "english"
          }
        }
      }
    }
  }
}

PUT test/_doc/1
{
  "translations": {
    "en": "one two three of four"
  },
  "elastic_translations":{
    "en": "four of five"
  },
  "elastic_case_title":{
    "en": "five of six"
  }
}

PUT test/_doc/2
{
  "translations": {
    "en": "two three of four"
  },
  "elastic_translations":{
    "en": "three of four"
  },
  "elastic_case_title":{
    "en": "five of six"
  }
}

PUT test/_doc/3
{
  "translations": {
    "en": "six of seven"
  },
  "elastic_translations":{
    "en": "three of"
  },
  "elastic_case_title":{
    "en": "eight of nine"
  }
}

When I run this request (with the default operator OR and all flags enabled), it works fine:

GET test/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "simple_query_string": {
            "query": "one two of three",
            "fields": [
              "elastic_case_title.en^5",
              "elastic_translations.en^3",
              "translations.en"
            ]
          }
        }
      ]
    }
  },
  "highlight": {
    "pre_tags": [
      "<mark>"
    ],
    "post_tags": [
      "</mark>"
    ],
    "fields" : {
      "translations.en": {},
      "elastic_case_title.en": {},
      "elastic_translations.en": {}
    }
  }
}

But then I ran into another problem with keywords and the AND operator, so I followed this solution:


As proposed by @jimczi, I removed the WHITESPACE flag, as shown below:
GET test/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "simple_query_string": {
            "query": "one two of three",
            "fields": [
              "elastic_case_title.en^5",
              "elastic_translations.en^3",
              "translations.en"
            ],
            "flags": "AND|ESCAPE|NEAR|NOT|OR|PHRASE|PRECEDENCE|PREFIX|SLOP",
            "default_operator": "AND"
          }
        }
      ]
    }
  },
  "highlight": {
    "pre_tags": [
      "<mark>"
    ],
    "post_tags": [
      "</mark>"
    ],
    "fields" : {
      "translations.en": {},
      "elastic_case_title.en": {},
      "elastic_translations.en": {}
    }
  }
}
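
The requests in this post differ only in the query string, the flags and the default operator, so they can be generated from one template. Below is a minimal Python sketch, assuming the field names and highlight settings used above. The helper name build_search_body is hypothetical, and it only builds the JSON body; it does not send anything to Elasticsearch.

```python
import json

# Every simple_query_string flag used in this post except WHITESPACE.
ALL_BUT_WHITESPACE = "AND|ESCAPE|NEAR|NOT|OR|PHRASE|PRECEDENCE|PREFIX|SLOP"

# Boosted fields exactly as they appear in the requests above.
FIELDS = ["elastic_case_title.en^5", "elastic_translations.en^3", "translations.en"]


def build_search_body(query, flags=None, default_operator=None):
    """Build the _search body for one query variant (hypothetical helper)."""
    sqs = {"query": query, "fields": list(FIELDS)}
    # Leave out optional keys so Elasticsearch applies its own defaults.
    if flags is not None:
        sqs["flags"] = flags
    if default_operator is not None:
        sqs["default_operator"] = default_operator
    return {
        "query": {"bool": {"should": [{"simple_query_string": sqs}]}},
        "highlight": {
            "pre_tags": ["<mark>"],
            "post_tags": ["</mark>"],
            # Highlight field names must not carry the ^boost suffix.
            "fields": {name.split("^")[0]: {} for name in FIELDS},
        },
    }


if __name__ == "__main__":
    body = build_search_body("-one two of three", ALL_BUT_WHITESPACE, "AND")
    print(json.dumps(body, indent=2))
```

The printed body can be pasted after GET test/_search to reproduce each case.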

It works fine (I get doc 1 as the result) until I add a minus (-), i.e. a prohibited clause, to the query string, like this:

GET test/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "simple_query_string": {
            "query": "-one two of three",
            "fields": [
              "elastic_case_title.en^5",
              "elastic_translations.en^3",
              "translations.en"
            ],
            "flags": "AND|ESCAPE|NEAR|NOT|OR|PHRASE|PRECEDENCE|PREFIX|SLOP",
            "default_operator": "AND"
          }
        }
      ]
    }
  },
  "highlight": {
    "pre_tags": [
      "<mark>"
    ],
    "post_tags": [
      "</mark>"
    ],
    "fields" : {
      "translations.en": {},
      "elastic_case_title.en": {},
      "elastic_translations.en": {}
    }
  }
}

Or this one (following the suggestion in the Elastic documentation, I put a (+) before the (-)):

GET test/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "simple_query_string": {
            "query": "+-one two of three",
            "fields": [
              "elastic_case_title.en^5",
              "elastic_translations.en^3",
              "translations.en"
            ],
            "flags": "AND|ESCAPE|NEAR|NOT|OR|PHRASE|PRECEDENCE|PREFIX|SLOP",
            "default_operator": "AND"
          }
        }
      ]
    }
  },
  "highlight": {
    "pre_tags": [
      "<mark>"
    ],
    "post_tags": [
      "</mark>"
    ],
    "fields" : {
      "translations.en": {},
      "elastic_case_title.en": {},
      "elastic_translations.en": {}
    }
  }
}

Neither query works: they return doc 2 and doc 3 (I expected only doc 2), and I lose my highlights too.
When I use explain (for doc 3 or doc 2),

GET test/_explain/3
{
  "query": {
    "bool": {
      "should": [
        {
          "simple_query_string": {
            "query": "+-one two of three",
            "fields": [
              "elastic_case_title.en^5",
              "elastic_translations.en^3",
              "translations.en"
            ],
            "flags": "AND|ESCAPE|NEAR|NOT|OR|PHRASE|PRECEDENCE|PREFIX|SLOP",
            "default_operator": "AND"
          }
        }
      ]
    }
  }
}

it tells me nothing useful:

{
  "_index" : "test",
  "_type" : "_doc",
  "_id" : "3",
  "matched" : true,
  "explanation" : {
    "value" : 1.0,
    "description" : "sum of:",
    "details" : [
      {
        "value" : 1.0,
        "description" : "*:*",
        "details" : [ ]
      }
    ]
  }
}
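
A possible explanation, offered as an unverified reading of the output above: with WHITESPACE removed from flags, spaces no longer separate terms, so the parser may treat everything after the minus as a single negated token. A purely negative query would then be paired with a match-all clause, which is consistent with the *:* in the explanation, with docs 2 and 3 both matching, and with the missing highlights. The Python sketch below is a deliberately simplified model of that tokenization, not Lucene's actual code. The function name split_tokens is hypothetical, and the model ignores phrases, precedence, escaping, prefix and slop.

```python
WHITESPACE_CHARS = " \t\n\r"


def split_tokens(query, whitespace_flag):
    """Simplified model: return (negated, raw_token) pairs for a query string.

    Assumption: '-' at the start of a token negates it, '+' and '|' always
    end a token, and whitespace ends a token only when whitespace_flag is True.
    """
    tokens = []
    i = 0
    negations = 0
    while i < len(query):
        ch = query[i]
        if ch == "-":
            # A leading minus marks the next token as prohibited.
            negations += 1
            i += 1
        elif ch in "+|":
            # Operators between tokens are skipped in this model.
            i += 1
        elif ch in WHITESPACE_CHARS and whitespace_flag:
            # Whitespace is a separator only when the flag is enabled.
            i += 1
        else:
            start = i
            while i < len(query):
                c = query[i]
                if c in "+|" or (c in WHITESPACE_CHARS and whitespace_flag):
                    break
                i += 1
            tokens.append((negations % 2 == 1, query[start:i]))
            negations = 0
    return tokens


if __name__ == "__main__":
    for flag in (True, False):
        print(flag, split_tokens("-one two of three", flag))
```

Under this model, "-one two of three" with WHITESPACE enabled yields a negated "one" followed by three ordinary tokens, while without it the whole text "one two of three" becomes a single negated token. The same happens for "+-one two of three".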

I found the same issue on the Elastic GitHub, but no solution was provided:


Do you have any explanation?
Otherwise, do you have any solution for filtering stop words while using the AND operator?
@jimczi, I'm sorry to bother you, but your explanation in that issue was so clear that I wonder if you could help me this time.
Thank you all.
