Hi everyone, as mentioned in the title, I'm facing a problem with simple_query_string when WHITESPACE is not specified in flags.
I'm using docker.elastic.co/elasticsearch/elasticsearch:7.4.1.
My mapping is as follows:
PUT test
{
"settings": {
"analysis": {
"analyzer": {
"english": {
"type": "custom",
"char_filter": "html_strip",
"tokenizer": "icu_tokenizer",
"filter": [
"english_possessive_stemmer",
"lowercase",
"icu_folding",
"english_stop",
"english_stemmer"
]
}
},
"filter": {
"english_stop": {
"type": "stop",
"stopwords": "_english_",
"remove_trailing": false
},
"english_stemmer": {
"type": "stemmer",
"language": "english"
},
"english_possessive_stemmer": {
"type": "stemmer",
"language": "possessive_english"
}
}
}
},
"mappings": {
"properties":{
"translations":{
"properties":{
"en": {
"type": "text",
"term_vector": "with_positions_offsets",
"analyzer": "english"
}
}
},
"elastic_translations":{
"properties":{
"en": {
"type": "text",
"analyzer": "english"
}
}
},
"elastic_case_title":{
"properties":{
"en": {
"type": "text",
"analyzer": "english"
}
}
}
}
}
}
PUT test/_doc/1
{
"translations": {
"en": "one two three of four"
},
"elastic_translations":{
"en": "four of five"
},
"elastic_case_title":{
"en": "five of six"
}
}
PUT test/_doc/2
{
"translations": {
"en": "two three of four"
},
"elastic_translations":{
"en": "three of four"
},
"elastic_case_title":{
"en": "five of six"
}
}
PUT test/_doc/3
{
"translations": {
"en": "six of seven"
},
"elastic_translations":{
"en": "three of"
},
"elastic_case_title":{
"en": "eight of nine"
}
}
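As a sanity check on the analyzer, the tokens produced for the query text can be inspected with the _analyze API. This is only a sketch: it assumes the test index above was created successfully (the icu_tokenizer and icu_folding components need the analysis-icu plugin), and the response is not reproduced here. I would expect "of" to be dropped by the english_stop filter.

```json
GET test/_analyze
{
  "analyzer": "english",
  "text": "one two of three"
}
```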
When I run this request (with default_operator left at OR and all flags enabled), it works fine:
GET test/_search
{
"query": {
"bool": {
"should": [
{
"simple_query_string": {
"query": "one two of three",
"fields": [
"elastic_case_title.en^5",
"elastic_translations.en^3",
"translations.en"
]
}
}
]
}
},
"highlight": {
"pre_tags": [
"<mark>"
],
"post_tags": [
"</mark>"
],
"fields" : {
"translations.en": {},
"elastic_case_title.en": {},
"elastic_translations.en": {}
}
}
}
But then I ran into another problem with keywords and the AND operator, so I followed the solution proposed by @jimczi and removed the WHITESPACE flag, as below:
GET test/_search
{
"query": {
"bool": {
"should": [
{
"simple_query_string": {
"query": "one two of three",
"fields": [
"elastic_case_title.en^5",
"elastic_translations.en^3",
"translations.en"
],
"flags": "AND|ESCAPE|NEAR|NOT|OR|PHRASE|PRECEDENCE|PREFIX|SLOP",
"default_operator": "AND"
}
}
]
}
},
"highlight": {
"pre_tags": [
"<mark>"
],
"post_tags": [
"</mark>"
],
"fields" : {
"translations.en": {},
"elastic_case_title.en": {},
"elastic_translations.en": {}
}
}
}
It works fine (I get doc 1 as the result) until I add a minus (-), i.e. a prohibited clause, to the query string, like this:
GET test/_search
{
"query": {
"bool": {
"should": [
{
"simple_query_string": {
"query": "-one two of three",
"fields": [
"elastic_case_title.en^5",
"elastic_translations.en^3",
"translations.en"
],
"flags": "AND|ESCAPE|NEAR|NOT|OR|PHRASE|PRECEDENCE|PREFIX|SLOP",
"default_operator": "AND"
}
}
]
}
},
"highlight": {
"pre_tags": [
"<mark>"
],
"post_tags": [
"</mark>"
],
"fields" : {
"translations.en": {},
"elastic_case_title.en": {},
"elastic_translations.en": {}
}
}
}
or this one (I followed the suggestion in the Elastic documentation and put a (+) before the (-)):
GET test/_search
{
"query": {
"bool": {
"should": [
{
"simple_query_string": {
"query": "+-one two of three",
"fields": [
"elastic_case_title.en^5",
"elastic_translations.en^3",
"translations.en"
],
"flags": "AND|ESCAPE|NEAR|NOT|OR|PHRASE|PRECEDENCE|PREFIX|SLOP",
"default_operator": "AND"
}
}
]
}
},
"highlight": {
"pre_tags": [
"<mark>"
],
"post_tags": [
"</mark>"
],
"fields" : {
"translations.en": {},
"elastic_case_title.en": {},
"elastic_translations.en": {}
}
}
}
Neither query works: both return doc 2 and doc 3 (I expected only doc 2), and I lose my highlighting too.
When I use explain on doc 3 (or doc 2),
GET test/_explain/3
{
"query": {
"bool": {
"should": [
{
"simple_query_string": {
"query": "+-one two of three",
"fields": [
"elastic_case_title.en^5",
"elastic_translations.en^3",
"translations.en"
],
"flags": "AND|ESCAPE|NEAR|NOT|OR|PHRASE|PRECEDENCE|PREFIX|SLOP",
"default_operator": "AND"
}
}
]
}
}
}
it gives me nothing useful:
{
"_index" : "test",
"_type" : "_doc",
"_id" : "3",
"matched" : true,
"explanation" : {
"value" : 1.0,
"description" : "sum of:",
"details" : [
{
"value" : 1.0,
"description" : "*:*",
"details" : [ ]
}
]
}
}
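One more way to see what the query string is actually parsed into might be the validate API with rewrite=true, which returns the Lucene query that would be executed. This is a sketch only: it uses a single field to keep the output short, and I am not reproducing the response here.

```json
GET test/_validate/query?rewrite=true
{
  "query": {
    "simple_query_string": {
      "query": "-one two of three",
      "fields": [
        "translations.en"
      ],
      "flags": "AND|ESCAPE|NEAR|NOT|OR|PHRASE|PRECEDENCE|PREFIX|SLOP",
      "default_operator": "AND"
    }
  }
}
```

Comparing the result with the same request for "one two of three" (no minus) should show whether the negation is applied to the first word only or to the whole string.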
I found the same issue on the Elastic GitHub, but no solution was provided:
Do you have any explanation?
Otherwise, do you have a solution for filtering stop words while using the AND operator?
@jimczi, I'm sorry to bother you, but your explanation in this issue was so clear that I wonder if you could help me this time. Thank you.
Thank you guys.