Issue Querying on Specified Fields


(Mark Lilien) #1

I'm having an issue with a very specific set of circumstances that isn't making a whole lot of sense to me. I've done some digging and couldn't find anything, but if I'm missing something obvious, I do apologize.

I have a document type with the following mapping:

{
  "customer": {
    "dynamic": "strict",
    "properties": {
      "archived": {
        "type": "boolean"
      },
      "blocked": {
        "type": "boolean"
      },
      "custom_field_values": {
        "type": "string",
        "analyzer": "snowball"
      },
      "display_phone_number": {
        "type": "string",
        "index": "not_analyzed"
      },
      "has_phone_number": {
        "type": "boolean"
      },
      "name": {
        "type": "string",
        "analyzer": "snowball"
      },
      "notes": {
        "type": "string",
        "analyzer": "snowball"
      },
      "organization_id": {
        "type": "string",
        "index": "not_analyzed"
      },
      "phone_number": {
        "type": "string",
        "analyzer": "snowball"
      },
      "sort_name": {
        "type": "string",
        "index": "not_analyzed"
      },
      "tag_ids": {
        "type": "string",
        "index": "not_analyzed"
      },
      "uuid": {
        "type": "string",
        "index": "not_analyzed"
      }
    }
  }
}

And a document that looks like this:

{
        "uuid" : "b641dab0-aa38-409f-ab7d-2003cc159f41",
        "sort_name" : "wrong # - will rubio-mendoza",
        "display_phone_number" : "(222) 222-2222",
        "organization_id" : 123,
        "name" : "WRONG # - Will Rubio-Mendoza",
        "notes" : "",
        "phone_number" : [ "12222222222", "2222222222" ],
        "tag_ids" : [ 4513, 4512, 4514 ],
        "custom_field_values" : [ ],
        "blocked" : false,
        "archived" : false,
        "has_phone_number" : true
}

If I perform the search below, I get 0 results:

{"query":{"query_string":{"query":"name:will"}}}

But if I remove the field name from the query, I get results for the document I listed above as well as several other "wills" in the system.

{"query":{"query_string":{"query":"will"}}}

This only seems to be an issue with the name "will." The search below returns the document listed above and other "rubios" in the system.

{"query":{"query_string":{"query":"name:rubio"}}}

Is there something I'm missing here?

Thanks in advance


(Abdon Pijpelink) #2

You are using the snowball analyzer on the name field. This analyzer uses the stop token filter which removes common "stop words". This filter defaults to English stop words, and will is one of those stop words. As a result, the word will is not indexed for the name field, and you cannot find that word when you query that field.

When you omit the field in the query, you are actually querying a metadata field called _all. This field contains the values of all fields concatenated into one big string. This _all field defaults to having the standard analyzer, which does not remove stop words. As a result, will is indexed for the _all field and you can find that word when you omit the field name.

In contrast, rubio is not a stop word, and can be found by querying the name field.
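
To make the difference concrete, here is a rough Python sketch of the two analysis paths. It is only an approximation, not how Elasticsearch or Lucene actually implements them: the regex tokenizer is a stand-in for the standard tokenizer, the stemming step of the snowball analyzer is left out, and the function names are made up for illustration. The stop word list is Lucene's default English set.

```python
import re

# Lucene's default English stop word set (used by the stop token filter).
ENGLISH_STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
    "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that",
    "the", "their", "then", "there", "these", "they", "this", "to", "was",
    "will", "with",
}


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (rough stand-in for the standard tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def analyze_name(text):
    """Hypothetical approximation of the name field: tokenize, then drop stop words.

    The stemming step of the snowball analyzer is deliberately omitted.
    """
    return [token for token in tokenize(text) if token not in ENGLISH_STOP_WORDS]


def analyze_all(values):
    """Hypothetical approximation of _all: concatenate the values, tokenize, keep stop words."""
    return tokenize(" ".join(values))


name = "WRONG # - Will Rubio-Mendoza"
name_terms = analyze_name(name)
all_terms = analyze_all([name, "wrong # - will rubio-mendoza", "(222) 222-2222"])

print(name_terms)             # ['wrong', 'rubio', 'mendoza']
print("will" in name_terms)   # False: name:will has nothing to match
print("rubio" in name_terms)  # True: name:rubio matches
print("will" in all_terms)    # True: an unqualified query for will matches via _all
```

In this sketch "will" never makes it into the terms for the name field, which is why name:will finds nothing, while the same word survives in the combined _all terms.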

