Querying my index gives me an unexpected result. I have no explanation. Can anybody help me?
Q: http://localhost:9200/prototype00/user/_search?q=email:doesnotexist@sun.com.py
A: {"took":2,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":0.30617762,"hits":[{"_index":"prototype00","_type":"user","_id":"lC4zFucUSWGmuf_6q9NcXw","_score":0.30617762, "_source" : {"email":"test2@sun.com.py","apps":["prototype"],"isActive":false,"activationId":"c7e059fc-eb44-4b7a-aff3-f7cb2755bdeb"}}]}}
The email field is probably analyzed, i.e. both the indexed data and the query terms are broken up into tokens. test2@sun.com.py is tokenized and indexed as (test2, sun, com, py), and your query term is tokenized as (doesnotexist, sun, com, py). With the default OR operator, a document matches as soon as it shares at least one token with the query. Here the shared tokens are (sun, com, py), so the document matches. Fields containing email addresses should probably be mapped as not_analyzed, since tokenizing email addresses rarely makes sense.
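To make the token overlap concrete, here is a minimal Python sketch. It does not call Elasticsearch: the tokenize function is a hypothetical stand-in that only approximates the splitting described above, and the real analyzer may tokenize differently depending on version and configuration. The mapping at the end is an assumption about what an explicit mapping for the user type could look like on a version that supports not_analyzed string fields.

```python
import json
import re


def tokenize(text):
    """Hypothetical stand-in for an analyzer: lowercase, split on non-alphanumerics."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


def matches_analyzed(query, indexed):
    """With OR semantics, one shared token is enough for a match."""
    return bool(set(tokenize(query)) & set(tokenize(indexed)))


def matches_not_analyzed(query, indexed):
    """A not_analyzed field is stored as a single term, so only the exact value matches."""
    return query == indexed


indexed_email = "test2@sun.com.py"
query_email = "doesnotexist@sun.com.py"

shared = sorted(set(tokenize(query_email)) & set(tokenize(indexed_email)))
print(shared)                                            # ['com', 'py', 'sun']
print(matches_analyzed(query_email, indexed_email))      # True
print(matches_not_analyzed(query_email, indexed_email))  # False

# Assumed explicit mapping for the "user" type (to be applied when creating the index).
mapping = {
    "user": {
        "properties": {
            "email": {"type": "string", "index": "not_analyzed"}
        }
    }
}
print(json.dumps(mapping, indent=2))
```

Note that the mapping of an existing field generally cannot be switched from analyzed to not_analyzed in place, so expect to recreate the index and reindex the documents.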
Thanks, you were right. I thought I could get away without an explicit mapping.