Query with synonym doesn't work as expected

Hello,

Here is some sample data:

settings and mappings:
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonym_filter": {
          "type": "synonym",
          "synonyms": [ "synonym0 => owl shirt" ]
        }
      },
      "analyzer": {
        "test_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [ "my_synonym_filter" ]
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "id": { "type": "string", "index": "not_analyzed" },
        "field0": { "type": "string", "index": "analyzed" },
        "field1": { "type": "string", "index": "analyzed" },
        "field2": { "type": "string", "index": "analyzed" }
      }
    }
  }
}

data:
{"index":{"_type":"test","_id":"0"}}
{"field0":"product 0","field1":"","field2":"owl,shirt"}
{"index":{"_type":"test","_id":"1"}}
{"field0":"product 1","field1":"shirt","field2":"owl"}
{"index":{"_type":"test","_id":"2"}}
{"field0":"product 2","field1":"shirt","field2":"horse"}
{"index":{"_type":"test","_id":"3"}}
{"field0":"product 3","field1":"longsleeve","field2":"penguin"}

When I search like this:

POST /syntest/test/_search?pretty
{
  "query": {
    "query_string": {
      "query": "shirt owl",
      "default_operator": "AND",
      "analyzer": "test_analyzer",
      "fields": [ "field1", "field2" ]
    }
  }
}

As expected, I get all documents that contain both shirt and owl across field1 and field2:

{
  "took": 7,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": {
    "total": 2,
    "max_score": 0.13316044,
    "hits": [
      {
        "_index": "syntest",
        "_type": "test",
        "_id": "1",
        "_score": 0.13316044,
        "_source": { "field0": "product 1", "field1": "shirt", "field2": "owl" }
      },
      {
        "_index": "syntest",
        "_type": "test",
        "_id": "0",
        "_score": 0.08322528,
        "_source": { "field0": "product 0", "field1": "", "field2": "owl,shirt" }
      }
    ]
  }
}

But when I search with the defined synonym synonym0, I get just one result:

POST /syntest/test/_search?pretty
{
  "query": {
    "query_string": {
      "query": "synonym0",
      "default_operator": "AND",
      "analyzer": "test_analyzer",
      "fields": [ "field1", "field2" ]
    }
  }
}

{
  "took": 6,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": {
    "total": 1,
    "max_score": 0.08322528,
    "hits": [
      {
        "_index": "syntest",
        "_type": "test",
        "_id": "0",
        "_score": 0.08322528,
        "_source": { "field0": "product 0", "field1": "", "field2": "owl,shirt" }
      }
    ]
  }
}

What is the problem?

Tested with Elasticsearch 2.2.

Thanks

Check out the _explain API to debug search issues.
Here I use it to test why doc id #1 fails:

POST /test/test/1/_explain
{
   "query": {
	  "query_string": {
		 "query": "synonym0",
		 "default_operator": "AND",
		 "analyzer": "test_analyzer",
		 "fields": [
			"field1",
			"field2"
		 ]
	  }
   }
}

The result (partially shown here) shows why this fails for doc 1:

{
   "_index": "test",
   "_type": "test",
   "_id": "1",
   "matched": false,
   "explanation": {
	  "value": 0,
	  "description": "Failure to meet condition(s) of required/prohibited clause(s)",
	  "details": [
		 {
			"value": 0,
			"description": "no match on required clause (((+field1:owl +field1:shirt) | (+field2:owl +field2:shirt)))",
			...
		 }...
	  ]
   }
}

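The synonym expands to two terms, and the query requires both of them to appear in the same field. Doc 1 has "shirt" in field1 and "owl" in field2, so neither field satisfies the clause on its own. Instead of query_string, consider using the multi_match query, which offers more control over how terms are matched across multiple fields (cross_fields, best_fields, etc.).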
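To make the difference concrete, here is a small Python sketch that mimics the two matching strategies on the sample documents from this thread. It is only a simplified model inferred from the explain output above and the search results in the question, not how Lucene actually evaluates queries. The helpers `tokenize`, `analyze`, `query_string_and` and `cross_fields_and`, as well as the `SYNONYMS` table, are hypothetical names made up for illustration.

```python
import re

# Sample documents from the thread (only the two searched fields).
DOCS = {
    "0": {"field1": "", "field2": "owl,shirt"},
    "1": {"field1": "shirt", "field2": "owl"},
    "2": {"field1": "shirt", "field2": "horse"},
    "3": {"field1": "longsleeve", "field2": "penguin"},
}

# Hypothetical stand-in for the synonym filter "synonym0 => owl shirt".
SYNONYMS = {"synonym0": ["owl", "shirt"]}


def tokenize(text):
    """Rough approximation of the standard analyzer: lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def analyze(text):
    """Tokenize and expand synonyms, roughly what test_analyzer does."""
    terms = []
    for token in tokenize(text):
        terms.extend(SYNONYMS.get(token, [token]))
    return terms


def query_string_and(query, fields):
    """Assumed model of query_string with AND: the query is split on
    whitespace first; each chunk is analyzed per field, and all terms
    produced from one chunk must be found in the same field."""
    hits = []
    for doc_id, doc in DOCS.items():
        if all(
            any(all(t in tokenize(doc[f]) for t in analyze(chunk)) for f in fields)
            for chunk in query.split()
        ):
            hits.append(doc_id)
    return sorted(hits)


def cross_fields_and(query, fields):
    """Assumed model of cross_fields with AND: every analyzed term must be
    found in at least one of the fields."""
    terms = analyze(query)
    hits = []
    for doc_id, doc in DOCS.items():
        if all(any(t in tokenize(doc[f]) for f in fields) for t in terms):
            hits.append(doc_id)
    return sorted(hits)


if __name__ == "__main__":
    fields = ["field1", "field2"]
    print(query_string_and("shirt owl", fields))
    print(query_string_and("synonym0", fields))
    print(cross_fields_and("synonym0", fields))
```

In this model "shirt owl" is two chunks that may each match a different field, while "synonym0" is a single chunk whose two expanded terms must share a field, which is consistent with the one-hit result reported in the question.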

Cheers
Mark

Hello Mark,

Thanks for the reply.

I have now tested with multi_match + cross_fields:

POST /syntest/test/_search
{
  "query": {
    "multi_match": {
      "query": "owl shirt",
      "type": "cross_fields",
      "fields": [ "field0", "field1", "field2" ],
      "operator": "AND"
    }
  }
}

This works as expected:

{
  "took": 12,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": {
    "total": 2,
    "max_score": 0.8838835,
    "hits": [
      {
        "_index": "syntest",
        "_type": "test",
        "_id": "0",
        "_score": 0.8838835,
        "_source": { "field0": "product 0", "field1": "", "field2": "owl,shirt" }
      },
      {
        "_index": "syntest",
        "_type": "test",
        "_id": "1",
        "_score": 0.13316044,
        "_source": { "field0": "product 1", "field1": "shirt", "field2": "owl" }
      }
    ]
  }
}

But with the synonym it returns no hits:

POST /syntest/test/_search
{
  "query": {
    "multi_match": {
      "query": "synonym0",
      "type": "cross_fields",
      "fields": [ "field0", "field1", "field2" ],
      "operator": "AND"
    }
  }
}

{
  "took": 4,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

The explain output:

...
"explanation": {
  "value": 0,
  "description": "Failure to meet condition(s) of required/prohibited clause(s)",
  "details": [
    {
      "value": 0,
      "description": "no match on required clause ((field2:synonym0 | field1:synonym0 | field0:synonym0))",
      ...

Any idea?

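It looks like the wrong analyzer is being used: the clause searches for the literal term synonym0, so the synonym was never expanded. Given that all the fields share a common analyzer, I'm not sure why this would be the case. Try setting the "analyzer" parameter on the multi_match query to force the choice of analyzer.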
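For reference, here is a sketch of the same request body with the analyzer forced, built in Python so it can be printed or sent with any HTTP client. The `build_multi_match` helper is a hypothetical name; the field names, the `test_analyzer` name and the other parameters are taken from the requests earlier in this thread.

```python
import json


def build_multi_match(query, fields, analyzer):
    """Build a cross_fields multi_match body with an explicit search analyzer.

    Hypothetical helper for illustration; it only assembles the JSON body.
    """
    return {
        "query": {
            "multi_match": {
                "query": query,
                "type": "cross_fields",
                "analyzer": analyzer,  # forces synonym expansion at search time
                "fields": fields,
                "operator": "AND",
            }
        }
    }


body = build_multi_match(
    "synonym0", ["field0", "field1", "field2"], "test_analyzer"
)

if __name__ == "__main__":
    # Body for POST /syntest/test/_search
    print(json.dumps(body, indent=2))
```

The only change from the failing request is the added "analyzer" entry.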

Oh dear!
That's my fault.

I forgot to set the analyzer parameter.

Thank you

Hi,

Synonym search works sometimes and fails at other times, and I'm not able to find the root cause.
I'm using the synonym filter in the search analyzer. The scores of the synonym-related records are now different from what they were earlier.

Can someone help me understand exactly what the issue is, as soon as possible?

Thanks,
@vaniaravinda