Match query giving unexpected results

Hi,
I have an index containing the following two addresses, indexed with the standard analyzer (the default):

  1. 132 BURNT ASH ROAD LONDON LEWISHAM SE12 8PU
  2. 132-134 BURNT ASH ROAD LONDON LEWISHAM SE12 8PU

I am using the following query to fetch the data:

{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "fullAddress": { "query": "132 Burnt Ash Road LONDON  SE12 8PU" }
          }
        }
      ]
    }
  }
}

I expected record 1 to be returned first, but record 2 comes first because it has a higher score. Any idea what I am doing wrong here?

Thanks,
Ashish

Have a look at this blog post if you want to understand more about how the score can be influenced by the number of shards in an index. Scores are calculated at the individual shard level, using each shard's own term statistics, and the final sort merges the highest-scoring results from each shard. With small data sets in particular, this can lead to unexpected results.

Can you re-index your data set into an index with just one primary shard?
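
To make the shard effect concrete, here is a minimal Python sketch of the inverse document frequency (IDF) part of a BM25 score, assuming the index uses the default BM25 similarity. The shard sizes and term counts are made up for illustration and are not taken from your index. It only shows that the same term can carry a different weight on each shard.

```python
import math


def bm25_idf(doc_count: int, doc_freq: int) -> float:
    """IDF part of a BM25 score for one term, using one shard's statistics.

    doc_count: number of documents on the shard that have the field
    doc_freq:  number of those documents that contain the term
    """
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical per-shard statistics for a single query term (assumed values):
# the term is common on shard A and rare on shard B.
shard_a = {"doc_count": 1000, "doc_freq": 400}
shard_b = {"doc_count": 1000, "doc_freq": 5}

idf_a = bm25_idf(**shard_a)
idf_b = bm25_idf(**shard_b)

# The rarer the term is on a shard, the more it adds to the score there,
# so a document on shard B gets a larger boost from this term.
print(f"shard A idf: {idf_a:.3f}")
print(f"shard B idf: {idf_b:.3f}")
```

With a single primary shard both documents are scored against the same statistics, so this source of difference disappears.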


While an index still holds only a few documents, relevance scores can depend quite strongly on how those documents are distributed across the shards.

Try this mapping:

PUT address
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  },
  "mappings": {
    "_doc": {
      "properties": {
        "fullAddress": {
          "type": "text"
        }
      }
    }
  }
}
