Unexpected behavior when using nested query and score_mode = max

Hey,

I'm doing some discovery around the nested query feature that Elasticsearch
supports, and I need to use the max score_mode.
I expected to see document id 100 before 101, since id 100 has the full
phrase "Porsche Engine V6", but for some reason document id 101 appears
first in the search results.
Is there any explanation for that?

Thanks,
Itay

Query:
{
  "query": {
    "nested": {
      "path": "Relations",
      "score_mode": "max",
      "query": {
        "query_string": {
          "fields": [
            "Relations.Title"
          ],
          "query": "Engine V6 porsche"
        }
      }
    }
  }
}

Data Set:
id 100
{
  "Title": "2007 Porsche 911",
  "Relations": [
    {
      "Title": "Porsche Engine V6"
    },
    {
      "Title": "V6"
    },
    {
      "Title": "BMW v3"
    }
  ]
}

id 101
{
  "Title": "2005 Porsche 911",
  "Relations": [
    {
      "Title": "Best Engine V6"
    },
    {
      "Title": "Good condition Porsche Engine V6"
    },
    {
      "Title": "V6 porsche"
    }
  ]
}

Result:
{
  "took": 16,
  "timed_out": false,
  "_shards": {
    "total": 10,
    "successful": 10,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 2.197319,
    "hits": [
      {
        "_index": "searchtest_es",
        "_type": "test",
        "_id": "101",
        "_score": 2.197319,
        "_source": {
          "Title": "2005 Porsche 911",
          "Relations": [
            {
              "Title": "Best Engine V6"
            },
            {
              "Title": "Good condition Porsche Engine V6"
            },
            {
              "Title": "V6 porsche"
            }
          ]
        }
      },
      {
        "_index": "searchtest_es",
        "_type": "test",
        "_id": "100",
        "_score": 2.149176,
        "_source": {
          "Title": "2007 Porsche 911",
          "Relations": [
            {
              "Title": "Porsche Engine V6"
            },
            {
              "Title": "V6"
            },
            {
              "Title": "BMW v3"
            }
          ]
        }
      }
    ]
  }
}

--

I also tried making the Relations of both documents exactly the same, and
the scores still differ.
I really don't understand what's wrong (note: I did flush the index before
querying).

id 100
{
  "Title": "2007 Porsche 911",
  "Relations": [
    {
      "Title": "Porsche Engine V6"
    },
    {
      "Title": "Best Engine V6"
    },
    {
      "Title": "V6 porsche"
    }
  ]
}

id 101
{
  "Title": "2005 Porsche 911",
  "Relations": [
    {
      "Title": "Porsche Engine V6"
    },
    {
      "Title": "Best Engine V6"
    },
    {
      "Title": "V6 porsche"
    }
  ]
}

Result:
{
  "took": 16,
  "timed_out": false,
  "_shards": {
    "total": 10,
    "successful": 10,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 2.511222,
    "hits": [
      {
        "_index": "searchtest_es",
        "_type": "test",
        "_id": "101",
        "_score": 2.511222,
        "_source": {
          "Title": "2005 Porsche 911",
          "Relations": [
            {
              "Title": "Porsche Engine V6"
            },
            {
              "Title": "Best Engine V6"
            },
            {
              "Title": "V6 porsche"
            }
          ]
        }
      },
      {
        "_index": "searchtest_es",
        "_type": "test",
        "_id": "100",
        "_score": 1.8294235,
        "_source": {
          "Title": "2007 Porsche 911",
          "Relations": [
            {
              "Title": "Porsche Engine V6"
            },
            {
              "Title": "Best Engine V6"
            },
            {
              "Title": "V6 porsche"
            }
          ]
        }
      }
    ]
  }
}

--

You can activate the explain option: http://www.elasticsearch.org/guide/reference/api/search/explain.html
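
As a minimal sketch, explain can be switched on by adding "explain": true to the search request body. The Python below only builds that body, reusing the query from the first post; how the request is sent to the cluster is left out, and the variable names are just for illustration.

```python
import json

# Search body from the first post, with the explain flag added.
body = {
    "explain": True,  # ask for a per-hit score explanation
    "query": {
        "nested": {
            "path": "Relations",
            "score_mode": "max",
            "query": {
                "query_string": {
                    "fields": ["Relations.Title"],
                    "query": "Engine V6 porsche",
                }
            },
        }
    },
}

# Serialize to the JSON that would be posted to the _search endpoint.
payload = json.dumps(body, indent=2)
print(payload)
```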

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

--

I have tried that already, but the request fails when I use explain:

{"took":15,"timed_out":false,"_shards":{"total":10,"successful":8,"failed":2,"failures":[
{"status":500,"reason":"RemoteTransportException[[Blue Diamond][inet[/10.109.76.76:9300]][search/phase/fetch/id]]; nested: UnsupportedOperationException[org.elasticsearch.index.search.nested.BlockJoinQuery$BlockJoinWeight cannot explain match on parent document]; "},
{"status":500,"reason":"RemoteTransportException[[Blue Diamond][inet[/10.109.76.76:9300]][search/phase/fetch/id]]; nested: UnsupportedOperationException[org.elasticsearch.index.search.nested.BlockJoinQuery$BlockJoinWeight cannot explain match on parent document]; "}
]},"hits":{"total":2,"max_score":2.511222,"hits":}}

--

Is the Relations field of type nested in your mapping?
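
For reference, a nested mapping for this data might look like the sketch below, written as a Python dict that serializes to the JSON mapping body. The type name test comes from the results above; the string type for Title is an assumption about the actual mapping.

```python
import json

# Hypothetical mapping for the "test" type: Relations is declared as
# nested so that each entry is indexed as its own hidden document.
mapping = {
    "test": {
        "properties": {
            "Title": {"type": "string"},
            "Relations": {
                "type": "nested",
                "properties": {
                    "Title": {"type": "string"}  # assumed field type
                },
            },
        }
    }
}

mapping_json = json.dumps(mapping, indent=2)
print(mapping_json)
```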

Also, can you try running the query with the search_type option set to
dfs_query_then_fetch in the request's query string? When an index with
5 primary shards holds only a few documents, the scores are off,
because each shard scores with its own local document frequencies.
Using this option makes sure the score is based on the overall
document frequency across all shards. Note: in production the data is
usually evenly distributed among the primary shards, so the default
search type is fine.
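
To illustrate why per-shard statistics can skew scores, here is a minimal Python sketch. It assumes Lucene's classic TF-IDF formula, idf = 1 + ln(numDocs / (docFreq + 1)); the shard names and counts are made up for illustration and are not taken from the index in this thread.

```python
import math


def classic_idf(doc_freq, num_docs):
    """Inverse document frequency as in Lucene's classic TF-IDF similarity."""
    return 1.0 + math.log(num_docs / (doc_freq + 1.0))


# Hypothetical per-shard statistics for one term, e.g. "porsche".
shards = {
    "shard_a": {"num_docs": 1, "doc_freq": 1},
    "shard_b": {"num_docs": 3, "doc_freq": 1},
}

# Default search type: every shard uses only its own local counts.
local_idf = {
    name: classic_idf(stats["doc_freq"], stats["num_docs"])
    for name, stats in shards.items()
}

# DFS variant: counts are summed over all shards before scoring.
total_docs = sum(stats["num_docs"] for stats in shards.values())
total_df = sum(stats["doc_freq"] for stats in shards.values())
global_idf = classic_idf(total_df, total_docs)

for name, value in local_idf.items():
    print(f"{name}: local idf = {value:.4f}")
print(f"all shards: global idf = {global_idf:.4f}")
```

With local counts, the same term gets a different weight on each shard, so identical nested entries can score differently depending on where they live. With search_type=dfs_query_then_fetch (for example /searchtest_es/_search?search_type=dfs_query_then_fetch), every shard would score with the same global value.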

Explain for the nested query isn't implemented in the currently
released versions of ES. It is implemented in the master branch, so
it should work in the next major version. Each query has its own
explain function, and not all queries implement it yet. The explain
output also varies per query implementation.

Martijn

--
Kind regards,

Martijn van Groningen

--