Intermittent scoring returned


(cyrilforce) #1

Hi all,

I have encountered some weird behaviour. I am executing the same query twice
and it returns a different score for the same document.
This affects the order of the results returned, as I need them sorted
by relevancy.

*The query : *
{
"from" : 0,
"size" : 100,
"explain" : true,
"query" : {
"filtered" : {
"query" : {
"multi_match": {
"query": "happy",
"fields": [ "DISPLAY_NAME^6" ]
}
},
"filter" : {
"query" : {
"bool" : {
"must" : {
"term" : {
"CHANNEL_ID" : "1"
}
}
}
}
}
}
}
}

*Result on first run : *
"_shard": 2,
"_node": "h3g1u4jzT1KdJXDQF06qyA",
"_index": "jdbc_dev",
"_type": "media",
"_id": "15939",
"_score": 10.882705,
"_source": {
"DISPLAY_NAME": "Happy",
"PRICE": 1.5,
"AUDIO": "http://musicube.thecube.my/base/preview/054/540152/540152-MP3-CMT-P.mp3",
"CHANNEL_ID": 1,
"CAT_PARENT": 557,
"MEDIA_ID": 15939,
"GENRE": "Happy",
"MEDIA_PKEY": "540152",
"COMPOSER": null,
"PLAYER": null,
"CATMEDIA_NAME": "Happy",
"FTID": null,
"VIEW_ID": 241,
"POSITION": 6162,
"ITEMCODE": "147237",
"CAT_ID": 558,
"PRIORITY": 100,
"CKEY": 533868,
"CATMEDIA_RANK": 3,
"BILLINGTYPE_ID": 1,
"CAT_NAME": "POP",
"KEYWORDS": null,
"LONG_DESCRIPTION": null,
"SHORT_DESCRIPTION": null,
"TYPE_ID": 76,
"ARTIST_GENDER": null,
"PERFORMER": "Diandra Arjunaidi",
"MAPPINGS": "1_241_558_POP_557_6162_1.5",
"SHORTCODE": "0401119513",
"CATMEDIA_CDATE": "2014-01-26T20:04:11.000Z",
"LANG_ID": 1
},
"_explanation": {
"value": 10.882705,
"description": "weight(DISPLAY_NAME:happy in 853) [PerFieldSimilarity], result of:",
"details": [
{
"value": 10.882705,
"description": "fieldWeight in 853, product of:",
"details": [
{
"value": 1,
"description": "tf(freq=1.0), with freq of:",
"details": [
{
"value": 1,
"description": "termFreq=1.0"
}
]
},
{
"value": 10.882705,
"description": "idf(docFreq=63, maxDocs=1253673)"
},
{
"value": 1,
"description": "fieldNorm(doc=853)"
}
]
}
]
}

*Result on second run : *
"_shard": 2,
"_node": "UOjX2lxhR6mzfjHHmTm3cQ",
"_index": "jdbc_dev",
"_type": "media",
"_id": "15939",
"_score": 10.388683,
"_source": {
"DISPLAY_NAME": "Happy",
"PRICE": 1.5,
"AUDIO": "http://musicube.thecube.my/base/preview/054/540152/540152-MP3-CMT-P.mp3",
"CHANNEL_ID": 1,
"CAT_PARENT": 557,
"MEDIA_ID": 15939,
"GENRE": "Happy",
"MEDIA_PKEY": "540152",
"COMPOSER": null,
"PLAYER": null,
"CATMEDIA_NAME": "Happy",
"FTID": null,
"VIEW_ID": 241,
"POSITION": 6162,
"ITEMCODE": "147237",
"CAT_ID": 558,
"PRIORITY": 100,
"CKEY": 533868,
"CATMEDIA_RANK": 3,
"BILLINGTYPE_ID": 1,
"CAT_NAME": "POP",
"KEYWORDS": null,
"LONG_DESCRIPTION": null,
"SHORT_DESCRIPTION": null,
"TYPE_ID": 76,
"ARTIST_GENDER": null,
"PERFORMER": "Diandra Arjunaidi",
"MAPPINGS": "1_241_558_POP_557_6162_1.5",
"SHORTCODE": "0401119513",
"CATMEDIA_CDATE": "2014-01-26T20:04:11.000Z",
"LANG_ID": 1
},
"_explanation": {
"value": 10.388683,
"description": "weight(DISPLAY_NAME:happy in 12392) [PerFieldSimilarity], result of:",
"details": [
{
"value": 10.388683,
"description": "fieldWeight in 12392, product of:",
"details": [
{
"value": 1,
"description": "tf(freq=1.0), with freq of:",
"details": [
{
"value": 1,
"description": "termFreq=1.0"
}
]
},
{
"value": 10.388683,
"description": "idf(docFreq=107, maxDocs=1290854)"
},
{
"value": 1,
"description": "fieldNorm(doc=12392)"
}
]
}
]
}

The mapping :
{
"media": {
"properties": {
"AUDIO": {
"type": "string"
},
"BILLINGTYPE_ID": {
"type": "long"
},
"CATMEDIA_CDATE": {
"type": "date",
"format": "dateOptionalTime"
},
"CATMEDIA_NAME": {
"type": "string"
},
"CATMEDIA_RANK": {
"type": "long"
},
"CAT_ID": {
"type": "long"
},
"CAT_NAME": {
"type": "string",
"analyzer": "string_lowercase",
"include_in_all": true
},
"CAT_PARENT": {
"type": "long"
},
"CHANNEL_ID": {
"type": "long"
},
"CKEY": {
"type": "long"
},
"DISPLAY_NAME": {
"type": "string"
},
"FTID": {
"type": "string"
},
"GENRE": {
"type": "string"
},
"ITEMCODE": {
"type": "string"
},
"KEYWORDS": {
"type": "string"
},
"LANG_ID": {
"type": "long"
},
"LONG_DESCRIPTION": {
"type": "string"
},
"MAPPINGS": {
"type": "string",
"analyzer": "string_lowercase",
"include_in_all": true
},
"MEDIA_ID": {
"type": "long"
},
"MEDIA_PKEY": {
"type": "string"
},
"PERFORMER": {
"type": "string"
},
"PLAYER": {
"type": "string"
},
"POSITION": {
"type": "long"
},
"PRICE": {
"type": "double"
},
"PRIORITY": {
"type": "long"
},
"SHORTCODE": {
"type": "string"
},
"SHORT_DESCRIPTION": {
"type": "string"
},
"TYPE_ID": {
"type": "long"
},
"VIEW_ID": {
"type": "long"
}
}
}
}

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/3507a1ac-3451-44f0-afd5-f16db6865fb1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(cyrilforce) #2

It seems like the first query is searching the primary shard while the
second query is searching the secondary/replica shard.

*result from 1st query : *
"_shard": 1,
"_node": "VYQt633MTUuJdAwL--PE3A",
"_index": "jdbc_dev",
"_type": "media",
"_id": "380473",
"_source": {
"DISPLAY_NAME": "Be Happy",
.......
"_explanation": {
"value": 10.873312,
"description": "weight(DISPLAY_NAME:happy in 1517) [PerFieldSimilarity], result of:",
"details": [
{
"value": 10.873312,
"description": "fieldWeight in 1517, product of:",
"details": [
{
"value": 1,
"description": "tf(freq=1.0), with freq of:",
"details": [
{
"value": 1,
"description": "termFreq=1.0"
}
]
},
{
"value": 10.873312,
"description": "idf(docFreq=63, maxDocs=1241952)"
},
{
"value": 1,
"description": "fieldNorm(doc=1517)"
}
]
}
]
}

"_shard": 2,
"_node": "UOjX2lxhR6mzfjHHmTm3cQ",
"_index": "jdbc_dev",
"_type": "media",
"_id": "15939",
"_score": 10.388683,
"_source": {
"DISPLAY_NAME": "Happy",
.....
"_explanation": {
"value": 10.388683,
"description": "weight(DISPLAY_NAME:happy in 12392) [PerFieldSimilarity], result of:",
"details": [
{
"value": 10.388683,
"description": "fieldWeight in 12392, product of:",
"details": [
{
"value": 1,
"description": "tf(freq=1.0), with freq of:",
"details": [
{
"value": 1,
"description": "termFreq=1.0"
}
]
},
{
"value": 10.388683,
"description": "idf(docFreq=107, maxDocs=1290854)"
},
{
"value": 1,
"description": "fieldNorm(doc=12392)"
}
]
}
]
}


*result from second query : *
"_shard": 2,
"_node": "h3g1u4jzT1KdJXDQF06qyA",
"_index": "jdbc_dev",
"_type": "media",
"_id": "15939",
"_score": 10.882705,
"_source": {
"DISPLAY_NAME": "Happy",
.........
"_explanation": {
"value": 10.882705,
"description": "weight(DISPLAY_NAME:happy in 853) [PerFieldSimilarity], result of:",
"details": [
{
"value": 10.882705,
"description": "fieldWeight in 853, product of:",
"details": [
{
"value": 1,
"description": "tf(freq=1.0), with freq of:",
"details": [
{
"value": 1,
"description": "termFreq=1.0"
}
]
},
{
"value": 10.882705,
"description": "idf(docFreq=63, maxDocs=1253673)"
},
{
"value": 1,
"description": "fieldNorm(doc=853)"
}
]
}
]
}
}

"_shard": 1,
"_node": "kr37FCksStOKW5ZCo6PCwQ",
"_index": "jdbc_dev",
"_type": "media",
"_id": "380473",
"_score": 10.222518,
"_source": {
"DISPLAY_NAME": "Be Happy",
.........
"_explanation": {
"value": 10.222518,
"description": "weight(DISPLAY_NAME:happy in 3804) [PerFieldSimilarity], result of:",
"details": [
{
"value": 10.222518,
"description": "fieldWeight in 3804, product of:",
"details": [
{
"value": 1,
"description": "tf(freq=1.0), with freq of:",
"details": [
{
"value": 1,
"description": "termFreq=1.0"
}
]
},
{
"value": 10.222518,
"description": "idf(docFreq=123, maxDocs=1255192)"
},
{
"value": 1,
"description": "fieldNorm(doc=3804)"
}
]
}
]
}
}

Now the question is: why does the shard copy influence the scoring/relevancy
so much that it swaps the order, even though I am using the same query
string and the document returned is the same document, just residing on a
different shard copy? Is there any way to disable searching from either the
primary or the secondary/replica shard?
Also, why does the first query return "Be Happy" ahead of "Happy", given
that I didn't disable norms?

On Thu, Apr 3, 2014 at 7:01 PM, cyrilforce cheehoo84@gmail.com wrote:

--
Regards,

Chee Hoo



(Binh Ly-2) #3

The scores may differ from primary to replica because the scoring (at the
moment) takes into account information from deleted documents. Since
segment merging happens independently between shards (or primaries and
replicas), documents marked as deleted can vary among them.
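
To make the arithmetic concrete, here is a minimal sketch that recomputes the
idf values from the explain output above. It assumes the index uses Lucene's
classic TF/IDF similarity, where idf = 1 + ln(maxDocs / (docFreq + 1)); the
function name classic_idf is only for illustration. Since tf and fieldNorm are
both 1 in the explain output, the idf is the whole score, so any difference in
docFreq or maxDocs between two copies of a shard shows up directly.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Lucene classic TF/IDF inverse document frequency (assumed similarity)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


# docFreq / maxDocs pairs copied from the two explain outputs for _id 15939
copy_a = classic_idf(63, 1253673)
copy_b = classic_idf(107, 1290854)

# Compare with the reported scores 10.882705 and 10.388683
print(f"{copy_a:.4f} {copy_b:.4f}")
```

The two copies of shard 2 report different docFreq and maxDocs for the same
term, which is consistent with one copy still carrying deleted documents that
the other has already merged away.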

You can use ?preference to somewhat influence how/where you are searching
in terms of primaries/replicas:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-preference.html#search-request-preference

So something like:

curl 'localhost:9200/_search?preference=xxx'
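
As a rough sketch of how this could look from client code, the snippet below
only builds the search URL and does not send anything. The helper name
search_url and the value "user-session-42" are made up for illustration; the
index name jdbc_dev is taken from the results above. The assumption here is
that a custom preference string (for example a session ID) keeps routing the
same caller to the same shard copies, as described on the linked page, so
repeated runs of one query should see consistent scores. Check the accepted
values against the documentation for your version.

```python
from urllib.parse import urlencode


def search_url(base: str, index: str, preference: str) -> str:
    """Build a _search URL carrying a preference query parameter."""
    return f"{base}/{index}/_search?{urlencode({'preference': preference})}"


# Hypothetical custom preference value, e.g. one per user session
url = search_url("http://localhost:9200", "jdbc_dev", "user-session-42")
print(url)
```

The query body from the first post would then be sent to this URL unchanged.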


