Hello,
You should try to use the search type dfs_query_then_fetch. It should make
the scoring much better on a small dataset.
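For example, here is a minimal sketch of the query from this thread with the search type passed as a URL parameter. The host and index name are taken from the commands quoted below. The function names `dfs_search_url` and `dfs_search` are illustrative helpers, not part of Elasticsearch.

```shell
# Host and index name follow the commands quoted in this thread
# (assumption: a single local node on the default port).
ES_HOST="http://localhost:9200"
INDEX="search"

# Build the _search URL with search_type=dfs_query_then_fetch, which asks
# Elasticsearch to collect term statistics from all shards before scoring.
dfs_search_url() {
  printf '%s/%s/_search?search_type=dfs_query_then_fetch&pretty' "$ES_HOST" "$INDEX"
}

# Send the query_string search from the thread to that URL.
# Needs a running node; nothing is sent until this function is called.
dfs_search() {
  curl -s -X GET "$(dfs_search_url)" -d '{
    "query": {
      "query_string": {
        "query": "mark leighton"
      }
    }
  }'
}
```

Calling `dfs_search` against a running node sends the request. The extra round trip to gather term statistics adds some latency, so it tends to matter most on small indices like this one.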
More details :
Regards
Benjamin
On Fri, Mar 1, 2013 at 6:29 AM, Bruno Miranda bru.miranda@gmail.com wrote:
If I lower the shard count to 2, it's much more accurate, almost 100% of
the time. With 5 shards it falls to about 75%.

The reason I even started looking at such a small dataset is that I was
seeing inconsistent results when using match. Query_string seems more
accurate as far as the scoring is concerned.

Using query_string I get the desired result order:
curl -X GET 'http://localhost:9200/search/_search?pretty' -d '{
  "query": {
    "query_string": {
      "query": "mark leighton"
    }
  }
}'

["Leighton Mark", "Nancy Mark", "Lawrence Mark", "Louis Mark",
"Kimberly Leighton", "Barbara Leighton", "John Leighton", "Leighton Sweet"]

Using match:
curl -X GET 'http://localhost:9200/search/_search?pretty' -d '{
  "query": {
    "multi_match": {
      "query": "mark leighton",
      "fields": [
        "last",
        "first"
      ]
    }
  }
}'

["Kimberly Leighton", "Barbara Leighton", "John Leighton", "Nancy Mark",
"Lawrence Mark", "Louis Mark", "Leighton Mark", "Leighton Sweet"]

I get the totally wrong order.
I suppose my first issue was using match when I should be using
query_string, and the second is that the dataset is too small for anything
over 2 shards.

Any ideas?
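One way to check the shard theory, as a rough sketch: recreate the index with a single shard so that all term statistics live in one place. The helper names `single_shard_body` and `recreate_index` are made up for illustration. `index.number_of_shards` is the standard index setting, and the mapping is the one used in this thread.

```shell
# Host and index name as used elsewhere in this thread (assumed local node).
ES_HOST="http://localhost:9200"
INDEX="search"

# Print the index-creation body: the thread's mapping plus a one-shard setting.
single_shard_body() {
  cat <<'EOF'
{
  "settings": {
    "index": {
      "number_of_shards": 1
    }
  },
  "mappings": {
    "document": {
      "properties": {
        "first": { "type": "string" },
        "last": { "type": "string", "boost": 2.0 }
      }
    }
  }
}
EOF
}

# Drop and recreate the index with that body.
# Needs a running node; nothing is sent until this function is called.
recreate_index() {
  curl -s -X DELETE "$ES_HOST/$INDEX"
  curl -s -X POST "$ES_HOST/$INDEX" -d "$(single_shard_body)"
}
```

After `recreate_index`, re-indexing the same eight documents and repeating the query should show whether the ordering becomes stable with one shard.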
On Thursday, February 28, 2013 7:15:13 PM UTC-8, Bruno Miranda wrote:
I am seeing the exact same index and the exact same query return
different results. Could someone please help me understand?

curl -X DELETE http://localhost:9200/search
curl -X POST http://localhost:9200/search -d '{
  "mappings": {
    "document": {
      "properties": {
        "first": {
          "type": "string"
        },
        "last": {
          "type": "string",
          "boost": 2.0
        }
      }
    }
  }
}'

curl -X POST "http://localhost:9200/search/document/" -d '{"first":"Kimberly","last":"Leighton"}'
curl -X POST "http://localhost:9200/search/document/" -d '{"first":"Barbara","last":"Leighton"}'
curl -X POST "http://localhost:9200/search/document/" -d '{"first":"John","last":"Leighton"}'
curl -X POST "http://localhost:9200/search/document/" -d '{"first":"Nancy","last":"Mark"}'
curl -X POST "http://localhost:9200/search/document/" -d '{"first":"Lawrence","last":"Mark"}'
curl -X POST "http://localhost:9200/search/document/" -d '{"first":"Louis","last":"Mark"}'
curl -X POST "http://localhost:9200/search/document/" -d '{"first":"Leighton","last":"Mark"}'
curl -X POST "http://localhost:9200/search/document/" -d '{"first":"Leighton","last":"Sweet"}'
curl -X POST "http://localhost:9200/search/_refresh"

curl -X GET 'http://localhost:9200/search/_search?pretty' -d '{
  "query": {
    "query_string": {
      "query": "Mark Leighton"
    }
  }
}'

RESULTS: ["Leighton Mark", "Louis Mark", "Kimberly Leighton",
"Barbara Leighton", "John Leighton", "Nancy Mark", "Lawrence Mark",
"Leighton Sweet"]

I run the exact same code again and the results are different:
curl -X DELETE http://localhost:9200/search
curl -X POST http://localhost:9200/search -d '{
  "mappings": {
    "document": {
      "properties": {
        "first": {
          "type": "string"
        },
        "last": {
          "type": "string",
          "boost": 2.0
        }
      }
    }
  }
}'

curl -X POST "http://localhost:9200/search/document/" -d '{"first":"Kimberly","last":"Leighton"}'
curl -X POST "http://localhost:9200/search/document/" -d '{"first":"Barbara","last":"Leighton"}'
curl -X POST "http://localhost:9200/search/document/" -d '{"first":"John","last":"Leighton"}'
curl -X POST "http://localhost:9200/search/document/" -d '{"first":"Nancy","last":"Mark"}'
curl -X POST "http://localhost:9200/search/document/" -d '{"first":"Lawrence","last":"Mark"}'
curl -X POST "http://localhost:9200/search/document/" -d '{"first":"Louis","last":"Mark"}'
curl -X POST "http://localhost:9200/search/document/" -d '{"first":"Leighton","last":"Mark"}'
curl -X POST "http://localhost:9200/search/document/" -d '{"first":"Leighton","last":"Sweet"}'
curl -X POST "http://localhost:9200/search/_refresh"

curl -X GET 'http://localhost:9200/search/_search?pretty' -d '{
  "query": {
    "query_string": {
      "query": "Mark Leighton"
    }
  }
}'

RESULTS: ["Nancy Mark", "Leighton Mark", "Lawrence Mark", "Louis Mark",
"Leighton Sweet", "Kimberly Leighton", "Barbara Leighton", "John Leighton"]

The first time, the top result was "Leighton Mark", as it should be
because it matches both terms. The same query seconds later returns a
different result order.

Is scoring broken in Elasticsearch/Lucene?
Thank you.
--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.