Need help with function scoring

POST: http://localhost:9200/myindex/type/_search
{
  "query": {
    "function_score": {
      "functions": [
        {
          "gauss": {
            "b": {
              "origin": "0",
              "scale": "1000"
            }
          },
          "weight": "2"
        }
      ],
      "query": {
        "match": {"b": 656}
      },
      "score_mode": "multiply"
    }
  }
}

I have two fields in my document: a (string) and b (number).
I was trying various ways to change the scoring pattern.

Requirement: I need the score to be based only on coord, not on tf-idf.
Example: my query is "location":"bangalore" OR "location":"chennai" OR "location":"mumbai".
A record that contains all the matches should come on top, followed by documents with fewer matches.

So the scoring should respect coord only and not tf-idf. How can I achieve this?
From the Lucene documentation, this is the scoring logic:
score(q,d) = coord(q,d) · queryNorm(q) · ∑( tf(t in d) · idf(t)² · t.getBoost() · norm(t,d) )
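
For reference, the coord factor in that formula can be sketched in a few lines. This is a minimal sketch, assuming the classic Lucene default similarity, where coord is the number of query terms found in the document divided by the total number of query terms. The function name `coord` and the sample documents are made up for illustration.

```python
def coord(query_terms, doc_terms):
    """Fraction of distinct query terms that occur in the document."""
    query = set(query_terms)
    if not query:
        return 0.0
    overlap = len(query & set(doc_terms))
    return overlap / len(query)


query = ["bangalore", "chennai", "mumbai"]

# Hypothetical documents, used only to illustrate the factor.
print(coord(query, ["bangalore", "chennai", "mumbai", "london"]))  # 1.0
print(coord(query, ["bangalore", "pune"]))
```

A document matching all three terms gets the largest factor, and one matching a single term gets a third of it, which is the ordering asked for above.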

Hi,
If your list of locations is static, you can go with a hand-made field, location_score, and assign points when a location matches.
In your case, if your document has the location "bangalore", set location_score to 2.
That is, apply the scoring logic before saving your document, so that at query time you only need to sort.

In my scenario it's the number of matches that I am taking into consideration.

If my document has the locations bangalore, chennai, mumbai, then there are 3 matches. It should come on top.

If my document has the locations bangalore, chennai, then there are 2 matches. It should come below the one with 3 matches.
Is there a way to do this?

Hi,

I don't know enough about your data, but I guess you have documents like these:
{"id": 1, "location": ["bangalore", "chennai"], ...}
{"id": 2, "location": ["bangalore", "bollywood"], ...}
{"id": 3, "location": ["bollywood"], ...}

Before saving your document, you need to check its locations and add points, which gives:

{"id": 1, "location": ["bangalore", "chennai"], "location_score": 4} <--- 4 because each correct location is worth 2 points
{"id": 2, "location": ["bangalore", "bollywood"], "location_score": 2} <--- 2 because there's only one good location
{"id": 3, "location": ["bollywood"], "location_score": 0} <--- 0 because bollywood is not in the list of good locations

So if you sort by "location_score", you'll get your data in the correct order.
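
The index-time approach described in this reply could look roughly like the following. It is only a sketch: the names `GOOD_LOCATIONS`, `POINTS_PER_MATCH` and `add_location_score` are hypothetical, and the list of good locations is assumed to be the three cities from the original query.

```python
# Assumed static list of "good" locations (taken from the example query).
GOOD_LOCATIONS = {"bangalore", "chennai", "mumbai"}
POINTS_PER_MATCH = 2


def add_location_score(doc):
    """Return a copy of doc with location_score computed before indexing."""
    matches = GOOD_LOCATIONS & set(doc["location"])
    return {**doc, "location_score": POINTS_PER_MATCH * len(matches)}


docs = [
    {"id": 1, "location": ["bangalore", "chennai"]},
    {"id": 2, "location": ["bangalore", "bollywood"]},
    {"id": 3, "location": ["bollywood"]},
]

scored = [add_location_score(d) for d in docs]
for d in scored:
    print(d["id"], d["location_score"])
```

Because the score is fixed when the document is saved, this only helps when the list of locations does not change from query to query.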

My query is "location":"bangalore" OR "location":"chennai" OR "location":"mumbai".

I can't put a new field (location_score) in my document, because the number of matches I am talking about depends on the query.
What matters is how many conditions of the query the document matches.

If my documents are like:
Doc1: { location :["bangalore","chennai","mumbai","london"] } // for the given query there are 3 matches

Doc2: { location :["bangalore","chennai","berlin","paris"] } // for the given query there are 2 matches

then the scoring should be based on the number of matches, so Doc1 will come above Doc2.

If my query were "location":"bangalore" OR "location":"chennai" OR "location":"berlin",
then Doc2 would come on top.

This is the requirement.

Hi,

Create the documents:

```
curl -XPOST 127.0.0.1:9200/test_city/city_score/1 -d '{"id":1, "location":["bangalore", "paris", "mumbai"]}'
curl -XPOST 127.0.0.1:9200/test_city/city_score/2 -d '{"id":2, "location":["bangalore", "paris"]}'
```

Terms query with sort on _score:

`curl -XPOST 127.0.0.1:9200/test_city/city_score/_search -d '{"query":{"terms":{"location":["bangalore","mumbai"]}}, "sort":"_score"}'`

It returns:
`{"took":10,"timed_out":false,"_shards":{"total":1,"successful":1,"failed":0},"hits":{"total":2,"max_score":0.581694,"hits":[{"_index":"test_city","_type":"city_score","_id":"1","_score":0.581694, "_source" : {"id":1, "location":["bangalore", "paris", "mumbai"]}},{"_index":"test_city","_type":"city_score","_id":"2","_score":0.09494676, "_source" : {"id":2, "location":["bangalore", "paris"]}}]}}`

Doc 1 has a better score (0.581694)
than doc 2 (0.09494676).

This will not work every time.

Doc1 : "location": [ "bangalore", "paris", "mumbai", "kolkata"]
Doc2 : "location": ["bangalore", "paris","mumbai" ]
Doc3 : "location": ["bangalore", "paris", "mumbai","calicut" ]
Doc4 : "location": [ "bangalore","paris"]
Doc5 : "location": [ "bangalore","paris","bangalore","bangalore"]
Doc6 : "location": [ "bangalore", "mumbai", "kolkata"]

I have inserted this data.

I queried for this: {"query":{"terms":{"location":["bangalore","mumbai","calicut","kolkata","paris"] } }, "sort":"_score"}

Here a doc with 3 matches came above one with 4 matches:
Doc2 came above Doc3.

Result I got:
{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 1,
    "hits": [
      {
        "_index": "test_city",
        "_type": "city_score",
        "_id": "5",
        "_score": 1,
        "_source": {
          "location": [
            "bangalore",
            "paris",
            "mumbai",
            "kolkata"
          ]
        }
      },
      {
        "_index": "test_city",
        "_type": "city_score",
        "_id": "1",
        "_score": 1,
        "_source": {
          "id": 1,
          "location": [
            "bangalore",
            "paris",
            "mumbai"
          ]
        }
      },
      {
        "_index": "test_city",
        "_type": "city_score",
        "_id": "6",
        "_score": 1,
        "_source": {
          "location": [
            "bangalore",
            "paris",
            "mumbai",
            "calicut"
          ]
        }
      },
      {
        "_index": "test_city",
        "_type": "city_score",
        "_id": "2",
        "_score": 1,
        "_source": {
          "id": 2,
          "location": [
            "bangalore",
            "paris"
          ]
        }
      },
      {
        "_index": "test_city",
        "_type": "city_score",
        "_id": "7",
        "_score": 1,
        "_source": {
          "location": [
            "bangalore",
            "paris",
            "bangalore",
            "bangalore"
          ]
        }
      }
    ]
  }
}

There's something strange in your result: all your _score values are equal to 1?
In my test the scores are different: "_score":0.581694 and "_score":0.09494676.

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-constant-score-query.html

I have used the constant_score query, which says it'll ignore tf-idf and consider only coord.

I need to ignore tf-idf somehow and consider only coord.
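
Wrapping the whole terms query in a single constant_score gives every hit the same score, which would be consistent with all the scores of 1 in the result above. One approach that is sometimes used for this requirement (an assumption here, not something confirmed in this thread) is to wrap each term in its own constant_score clause and combine the clauses with bool/should, so that every matching clause adds the same fixed amount to the score. The sketch below only builds the request body; the helper name `build_match_count_query` is hypothetical, and it assumes the indexed terms in `location` are single lowercase tokens.

```python
import json


def build_match_count_query(field, values):
    """Build a bool/should query with one constant_score clause per value."""
    should = [
        {"constant_score": {"filter": {"term": {field: value}}, "boost": 1}}
        for value in values
    ]
    # At least one clause must match; each extra match adds the same boost.
    return {"query": {"bool": {"should": should, "minimum_should_match": 1}}}


body = build_match_count_query("location", ["bangalore", "chennai", "mumbai"])
print(json.dumps(body, indent=2))
```

The printed body can be sent to the _search endpoint in place of the terms query. The absolute score values depend on the Elasticsearch version, but documents matching more clauses should rank above documents matching fewer.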