Custom score on the basis of a field value(colon separated) and boosting factor in ES 6.2.4

amit.nema · August 17, 2018, 11:15am

I need a custom score based on a field value (colon-separated term:weight pairs) and a boosting factor.

For example, I have indexed a field as
scorevector : "Lucene:1.3 Hadoop:4.3 Elastic:6.8 Lucene:7.2"

I would like to search the index as

my_index/_search
get_custom_score : {
scorevector : "Lucene^5 Hadoop^2 Elastic^3 HDFS^1.5"
}

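The score should be returned as [(1.3+7.2)/2]*5 + 4.3*2 + 6.8*3 = 50.25
The term "Lucene" occurs twice in the field, so its payloads are averaged: (1.3+7.2)/2. "HDFS" does not occur in the field, so it contributes nothing.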
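
The arithmetic above can be checked with a small stand-alone sketch. This is plain Java with no Elasticsearch or Lucene dependency; the class and method names (ExpectedScore, parsePayloads, expectedScore) are made up for illustration, and it only models the formula I want (sum over query terms of boost times average payload), not how Lucene actually scores.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class ExpectedScore {

    // Parse "term:payload" tokens, collecting every payload seen for each term.
    static Map<String, List<Double>> parsePayloads(String scoreVector) {
        Map<String, List<Double>> payloads = new LinkedHashMap<>();
        for (String token : scoreVector.trim().split("\\s+")) {
            String[] parts = token.split(":");
            payloads.computeIfAbsent(parts[0], k -> new ArrayList<>())
                    .add(Double.parseDouble(parts[1]));
        }
        return payloads;
    }

    // Sum over query terms of boost * average payload; terms missing from the field add nothing.
    static double expectedScore(String scoreVector, String boostedQuery) {
        Map<String, List<Double>> payloads = parsePayloads(scoreVector);
        double score = 0.0;
        for (String token : boostedQuery.trim().split("\\s+")) {
            String[] parts = token.split("\\^");
            double boost = Double.parseDouble(parts[1]);
            List<Double> values = payloads.get(parts[0]);
            if (values == null) {
                continue; // e.g. HDFS is in the query but not in the field
            }
            double average = values.stream()
                    .mapToDouble(Double::doubleValue)
                    .average()
                    .orElse(0.0);
            score += boost * average;
        }
        return score;
    }

    public static void main(String[] args) {
        double score = expectedScore(
                "Lucene:1.3 Hadoop:4.3 Elastic:6.8 Lucene:7.2",
                "Lucene^5 Hadoop^2 Elastic^3 HDFS^1.5");
        System.out.println(String.format(Locale.ROOT, "%.2f", score)); // prints 50.25
    }
}
```

Any scoring solution inside Elasticsearch should reproduce this number for the example document and query.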

I tried a few things, as below.

PUT my_index
{
"settings": {
"analysis": {
"analyzer": {
"payloads": {
"type": "custom",
"tokenizer": "whitespace",
"filter": [
"my_delimited_payload"
]
}
},
"filter": {
"my_delimited_payload": {
"type": "delimited_payload",
"delimiter": ":",
"encoding": "float"
}
}
}
},
"mappings": {
"_doc": {
"properties": {
"scorevector": {
"norms": true,
"index": true,
"store": false,
"type": "text",
"analyzer": "payloads"
}
}
}
}
}

POST my_index/_doc
{ "scorevector": "Lucene:1.3 Hadoop:4.3 Elastic:6.8 HDFS:7.2" }

As I understand it, the analyzer should now split each token at the colon, index the text before it as the term, and store the float after it as that term's payload.
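
To make that concrete, here is a rough stand-alone sketch of what I expect to happen to one token. It is plain Java, not the Lucene filter itself, and the names (FloatPayloadSketch, encodeFloat, decodeFloat, termOf, payloadOf) are hypothetical. It assumes the token is split at the first delimiter and that the "float" encoding stores the value as 4 big-endian IEEE 754 bytes, which is my understanding of Lucene's float payload encoder.

```java
import java.nio.ByteBuffer;

public class FloatPayloadSketch {

    // Encode a float as 4 bytes; ByteBuffer is big-endian by default (assumed to match Lucene).
    static byte[] encodeFloat(float value) {
        return ByteBuffer.allocate(4).putFloat(value).array();
    }

    // Decode the 4-byte payload back into a float.
    static float decodeFloat(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getFloat();
    }

    // The indexed term is the text before the first delimiter.
    static String termOf(String token, char delimiter) {
        int i = token.indexOf(delimiter);
        return i < 0 ? token : token.substring(0, i);
    }

    // The payload is the float after the first delimiter, or null if there is none.
    static byte[] payloadOf(String token, char delimiter) {
        int i = token.indexOf(delimiter);
        return i < 0 ? null : encodeFloat(Float.parseFloat(token.substring(i + 1)));
    }

    public static void main(String[] args) {
        for (String token : "Lucene:1.3 Hadoop:4.3 Elastic:6.8 HDFS:7.2".split("\\s+")) {
            byte[] payload = payloadOf(token, ':');
            System.out.println(termOf(token, ':') + " -> " + decodeFloat(payload));
        }
    }
}
```

If this is right, a search for the term Lucene matches the token regardless of its weight, and the weight is only available at scoring time through the payload.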

I think I can make use of AveragePayloadFunction.

I need a solution that produces the score described above.

Could you please help me with some example code?

Any help will be appreciated.

Thanks in advance
Amit

amit.nema · August 31, 2018, 4:05pm

I resolved it by writing two plugins, as below.

MySimilarityPlugin.java : extends Plugin

@Override
public void onIndexModule(IndexModule im) {
    im.addSimilarity(..., new MySimilarity(...));
}

MySimilarity extends ClassicSimilarity, and all of its methods return 1.

MyQueryParserPlugin.java : extends Plugin implements SearchPlugin

Override getQueries(..MyQueryBuilder...)
MyQueryBuilder is the same as QueryStringQueryBuilder, but uses MyQueryParser instead of QueryStringQueryParser.

MyQueryParser extends QueryStringQueryParser

@Override
public Query getFieldQuery(String field, String queryText, boolean quoted) {
    // Wrap each term in a payload-scoring query that averages the term's payloads
    return new PayloadScoreQuery(new SpanTermQuery(new Term(field, queryText)), new AveragePayloadFunction());
}

The 'test' index is created as below:
curl -XPUT 'localhost:9200/test' -H 'Content-Type: application/json' -d ' {
"settings": {
"index": {
"similarity": {
"mysim": {
"type": "mysimilarity"
}
},
"analysis": {
"analyzer": {
"myanalyzer": {
"filter": [
"mytokenfilter"
],
"type": "custom",
"tokenizer": "whitespace"
}
},
"filter": {
"mytokenfilter": {
"type": "delimited_payload",
"delimiter": ":"
}
}
}
}
},
"mappings": {
"docs": {
"properties": {
"mytext": {
"type": "text",
"similarity": "mysim",
"analyzer": "myanalyzer"
}
}
}
}
}'

It's working as expected.

Please let me know if you have a better approach.

system · September 28, 2018, 4:05pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
