Custom score based on a field value (colon-separated payloads) and a boosting factor
For example, I have indexed a field as
scorevector : "Lucene:1.3 Hadoop:4.3 Elastic:6.8 Lucene:7.2"
I would like to search the index as
my_index/_search
get_custom_score : {
scorevector : "Lucene^5 Hadoop^2 Elastic^3 HDFS^1.5"
}
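The score should be returned as [(1.3+7.2)/2]*5 + 4.3*2 + 6.8*3 = 50.25
where the payload for the term "Lucene" is averaged over its two occurrences, (1.3+7.2)/2 = 4.25, and "HDFS" contributes nothing because it does not occur in the document.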
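To make the expected arithmetic concrete, here is a minimal standalone sketch in plain Java (no Lucene or Elasticsearch involved). The class and method names (PayloadScoreSketch, parse, score) are hypothetical and exist only for illustration; the code assumes every token is well-formed, i.e. "term:number" in the field and "term^number" in the query.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that reproduces the desired score by hand.
class PayloadScoreSketch {

    // Splits whitespace-separated "term<separator>number" tokens into term -> numbers.
    // Assumes every token contains the separator followed by a parseable number.
    static Map<String, List<Double>> parse(String text, String separator) {
        Map<String, List<Double>> values = new HashMap<>();
        for (String token : text.trim().split("\\s+")) {
            int idx = token.lastIndexOf(separator);
            String term = token.substring(0, idx);
            double value = Double.parseDouble(token.substring(idx + separator.length()));
            values.computeIfAbsent(term, k -> new ArrayList<>()).add(value);
        }
        return values;
    }

    // Sum over query terms of (average payload of the term in the document) * (query boost).
    // Query terms that do not occur in the document contribute nothing.
    static double score(String scoreVector, String boostedQuery) {
        Map<String, List<Double>> payloads = parse(scoreVector, ":");
        Map<String, List<Double>> boosts = parse(boostedQuery, "^");
        double total = 0.0;
        for (Map.Entry<String, List<Double>> entry : boosts.entrySet()) {
            List<Double> termPayloads = payloads.get(entry.getKey());
            if (termPayloads == null) {
                continue; // e.g. "HDFS" in the example above
            }
            double average = termPayloads.stream()
                    .mapToDouble(Double::doubleValue)
                    .average()
                    .orElse(0.0);
            double boost = entry.getValue().get(0); // first boost wins if a term repeats
            total += average * boost;
        }
        return total;
    }

    public static void main(String[] args) {
        // Should print a value equal to 50.25 up to floating-point rounding.
        System.out.println(score(
                "Lucene:1.3 Hadoop:4.3 Elastic:6.8 Lucene:7.2",
                "Lucene^5 Hadoop^2 Elastic^3 HDFS^1.5"));
    }
}
```

This only states the target formula; it says nothing about how Elasticsearch would compute it.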
I tried the following. First I created the index:
PUT my_index
{
"settings": {
"analysis": {
"analyzer": {
"payloads": {
"type": "custom",
"tokenizer": "whitespace",
"filter": [
"my_delimited_payload"
]
}
},
"filter": {
"my_delimited_payload": {
"type": "delimited_payload",
"delimiter": ":",
"encoding": "float"
}
}
}
},
"mappings": {
"_doc": {
"properties": {
"scorevector": {
"norms": true,
"index": true,
"store": false,
"type": "text",
"analyzer": "payloads"
}
}
}
}
}
Then I indexed a document:
POST my_index/_doc
{"scorevector": "Lucene:1.3 Hadoop:4.3 Elastic:6.8 HDFS:7.2"}
As I understand it, the float value after each colon should now be stored as that term's payload at analysis time.
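For readers unfamiliar with payloads: with "encoding": "float", each payload is, to my understanding, stored as the 4 bytes of the IEEE 754 float in big-endian order. The sketch below mimics that layout with java.nio.ByteBuffer only; FloatPayloadSketch, encode and decode are hypothetical names and not part of Lucene or Elasticsearch.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of a 4-byte float payload; not the Lucene implementation.
class FloatPayloadSketch {

    // Packs a float into 4 bytes (ByteBuffer is big-endian by default).
    static byte[] encode(float value) {
        return ByteBuffer.allocate(4).putFloat(value).array();
    }

    // Reads the float back from a 4-byte payload.
    static float decode(byte[] payload) {
        return ByteBuffer.wrap(payload).getFloat();
    }

    public static void main(String[] args) {
        byte[] payload = encode(1.3f);            // payload for the token "Lucene:1.3"
        System.out.println(payload.length);       // 4 bytes per payload
        System.out.println(decode(payload));      // round-trips to the same float
    }
}
```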
I think I can make use of AveragePayloadFunction.
I need a solution that produces the score described above.
Could you please help me with some example code?
Any help will be appreciated.
Thanks in advance
Amit
I was able to resolve it by writing two plugins, as below.
MySimilarityPlugin.java : extends Plugin
@Override
public void onIndexModule(IndexModule im) {
im.addSimilarity(..., new MySimilarity(...));
}
MySimilarity extends ClassicSimilarity, and all of its methods return 1.
MyQueryParserPlugin.java : extends Plugin implements SearchPlugin
It overrides getQueries(..MyQueryBuilder...)
MyQueryBuilder is the same as QueryStringQueryBuilder, except that it uses MyQueryParser instead of QueryStringQueryParser.
MyQueryParser extends QueryStringQueryParser
@Override
public Query getFieldQuery(String field, String queryText, boolean quoted) {
// Score each term by the average of its payloads in the matching document
return new PayloadScoreQuery(new SpanTermQuery(new Term(field, queryText)), new AveragePayloadFunction());
}
The 'test' index is created as below:
curl -XPUT 'localhost:9200/test' -H 'Content-Type: application/json' -d ' {
"settings": {
"index": {
"similarity": {
"mysim": {
"type": "mysimilarity"
}
},
"analysis": {
"analyzer": {
"myanalyzer": {
"filter": [
"mytokenfilter"
],
"type": "custom",
"tokenizer": "whitespace"
}
},
"filter": {
"mytokenfilter": {
"type": "delimited_payload",
"delimiter": ":"
}
}
}
}
},
"mappings": {
"docs": {
"properties": {
"mytext": {
"type": "text",
"similarity": "mysim",
"analyzer": "myanalyzer"
}
}
}
}
}'
It's working as expected.
Please let me know if you have a better approach.