Elasticsearch custom similarity not working

Hi

Elasticsearch version: 1.7.2
I tried configuring the custom similarity in both the settings and the mappings. First I copied the plugin (the jar file) into the ES_HOME/plugin directory, then I unzipped it in the same folder.
Then I posted the following to myIndex/_settings. Even after this the score did not change at all; it was the same as before. I had overridden the tf, idf and coord methods to return my own values, but the score still didn't change.
{
  "myIndex": {
    "settings": {
      "index": {
        "creation_date": "1467945023081",
        "number_of_shards": "5",
        "uuid": "Ja_vqHohQjSZVnSKxgLBSw",
        "version": {
          "created": "1070299"
        },
        "number_of_replicas": "1",
        "similarity": {
          "index": {
            "type": "stefansavev.esplugins.OverlapSimilarityProvider"
          },
          "search": {
            "type": "stefansavev.esplugins.OverlapSimilarityProvider"
          }
        }
      }
    }
  }
}

Thanks
Nisar

Does anybody have an idea about the above question?

Can you post the code for your similarity please and all the exact steps you did to configure it?

Thanks

I ran it directly from Postman.
Below are the settings I have added.
GET http://localhost:9200/ppp6 gives the output below:

{
  "ppp6": {
    "aliases": {},
    "mappings": {
      "doc": {
        "properties": {
          "body": {
            "type": "string",
            "similarity": "overlapsimilarity"
          },
          "index": {
            "type": "string",
            "similarity": "overlapsimilarity"
          }
        }
      },
      "er": {
        "properties": {
          "index": {
            "type": "string"
          },
          "search": {
            "type": "string"
          }
        }
      }
    },
    "settings": {
      "index": {
        "creation_date": "1468298653590",
        "number_of_shards": "5",
        "uuid": "XorHvK2RRrKft_rnaJYMrg",
        "version": {
          "created": "1070299"
        },
        "number_of_replicas": "1",
        "similarity": {
          "overlapsimilarity": {
            "type": "stefansavev.esplugins.OverlapSimilarityProvider",
            "b": "0"
          }
        }
      }
    },
    "warmers": {}
  }
}

I have placed the jar from the GitHub project below in the ES_HOME/plugins directory: https://github.com/stefansavev/elasticsearch-custom-similarity-example

Below is the data I have indexed.
GET http://localhost:9200/ppp6/_search

{
  "took": 33,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 9,
    "max_score": 1,
    "hits": [
      {
        "_index": "ppp6",
        "_type": "er",
        "_id": "AVXdbGIQH55KQhunBScf",
        "_score": 1,
        "_source": {
          "index": "abcgdfggas99",
          "search": "12145444"
        }
      },
      {
        "_index": "ppp6",
        "_type": "kk",
        "_id": "4",
        "_score": 1,
        "_source": {
          "index": "zzz abc zzz",
          "body": "2323"
        }
      },
      {
        "_index": "ppp6",
        "_type": "er",
        "_id": "AVXdbIGCH55KQhunBScg",
        "_score": 1,
        "_source": {
          "index": "xcvcx abc ccgdfggas99",
          "search": "12145444"
        }
      },
      {
        "_index": "ppp6",
        "_type": "kk",
        "_id": "1",
        "_score": 1,
        "_source": {
          "index": "ABC D",
          "body": "77"
        }
      },
      {
        "_index": "ppp6",
        "_type": "er",
        "_id": "AVXdbNADH55KQhunBSci",
        "_score": 1,
        "_source": {
          "index": "xcvcxabcgdfggas99",
          "search": "56566"
        }
      },
      {
        "_index": "ppp6",
        "_type": "er",
        "_id": "AVXdbP4oH55KQhunBScj",
        "_score": 1,
        "_source": {
          "index": "abc"
        }
      },
      {
        "_index": "ppp6",
        "_type": "kk",
        "_id": "2",
        "_score": 1,
        "_source": {
          "index": "RRFRF ABC DDFDF",
          "body": "11111"
        }
      },
      {
        "_index": "ppp6",
        "_type": "er",
        "_id": "AVXdbJ3BH55KQhunBSch",
        "_score": 1,
        "_source": {
          "index": "abc asadfd dfdf dfsdfsfv sesfss",
          "search": "12145444"
        }
      },
      {
        "_index": "ppp6",
        "_type": "kk",
        "_id": "3",
        "_score": 1,
        "_source": {
          "index": "index abc pdfodp sfsd",
          "body": "6546"
        }
      }
    ]
  }
}

Then I searched with this query:
{
  "query": {
    "match": {
      "index": "abc"
    }
  }
}

The result was the same with and without these similarity settings; the score was identical in both cases.

Would you please check whether this is expected? I was expecting different scores in the two cases.
Otherwise, can you give us a valid example to test?

What you posted is the state after you did all the steps. I am missing how you set up your mapping etc., but I can see that you get back some documents of type er, and its fields are not picking up the custom similarity. Also, can you post your query? Which fields are you querying?
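One way to answer these questions is to run the query with explain turned on, which makes Elasticsearch return a per-hit breakdown of how the score was computed. A minimal sketch, reusing the query and the field name index from the post above, sent as POST http://localhost:9200/ppp6/_search:

```json
{
  "explain": true,
  "query": {
    "match": {
      "index": "abc"
    }
  }
}
```

If the _explanation section of a hit still lists the default tf, idf and fieldNorm factors, that would suggest the custom similarity is not being applied to the field for that document's type.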

The document I am indexing has two fields: index (field 1) and body (field 2).
Sample doc:
{
  "index": "ABC D",
  "body": "77"
}

I am querying on the field 'index'. Here 'index' is my field name; don't confuse it with an Elasticsearch index.

POST http://localhost:9200/ppp6/_search
{
  "query": {
    "match": {
      "index": "abc"
    }
  }
}

OK, as I said, I see two types in the mappings: doc and er. The former has the similarity properly set up on its index field, while er, it seems, does not. I would double-check how you set up your mappings.
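A sketch of what a consistent setup could look like, under a few assumptions: the index is created from scratch (ppp7 is a hypothetical name used only for illustration), the plugin jar really exposes stefansavev.esplugins.OverlapSimilarityProvider, and the field names are the ones from the posts above. The similarity is declared once under settings and then referenced by name on the index field of every type that should use it. The body below would be sent as PUT http://localhost:9200/ppp7:

```json
{
  "settings": {
    "index": {
      "similarity": {
        "overlapsimilarity": {
          "type": "stefansavev.esplugins.OverlapSimilarityProvider"
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "index": { "type": "string", "similarity": "overlapsimilarity" },
        "body": { "type": "string", "similarity": "overlapsimilarity" }
      }
    },
    "er": {
      "properties": {
        "index": { "type": "string", "similarity": "overlapsimilarity" },
        "search": { "type": "string" }
      }
    }
  }
}
```

The documents would then need to be indexed again into the new index, since a similarity also influences the norms written at index time.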

I indexed all the docs into index ppp6 with type doc.

Then I ran the query:
POST http://localhost:9200/ppp6/doc/_search
{"query": {"match": {"index": "abc"}}}

The following error came back:
{
  "took": 11,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 4,
    "failed": 1,
    "failures": [
      {
        "index": "ppp6",
        "shard": 1,
        "status": 500,
        "reason": "QueryPhaseExecutionException[[ppp6][1]: query[filtered(index:abc)->cache(_type:doc)],from[0],size[10]: Query Failed [Failed to execute main query]]; nested: AbstractMethodError; "
      }
    ]
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

I am not sure about this error. It may just mean that your similarity kicked in but is causing problems.

There are some built-in similarities we can try, right? Like BM25, DFR, IB, etc.
But all of these gave the same score. Is that expected?

I need a custom similarity in which only coord is taken into consideration. TF-IDF should be ignored.
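
One thing that may help here: as far as I understand the 1.x similarity module, coord() and queryNorm() are not per-field. They are taken from a similarity registered under the reserved name base, while a similarity named default applies to every field without an explicit similarity. A sketch of create-index settings along those lines, assuming the plugin's provider class loads and is binary-compatible with the Lucene version shipped in 1.7.2 (the AbstractMethodError above might point to a mismatch there, but that is only a guess):

```json
{
  "settings": {
    "index": {
      "similarity": {
        "base": {
          "type": "stefansavev.esplugins.OverlapSimilarityProvider"
        },
        "default": {
          "type": "stefansavev.esplugins.OverlapSimilarityProvider"
        }
      }
    }
  }
}
```

Note also that coord only comes into play for boolean queries with several clauses. A match query for the single term abc is executed as a single term query, so coord cannot change its score; a query with more than one term, such as "abc zzz", would be needed to see any effect.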