Hi Team,
I'm trying to understand which similarity algorithm is best suited to our use case:
1. TF-IDF (default)
2. BM25
I've created two indices, one with a mapping for each algorithm, but when I query them, both indices return the same documents with the same scores.
Mapping for index 1 (default algorithm):
{
  "singleindex": {
    "aliases": {},
    "mappings": {
      "people": {
        "properties": {
          "Application-Name": {
            "type": "text",
            "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
          },
          "Author": {
            "type": "text",
            "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
          },
          "Character Count": {
            "type": "text",
            "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
          },
          "Content-Type": {
            "type": "text",
            "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
          },
          "Creation-Date": { "type": "date" },
          "Last-Modified": { "type": "date" },
          "Page-Count": {
            "type": "text",
            "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
          },
          "Word-Count": {
            "type": "text",
            "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
          },
          "content": {
            "type": "text",
            "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
          },
          "meta:last-author": {
            "type": "text",
            "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
          },
          "tika": {
            "properties": {
              "mime": {
                "properties": {
                  "file": {
                    "type": "text",
                    "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
                  }
                }
              }
            }
          },
          "xmpTPg:NPages": {
            "type": "text",
            "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
          }
        }
      }
    },
    "settings": {
      "index": {
        "creation_date": "1539597307747",
        "number_of_shards": "5",
        "number_of_replicas": "1",
        "uuid": "9jzKHct4T3qjI_EfFPFnzg",
        "version": { "created": "6020399" },
        "provided_name": "singleindex"
      }
    }
  }
}
Mapping for index 2 (BM25):
{
"tes_index": {
"aliases": {},
"mappings": {
"people": {
"properties": {
"Application-Name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"Author": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"similarity": "BM25"
}
}
},
"Character Count": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"Content-Type": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"Creation-Date": {
"type": "date"
},
"Last-Modified": {
"type": "date"
},
"Page-Count": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"Word-Count": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"content": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"meta:last-author": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"xmpTPg:NPages": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
},
"settings": {
"index": {
"number_of_shards": "5",
"provided_name": "tes_index",
"similarity": {
"default": {
"type": "BM25"
}
},
"creation_date": "1539599395035",
"number_of_replicas": "1",
"uuid": "BFDAL1XuQ1O36KLIMDT2mw",
"version": {
"created": "6020399"
}
}
}
}
}
As you can see in the settings, the second index has BM25 set as its default similarity. Please suggest what I'm missing here.
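In case it helps with diagnosing this, here is a minimal sketch of a request body that could be sent to both `singleindex/_search` and `tes_index/_search` to see how each index actually computes its scores. The `content` field comes from the mappings above; the search text is only a placeholder, not a query I have listed here. With `"explain": true`, each hit should carry an `_explanation` section describing the scoring formula. As far as I understand it, an explanation that mentions the `k1` and `b` parameters would indicate BM25 is in use on that index.

```json
{
  "explain": true,
  "query": {
    "match": {
      "content": "placeholder search text"
    }
  }
}
```

If the explanations for the two indices show the same formula, that would at least confirm that both are being scored with the same similarity, whatever the settings say.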