Let's say that every document has multiple values, and those values define the document in a roughly probabilistic way.
So if a document has many color values of red and few of white, I would like to get documents based on how red they are.
I also want to support synonyms. For example, I might say that velvet is a synonym for red, so the "red-ness" of a document would also be its "velvet-ness".
I tried two approaches to this and got stuck on each one.
Approach 1: Multisets
Pros: I can use a field value factor on the field "color".
Cons: This won't work for synonyms unless I expand them myself before querying.
{
  "color": {
    "red": 0.75,
    "white": 0.25
  }
}
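To make the "expand them pre-query" part concrete, here is a minimal sketch of what I mean, assuming Elasticsearch's function_score query with field_value_factor. The synonym table and the build_color_query helper are hypothetical names I made up for illustration, and the synonym list would have to be maintained by hand outside the index.

```python
# Hypothetical synonym table: each synonym maps to its canonical color key.
SYNONYMS = {"velvet": "red", "ivory": "white"}


def build_color_query(term):
    """Expand a synonym to its canonical color, then score by that color's weight."""
    canonical = SYNONYMS.get(term, term)
    field = "color." + canonical
    return {
        "query": {
            "function_score": {
                # Only match documents that actually carry this color key.
                "query": {"exists": {"field": field}},
                # Use the stored weight (e.g. 0.75 for red) as the score.
                "field_value_factor": {"field": field},
                "boost_mode": "replace",
            }
        }
    }


# Searching for "velvet" ends up scoring on the color.red weight.
query = build_color_query("velvet")
```

This works, but the synonym handling lives in my application code instead of in an analyzer, which is what I was hoping to avoid.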
Approach 2: Key Value
Pros: Synonyms for color.value will work
Cons: color.weight is ambiguous. Which weight should be used, 0.25 or 0.75? Both values live in the same field.
{
  "color": [
    {"value": "red", "weight": 0.75},
    {"value": "white", "weight": 0.25}
  ]
}
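The ambiguity can be shown with a small sketch. I'm assuming Elasticsearch here with the default object mapping (not a nested mapping), where an array of objects is flattened into multi-valued fields. The flatten_object_array function is a hypothetical stand-in written only to mimic that behavior, not part of any library.

```python
def flatten_object_array(field, items):
    """Mimic indexing an array of objects without a nested mapping:
    each sub-field becomes one flat, multi-valued field."""
    flat = {}
    for item in items:
        for key, val in item.items():
            flat.setdefault(field + "." + key, []).append(val)
    return flat


doc = {
    "color": [
        {"value": "red", "weight": 0.75},
        {"value": "white", "weight": 0.25},
    ]
}

flat = flatten_object_array("color", doc["color"])
# flat["color.value"] holds both colors and flat["color.weight"] holds both
# weights, so nothing ties 0.75 to "red" any more.
```

After flattening, a match on color.value says nothing about which entry of color.weight belongs to it, which is exactly where I got stuck.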
Is there another approach for this? Thanks