Query and score similar documents based on hierarchical data

I'm working on creating and simplifying probabilistic topic models for large corpora of data. Every document I index will contain a field with the following format:

...
"topics": {
  "level0": ["keywords"],
  "level1": ["keywords"],
  "level2": ["keywords"]
}
...

I want a query that, given a document id (call this document D), returns the documents (call the set of possible hits H) that are similar based on this field. For a hit to match, at least one keyword from any level of D has to be present in any of the hit's levels. The score of each hit should be higher if the shared keyword is at a lower level in D. It should also be higher if the shared keyword is at a lower level in H.
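To make the intended ranking concrete, here is a toy sketch of the score in plain Python. It is not how Elasticsearch scores documents; it only illustrates the rule described above. The weight values are hypothetical, and they assume that "lower level" means level0 and that level0 should count the most.

```python
from typing import Dict, List

Topics = Dict[str, List[str]]

# Hypothetical weights (an assumption, not from any mapping):
# level0 is treated as the "lowest" level and weighted highest.
LEVEL_WEIGHTS: Dict[str, float] = {"level0": 3.0, "level1": 2.0, "level2": 1.0}


def similarity(d: Topics, h: Topics,
               weights: Dict[str, float] = LEVEL_WEIGHTS) -> float:
    """Score hit h against document d.

    Each shared keyword contributes the weight of its level in d
    multiplied by the weight of its level in h.
    """
    score = 0.0
    for d_level, d_keywords in d.items():
        for h_level, h_keywords in h.items():
            shared = set(d_keywords) & set(h_keywords)
            score += len(shared) * weights[d_level] * weights[h_level]
    return score


doc_d = {"level0": ["a"], "level1": ["b"], "level2": ["c"]}
hit_1 = {"level0": ["b"], "level1": [], "level2": ["a"]}
hit_2 = {"level0": ["x"], "level1": ["y"], "level2": ["z"]}

score_1 = similarity(doc_d, hit_1)  # "a": 3 * 1, "b": 2 * 3
score_2 = similarity(doc_d, hit_2)  # no shared keyword, so no match
```

A hit that shares no keyword scores zero, which corresponds to "no match" in the description.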

I'm currently using the following query:

"query": {
  "bool": {
    "should": [
      {
        "multi_match": {
          "boost": 3,
          "query": "Keyword_A",
          "fields": ["topics.level0", "topics.level1", "topics.level2"]
        }
      },
      {
        "multi_match": {
          "boost": 2,
          "query": "Keyword_B",
          "fields": ["topics.level0", "topics.level1", "topics.level2"]
        }
      },
      {
        "multi_match": {
          "query": "Keyword_C",
          "fields": ["topics.level0", "topics.level1", "topics.level2"]
        }
      }
    ]
  }
}

For now, Keyword_A, Keyword_B, and Keyword_C are taken from **D** manually, as I don't know how to fetch these fields directly into a query.

I combine this with an index-time boost on each of the fields.
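One way to avoid copying the keywords by hand is to build the query body on the client from D's `topics` field. The sketch below is only an illustration under stated assumptions: `build_similarity_query` and `level_boosts` are hypothetical names, the boost values are made up, and `topics` is assumed to have already been read from D's `_source` (for example with a separate get-by-id request, not shown). It uses the `field^boost` syntax of `multi_match` to weight the level on the hit side at query time, and the per-clause `boost` to weight the level on D's side.

```python
from typing import Any, Dict, List


def build_similarity_query(doc_id: str,
                           topics: Dict[str, List[str]],
                           level_boosts: Dict[str, int]) -> Dict[str, Any]:
    """Build a bool query with one multi_match clause per keyword of D."""
    # Hit side: boost each field with the "field^boost" syntax.
    fields = [f"topics.{level}^{boost}" for level, boost in level_boosts.items()]

    should = []
    for level, keywords in topics.items():
        for keyword in keywords:
            should.append({
                "multi_match": {
                    "query": keyword,
                    "fields": fields,
                    # D side: boost the clause by the keyword's level in D.
                    "boost": level_boosts[level],
                }
            })

    return {
        "query": {
            "bool": {
                "should": should,
                "minimum_should_match": 1,
                # Keep D itself out of its own result list.
                "must_not": [{"ids": {"values": [doc_id]}}],
            }
        }
    }


# Hypothetical boosts: level0 weighted highest (an assumption).
boosts = {"level0": 3, "level1": 2, "level2": 1}
d_topics = {
    "level0": ["Keyword_A"],
    "level1": ["Keyword_B"],
    "level2": ["Keyword_C"],
}
body = build_similarity_query("D", d_topics, boosts)
```

The resulting `body` dictionary can be sent as the search request body. Setting the field boosts in the query rather than at index time means they can be tuned without reindexing.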

Is there a better way for me to define this query or the score?

Thanks in advance
