Searching by Ranked information

Elastic can do a ranked search, but I can't see how to store the rank information in the document and then return the documents ordered by that rank.

This is an example of how I would store the information in a field (I am open to anything), but I'm not sure if or how to query the engine to sort based on Topic A's rank.

[{"Topic A": 0.5}, {"Topic B": 0.855}, {"Topic C": 0.255}, {"Topic D": 0.641}, {"Topic E": 0.500}, {"Topic F": 0.541}, {"Topic G": 0.523}]

Has anyone done this or something similar?

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-rank-feature-query.html

@David_Williams1 To be honest, I'm not actually sure what you're asking.

I think there are a couple of options for boosting scores based on field values. Since you have numeric values, perhaps a "functional boost" is what you're looking for? https://swiftype.com/documentation/app-search/guides/relevance-tuning#boosts

I think you'll need to make "topic_a", "topic_b", etc. discrete document fields though.

Take a look at the doc I linked and let us know if that helps.

Thank you for your reply.

The relevance tuning won't work, since I can't make each topic a discrete field: there are hundreds of thousands of entries.

Another way to look at what I want to do: say I have 4 documents that discuss water quality in Lake Ontario, and each of them has 'water quality' as a topic but with a different weighted value. I want to return those documents in order of the weighted value.

I would also want to return only the documents where the weighted value is over a certain value.
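To make the requirement concrete, here is a minimal in-memory sketch in Python of the behaviour being asked for: keep only the documents whose weight for a given topic meets a threshold, then order them by that weight. The document titles, the weights, the `topics` mapping, and the `rank_by_topic` helper are all hypothetical and are not part of any Elastic API.

```python
from typing import List

# Hypothetical documents: each carries a mapping of topic name to weight.
DOCS: List[dict] = [
    {"title": "Report 1", "topics": {"water quality": 0.855, "fishing": 0.2}},
    {"title": "Report 2", "topics": {"water quality": 0.255}},
    {"title": "Report 3", "topics": {"water quality": 0.641}},
    {"title": "Report 4", "topics": {"shipping": 0.9}},
]


def rank_by_topic(docs: List[dict], topic: str, min_weight: float = 0.0) -> List[dict]:
    """Return docs that carry `topic` with weight >= min_weight, highest weight first."""
    matching = [
        d for d in docs
        if topic in d["topics"] and d["topics"][topic] >= min_weight
    ]
    return sorted(matching, key=lambda d: d["topics"][topic], reverse=True)


if __name__ == "__main__":
    for doc in rank_by_topic(DOCS, "water quality", min_weight=0.5):
        print(doc["title"], doc["topics"]["water quality"])
```

Documents that do not mention the topic at all are excluded, not treated as weight zero.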

In any relational DB I could do this easily so I'm thinking there has got to be a way in Elastic.

David

@David_Williams1 Yeah, unfortunately we don't have a great solution for nested fields like this right now.

The only way I think you'll get this working is with duplicate documents. Probably some setup where you have the same document duplicated with the various topic values.

That would let you filter by "topic" and sort by "topic_value".

You could also leverage the grouping feature when necessary to avoid duplication in results.
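As a rough sketch of what such a request might look like, the Python below builds a search request body that filters on `topic`, optionally applies a minimum `topic_value`, sorts by `topic_value`, and optionally groups on a field. It assumes the duplicated-document layout with `topic` and `topic_value` fields, and that `topic_value` is typed as a number in the schema. The `filters`, `sort`, and `group` parameter shapes are assumptions based on the App Search search API and should be checked against the current documentation. The `build_topic_search` helper is hypothetical.

```python
import json
from typing import Optional


def build_topic_search(topic: str,
                       min_value: Optional[float] = None,
                       group_field: Optional[str] = None) -> dict:
    """Build a search body: filter by topic, sort by topic_value descending."""
    clauses = [{"topic": topic}]
    if min_value is not None:
        # Range filter; assumes topic_value is a number field in the schema.
        clauses.append({"topic_value": {"from": min_value}})
    body = {
        "query": "",  # empty query: rely on the filters and the sort
        "filters": {"all": clauses},
        "sort": {"topic_value": "desc"},
    }
    if group_field is not None:
        # Grouping collapses results that share the same value of this field.
        body["group"] = {"field": group_field}
    return body


if __name__ == "__main__":
    body = build_topic_search("Topic A", min_value=0.5, group_field="name")
    print(json.dumps(body, indent=2))
```

When the request already filters on a single topic, each original document should match at most once, so grouping mainly matters for searches that do not filter by topic.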

There are a number of drawbacks with this approach. For instance, I don't think curations will work with grouping, and the counts provided in facets when using grouping are not always what you would expect.

There's also a maintenance burden in keeping multiple variations of a single document in sync.

[{
  name: "Yellowstone park",
  topic: "Topic A",
  topic_value: "1.5"
}, {
  name: "Yellowstone park",
  topic: "Topic B",
  topic_value: "2.0"
}]
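The duplication itself can be scripted. Here is a minimal Python sketch, assuming each source record carries a `topics` mapping of topic name to weight. The `expand_by_topic` helper, the source record's `id`, and the generated `id` scheme are hypothetical; the output mirrors the `name` / `topic` / `topic_value` shape above, with the value kept as a number so it can be sorted numerically.

```python
from typing import Iterator


def expand_by_topic(record: dict) -> Iterator[dict]:
    """Yield one flat document per (record, topic) pair."""
    # Copy every field except the ones that are rewritten per topic.
    base = {k: v for k, v in record.items() if k not in ("id", "topics")}
    for topic, value in record["topics"].items():
        yield {
            **base,
            # Hypothetical id scheme so each duplicate is uniquely addressable.
            "id": f'{record["id"]}-{topic.lower().replace(" ", "-")}',
            "topic": topic,
            "topic_value": value,
        }


if __name__ == "__main__":
    source = {
        "id": "park-1",
        "name": "Yellowstone park",
        "topics": {"Topic A": 1.5, "Topic B": 2.0},
    }
    for doc in expand_by_topic(source):
        print(doc)
```

Deriving each duplicate's id from the source id and the topic means re-running the expansion after a weight change overwrites the same documents instead of creating new ones, which eases some of the maintenance burden.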

Probably not the answer you were hoping for, but hopefully this helps.
