I have documents with tags, and each tag has a weight. There are more than 40 million documents in the index. I would like to boost documents according to their tag weights. I can think of two ways to implement this idea.
- index time: repeat each tag in the index in proportion to its weight
- search time: store the tag string and weight as a nested document and use a function score query to boost documents with matching tags by their weights.
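To make the two options concrete, here is a rough sketch of what I have in mind. It assumes Elasticsearch, with a nested field called `tags` that has `name` and `weight` sub-fields; those field names, the helper names `expand_tags` and `nested_weight_query`, and the rounding rule are all illustrative, not something I have settled on.

```python
from typing import Dict, List


def expand_tags(weights: Dict[str, float], scale: int = 1) -> List[str]:
    """Index-time option: repeat each tag roughly in proportion to its weight."""
    tokens: List[str] = []
    for tag, weight in weights.items():
        # Assumed rounding rule: scale the weight, round it, keep at least one copy.
        copies = max(1, round(weight * scale))
        tokens.extend([tag] * copies)
    return tokens


def nested_weight_query(tag: str) -> dict:
    """Search-time option: score a matching nested tag by its stored weight.

    Assumes a nested field `tags` with sub-fields `name` (keyword) and
    `weight` (numeric). These names are hypothetical.
    """
    return {
        "query": {
            "nested": {
                "path": "tags",
                # Add up the scores of all matching nested tag documents.
                "score_mode": "sum",
                "query": {
                    "function_score": {
                        "query": {"term": {"tags.name": tag}},
                        # Use the stored weight as the score of the nested match.
                        "field_value_factor": {"field": "tags.weight", "missing": 1},
                        "boost_mode": "replace",
                    }
                },
            }
        }
    }
```

With the first function, the repeated tokens would be indexed in a plain text field so that term frequency carries the weight; with the second, the returned dict would be sent as the search request body and the weights could be changed without reindexing the tag text.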
Though I like the second approach for its flexibility and smaller index size, performance is always a crucial factor in search. I suspect the first approach would perform better, though it looks like a hack.
I am planning to do some benchmark testing myself, but I am curious to learn how others with a similar use case have handled the problem.