Elasticsearch with NEST: How can I score matches of multiple keywords much higher than single matches?

Hi there,

Our users often type in a list of keywords, hoping for a good match in the "Title" field, the primary field in many of our indices. Using the multi_match query, we get decent results for single keywords. I've also added a phrase match to capture cases where all of the keywords match.

For example, if a user types in "old stats code", a document that matches all three keywords scores highly, but a document that matches just two of the three doesn't score much higher than one that matches a single keyword.

I would like to improve scoring so that 4 matches > 3 matches > 2 matches > 1 match. A document that is missing one of the search words should still be returned, just with a lower score. (We already have quoted search, which uses a phrase query, for exact matches.)

I'm experimenting with multiple queries to get the right results, so my code is a bit messy here.

Some context for the code: We have filters to exclude results, and those work well. I added the CrossFields query because sometimes one keyword is in the title and another is in some other field or in an attachment. I'm boosting the primary and secondary fields heavily, because they are much more important for most users.

                var allFields = new Field("_primarySearch", PRIMARYBOOST).And("_secondarySearch", SECONDARYBOOST).And("_regularSearch", REGULARBOOST).And("attachment.content");

                QueryContainer siteFilterQuery = new MatchQuery()
                {
                    Field = "parentSiteID",
                    Query = filter.Site.ToString()
                };

                QueryContainer authorFilterQuery = new MatchQuery()
                {
                    Field = "userID",
                    Query = filter.Author.ToString()
                };

                QueryContainer multiMatchPhraseQuery = new MultiMatchQuery()
                {
                    Fields = allFields,
                    Type = QueryType,
                    Query = query
                };

                QueryContainer multiMatchCrossFieldQuery = new MultiMatchQuery()
                {
                    Fields = allFields,
                    Type = TextQueryType.CrossFields,
                    Query = query
                };

                QueryContainer multiMatchBestFieldsAndQuery = new MultiMatchQuery()
                {
                    Fields = allFields,
                    Type = TextQueryType.BestFields,
                    Query = query,
                    Operator = Operator.And,
                    Boost=10
                };

                QueryContainer multiMatchBestFieldsQuery = new MultiMatchQuery()
                {
                    Fields = allFields,
                    Type = TextQueryType.BestFields,
                    Query = query,
                };

                QueryContainer dateScoreQuery = new FunctionScoreQuery()
                {
                    Functions = new List<IScoreFunction>
                            {
                                new ExponentialDateDecayFunction
                                {
                                    Field= "timestamp",
                                    Origin= DateTime.Now,
                                    Decay= decay,
                                    Scale= scale,
                                    Offset= offset
                                }
                            },
                    BoostMode = FunctionBoostMode.Multiply,
                    Boost = DECAYBOOST

                };

                return EsClient.Search<dynamic>(s => s
                        .AllIndices()
                        .AllTypes()
                        .Query(q =>
                        {
                            QueryContainer cont = new QueryContainer();
                            if (filter.Site != null)
                            {
                                cont &= siteFilterQuery;
                            }
                            if (filter.Author != null)
                            {
                                cont &= authorFilterQuery;
                            }
                            if (filter.Tags != null && filter.Tags.Length > 0)
                            {
                                cont &= new MatchQuery()
                                {
                                    Field = "tags",
                                    Query = string.Join(" ", filter.Tags)
                                };
                                
                            }

                            if (QueryType == TextQueryType.Phrase)
                            {
                                cont &= multiMatchPhraseQuery;
                            }
                            else
                            {
                                cont &= multiMatchBestFieldsQuery;
                                cont &= multiMatchCrossFieldQuery;
                                // Note: |= ORs this clause with everything combined so far, filters included.
                                cont |= multiMatchBestFieldsAndQuery;
                            }
                            
                            cont &= dateScoreQuery;
                            return cont;
                        })
                        .Highlight(h =>
                        h.Fields(f =>
                            f.Field("*").PreTags("<b>").PostTags("</b>")
                            )
                        )
                        .From(from)
                        .Size(pageSize)
                 );

Thank you!

I found a post on Stack Overflow by @forloop.

I'm going to try to restructure my code a bit around Object Initializer syntax instead. I hope I am going down the right path.
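
A minimal sketch of what that restructuring might look like, assuming each keyword gets its own `should` clause inside a `BoolQuery` and the site/author/tag filters move into the non-scoring `Filter` context. `KeywordQuerySketch` and `Build` are hypothetical names used only for illustration, and splitting the input on spaces is a simplification that ignores how the analyzer tokenizes text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Nest;

public static class KeywordQuerySketch
{
    // Hypothetical helper, not part of NEST: builds a bool query in which
    // every keyword is its own "should" clause.
    public static BoolQuery Build(
        string query,
        Fields fields,
        IEnumerable<QueryContainer> filters)
    {
        // Simplification: split on spaces instead of using the index analyzer.
        var keywords = query.Split(
            new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // One multi_match clause per keyword across the boosted fields.
        var perKeyword = keywords
            .Select(keyword => (QueryContainer)new MultiMatchQuery
            {
                Fields = fields,
                Type = TextQueryType.BestFields,
                Query = keyword
            })
            .ToList();

        return new BoolQuery
        {
            // Filters restrict the result set without affecting the score.
            Filter = filters,
            // Scores of all matching should clauses are added together.
            Should = perKeyword,
            // Require at least one keyword, since filters alone would
            // otherwise make every should clause optional.
            MinimumShouldMatch = 1
        };
    }
}
```

Because the scores of matching `should` clauses are summed, a document that matches three keywords will usually outscore one that matches two. Field boosts and term rarity can still reorder results, so this is not a strict guarantee of 4 > 3 > 2 > 1.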
