Prevent Keyword Stuffing

I'm using ES 7.3. I have a product index with titles, part numbers, model numbers, etc. I am having issues with keyword stuffing because some products contain the same word, part number, or model number multiple times throughout the document.

For example, the model number may appear twice in the title and again in the model number field. Other products include the model number only in the title and not in the model number field. Those products rank poorly because the documents that repeat the term score higher. How can I prevent this type of keyword stuffing? Here is my code.

UPDATE:
Adding the unique token filter prevents duplicate values from matching within the same field, but it does not deduplicate across multiple fields, which is what I need.
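For reference, the kind of analysis settings the update describes might look roughly like the sketch below. The analyzer name `dedup_text`, the filter name `dedup_tokens`, and the constant name are illustrative and not taken from my actual index. The `unique` token filter works on one field's token stream at a time, which matches what I see: duplicates are removed inside a field but not across fields.

```ruby
# Sketch of index settings using the built-in "unique" token filter.
# The names dedup_tokens and dedup_text are hypothetical placeholders.
UNIQUE_ANALYSIS_SETTINGS = {
  settings: {
    analysis: {
      filter: {
        # Drops repeated tokens within a single analyzed value.
        dedup_tokens: { type: "unique", only_on_same_position: false }
      },
      analyzer: {
        dedup_text: {
          type: "custom",
          tokenizer: "standard",
          filter: ["lowercase", "dedup_tokens"]
        }
      }
    }
  }
}
```

A text field would then opt in through its mapping, for example `name: { type: "text", analyzer: "dedup_text" }`.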

Fields:

fields = [
  'name^10', 'name.ngram',
  'part_number^10',
  'mod_name^5',
  'model_number^5',
  'brand^10',
  'category^5',
  'product_type^5',
  'search_variations^1'
]

fuzzy_fields = [
  'name',
  'part_number',
  'mod_name',
  'model_number',
  'brand',
  'category',
  'product_type',
  'search_variations'
]

Query:

{
  explain: true,
  query: {
    function_score: {
      "query": {
        "bool": {
          "should": [
            {
              multi_match: {
                fields: fields,
                type: "most_fields",
                query: "#{query}"
              }
            },
            {
              multi_match: {
                fields: fuzzy_fields,
                type: "most_fields",
                fuzziness: "AUTO",
                query: "#{query}"
              }
            }
          ],
          "filter": {
            "bool": {
              "must": filters
            }
          }
        }
      },
      field_value_factor: {
        field: "popularity",
        modifier: "log1p",
        factor: 5
      },
      boost_mode: "sum"
    }
  },
  highlight: {
    fields: {
      :"*" => {}
    }
  },
  aggs: { categories: { terms: { field: "category.raw" } } }
}

Hi Cannon

Why opt for the most_fields mode, then?
Wouldn't best_fields or cross_fields be better suited?
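To make the suggestion concrete: most_fields adds up the scores of every field that matches, so a model number repeated in both the title and the model number field is rewarded for each occurrence. best_fields instead takes the score of the single best-matching field and adds the other matching fields' scores multiplied by `tie_breaker`. Below is a minimal sketch of the two `should` clauses rewritten that way. The helper name `build_should_clauses` and the default `tie_breaker` of 0.0 are assumptions for illustration, not something from the original query. Note that cross_fields does not accept the `fuzziness` parameter, so it could only replace the first, non-fuzzy clause.

```ruby
# Hypothetical helper (not part of the original post): builds the two
# should clauses using best_fields instead of most_fields.
def build_should_clauses(query, fields, fuzzy_fields, tie_breaker: 0.0)
  [
    {
      # Exact clause: only the best-scoring field counts fully;
      # other matching fields contribute tie_breaker * their score.
      multi_match: {
        fields: fields,
        type: "best_fields",
        tie_breaker: tie_breaker,
        query: query
      }
    },
    {
      # Fuzzy clause: same scoring mode, with fuzziness kept as before.
      multi_match: {
        fields: fuzzy_fields,
        type: "best_fields",
        tie_breaker: tie_breaker,
        fuzziness: "AUTO",
        query: query
      }
    }
  ]
end
```

The returned array would take the place of the `"should"` value in the bool query. A `tie_breaker` of 0.0 ignores every field except the best one; a small value such as 0.1 or 0.2 would still give some credit to products that match in several fields, and the right value would need testing against real queries.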
