Feature Request: Enforce consistent position_increment_gap in multi-fields for multi-values

Hi ES Community,

I recently found some confusing behavior with a mapping and multi-fields, and it would be nice if Elasticsearch would prevent you from shooting yourself in the foot (like I did).

The issue arises when position_increment_gap is set inconsistently within a multi-field. Apparently, the position_increment_gap on an outer field is not automatically applied to its inner fields. So, in the example mapping below, the outer text field uses a gap of 1000 while the inner unstemmed field falls back to the default (100). When you index multi-value docs on this multi-field, token positions in the two fields diverge after the first value, so the term vectors are inconsistent and span_near queries will not work across the fields.

      "properties": {
        "text": {
          "analyzer": "main_analyzer",
          "type": "text",
          "position_increment_gap": 1000,
          "fields": {
            "unstemmed": {
              "analyzer": "unstemmed_analyzer",
              "type": "text"
            }
          }
        }
      }
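
To make the mismatch concrete, here is a rough sketch in Python of how token positions drift apart between the two fields. It is a simplified model, not Lucene's actual code: it assumes whitespace tokenization, a position increment of 1 per token, and that each value after the first is pushed forward by the gap. The function name `token_positions` and the sample values are made up for illustration.

```python
def token_positions(values, gap):
    """Simplified model of positions in a multi-valued text field.

    Assumptions (not the real analysis chain): whitespace tokens,
    +1 position per token, +gap before every value after the first.
    """
    positions = []
    pos = -1  # the first token ends up at position 0
    for i, value in enumerate(values):
        if i > 0:
            pos += gap  # jump between consecutive values
        for token in value.split():
            pos += 1
            positions.append((token, pos))
    return positions


values = ["John Abraham", "Lincoln Smith"]

# Outer field: position_increment_gap explicitly set to 1000.
outer = token_positions(values, gap=1000)
# Inner "unstemmed" field: gap not set, so the default of 100 applies.
inner = token_positions(values, gap=100)

for (token, outer_pos), (_, inner_pos) in zip(outer, inner):
    print(f"{token:8} text={outer_pos:5} text.unstemmed={inner_pos:5}")
```

Under this model the first value lines up in both fields, but every token from the second value onward sits at a different position, which is why position-based queries that span both fields stop matching.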

This type of mistake could be avoided if Elasticsearch did one of the following:

  • Apply the position_increment_gap of a field to any inner multi-fields (note: this may break mappings).
  • Require all multi-fields that use a position_increment_gap to explicitly set it on each field.
  • Emit some sort of warning somewhere.
  • At least acknowledge this gotcha in the docs on position_increment_gap and/or multi-fields.
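
Until something like that exists, a check along the lines of the "warning" option can be run client-side before creating the index. The sketch below is only an illustration: `find_gap_mismatches` is a hypothetical helper, not an Elasticsearch API. It takes the "properties" section of a mapping as a plain dict, only looks at top-level text fields, and assumes the default gap of 100 when the setting is absent.

```python
DEFAULT_GAP = 100  # default position_increment_gap for text fields


def find_gap_mismatches(properties, default_gap=DEFAULT_GAP):
    """Return (sub_field_path, outer_gap, inner_gap) for every text
    sub-field whose effective gap differs from its parent's.

    Hypothetical helper: only inspects top-level fields of the mapping.
    """
    mismatches = []
    for name, field in properties.items():
        if field.get("type") != "text":
            continue
        outer_gap = field.get("position_increment_gap", default_gap)
        for sub_name, sub in field.get("fields", {}).items():
            if sub.get("type") != "text":
                continue
            inner_gap = sub.get("position_increment_gap", default_gap)
            if inner_gap != outer_gap:
                mismatches.append((f"{name}.{sub_name}", outer_gap, inner_gap))
    return mismatches


# The mapping from this post, as a Python dict.
properties = {
    "text": {
        "analyzer": "main_analyzer",
        "type": "text",
        "position_increment_gap": 1000,
        "fields": {
            "unstemmed": {"analyzer": "unstemmed_analyzer", "type": "text"}
        },
    }
}

before_fix = find_gap_mismatches(properties)
print(before_fix)

# Workaround: set the same gap explicitly on the inner field.
properties["text"]["fields"]["unstemmed"]["position_increment_gap"] = 1000
after_fix = find_gap_mismatches(properties)
print(after_fix)
```

Setting position_increment_gap explicitly on every sub-field, as in the last step, is the workaround that keeps positions aligned today.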

Thanks for listening!

Feature requests are best put into GitHub :slight_smile:

Okay, posted: [Docs or Feature Request] Enforce consistent position_increment_gap in multi-fields for multi-values · Issue #71659 · elastic/elasticsearch · GitHub


Thanks :smiley:
