Using ELSER for multiple fields

Hello! I've just begun exploring ELSER and I'm interested in knowing whether there is any guidance for scenarios where I want to run inference on multiple fields, e.g. Title and also Description.

The online documentation (Tutorial: semantic search with ELSER, Elasticsearch Guide [8.9]) shows only one text field being used.

My scenario includes both Title and Description fields, and I want to apply a boost to the Title field. I therefore need Title and Description to remain separate fields, with both processed by the ELSER model.

Is it possible to have a document with multiple fields, each of which makes use of the ELSER model?

Is there any documentation online, or can someone advise?

Hi,

Thanks for trying ELSER. Regarding your question about running inference on multiple fields, we don't have official guidance at the moment. However, you can follow the steps below:

  1. Create the index mapping for the destination index:
PUT my-index
{
  "mappings": {
    "properties": {
      "ml_title.tokens": { 
        "type": "rank_features" 
      },
      "ml_description.tokens": { 
        "type": "rank_features" 
      },          
      "title": { 
        "type": "text" 
      },
      "description": { 
        "type": "text" 
      }  
    }
  }
}
  2. Create an ingest pipeline with two inference processors, one per field:
PUT _ingest/pipeline/elser-v1-test
{
  "on_failure": [
    {
      "set": {
        "description": "Record error information",
        "field": "error_information",
        "value": "Processor {{ _ingest.on_failure_processor_type }} with tag {{ _ingest.on_failure_processor_tag }} in pipeline {{ _ingest.on_failure_pipeline }} failed with message {{ _ingest.on_failure_message }}"
      }
    }
  ],
  "processors": [
    {
      "inference": {
        "model_id": ".elser_model_1",
        "target_field": "ml_title",
        "field_map": {
          "title": "text_field"
        },
        "inference_config": {
          "text_expansion": {
            "results_field": "tokens"
          }
        }
      }
    },
    {
      "inference": {
        "model_id": ".elser_model_1",
        "target_field": "ml_description",
        "field_map": {
          "description": "text_field"
        },
        "inference_config": {
          "text_expansion": {
            "results_field": "tokens"
          }
        }
      }
    }
  ]
}
  3. Reindex the data through the inference ingest pipeline:
POST _reindex?wait_for_completion=false
{
  "source": {
    "index": "test-data",
    "size": 50 
  },
  "dest": {
    "index": "my-index",
    "pipeline": "elser-v1-test"
  }
}
  4. After the reindex completes, you can use a bool query to run semantic search on both fields:
GET my-index/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "text_expansion": {
            "ml_title.tokens": {
              "model_id": ".elser_model_1",
              "model_text": "Some text"
            }
          }
        },
        {
          "text_expansion": {
            "ml_description.tokens": {
              "model_id": ".elser_model_1",
              "model_text": "Some text"
            }
          }
        }
      ]
    }
  }
}
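
For the Title boost mentioned in the question, here is a rough sketch in Python that builds the request bodies as plain dicts, so the same pattern extends to more fields. The helper names (`inference_processors`, `boosted_query`) are made up for illustration and are not part of any Elastic client. I believe `text_expansion` accepts the standard `boost` parameter like other queries, but please treat that as an assumption and verify it on your version.

```python
from typing import Dict, List

ELSER_MODEL_ID = ".elser_model_1"  # model ID used in the steps above


def inference_processors(fields: List[str]) -> List[dict]:
    """Build one inference processor per source field.

    Each processor writes its output to ml_<field>.tokens, matching
    the rank_features mapping from the index definition above.
    """
    return [
        {
            "inference": {
                "model_id": ELSER_MODEL_ID,
                "target_field": f"ml_{field}",
                "field_map": {field: "text_field"},
                "inference_config": {
                    "text_expansion": {"results_field": "tokens"}
                },
            }
        }
        for field in fields
    ]


def boosted_query(text: str, boosts: Dict[str, float]) -> dict:
    """Build a bool/should query with one text_expansion clause per field.

    ASSUMPTION: text_expansion honours the standard "boost" parameter;
    check this against the documentation for your Elasticsearch version.
    """
    should = [
        {
            "text_expansion": {
                f"ml_{field}.tokens": {
                    "model_id": ELSER_MODEL_ID,
                    "model_text": text,
                    "boost": boost,  # hypothetical weights, tune for your data
                }
            }
        }
        for field, boost in boosts.items()
    ]
    return {"query": {"bool": {"should": should}}}
```

Calling `boosted_query("Some text", {"title": 2.0, "description": 1.0})` produces the same body as the search request above, with a `boost` value added to each clause. The values 2.0 and 1.0 are placeholders, not recommendations.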
