Importing a Hugging Face model into Elasticsearch on my desktop

Hi all,
I'm trying to follow the code for chunking and indexing large documents. My problem right now is that I don't know how to import the semantic search model specified in the ingest pipeline processor:

    {
      "foreach": {
        "field": "passages",
        "processor": {
          "inference": {
            "field_map": {
              "_ingest._value.text": "text_field"
            },
            "model_id": "sentence-transformers__all-minilm-l6-v2",
            "target_field": "_ingest._value.vector",
            "on_failure": [
              {
                "append": {
                  "field": "_source._ingest.inference_errors",
                  "value": [
                    {
                      "message": "Processor 'inference' in pipeline 'ml-inference-title-vector' failed with message '{{ _ingest.on_failure_message }}'",
                      "pipeline": "ml-inference-title-vector",
                      "timestamp": "{{{ _ingest.timestamp }}}"
                    }
                  ]
                }
              }
            ]
          }
        }
      }
    }

I could not find a clear tutorial on how to do that from beginning to end. I tried some code I found by googling, like this:

    from eland.ml.pytorch import import_hub_model

    # Replace with your Elasticsearch connection details and model information
    es_url = "http://localhost:9200"  # Or use cloud_id
    hub_model_id = "sentence-transformers/all-MiniLM-L6-v2"
    task_type = "text_embedding"
    es_index_name = "first_index"
    model_id = "my_sentence_model"

    try:
        import_hub_model(
            es_url,
            hub_model_id,
            task_type,
            es_index=es_index_name,
            model_id=model_id,
        )
        print(f"Model '{hub_model_id}' imported successfully as '{model_id}'.")

    except Exception as e:
        print(f"Error importing model: {e}")

But this code doesn't work, as it can't find import_hub_model.

I have downloaded version 7.17.26 on my Windows 11 machine. I could not install an 8+ version because I was getting an error when trying to start the ES instance, but that is another problem.

I would appreciate it if someone could guide me on this.
Thanks
Samer
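
A note on the import error: as far as I know, `import_hub_model` is not a function in `eland.ml.pytorch`. The similarly named `eland_import_hub_model` is a command-line script installed with eland, while the Python API exposes the classes `TransformerModel` and `PyTorchModel`. Also, Elasticsearch only gained support for hosting PyTorch models in 8.0, so an import like this is not expected to work against 7.17. Below is a minimal sketch of the Python route, assuming eland (with its PyTorch extras) and an 8.x `elasticsearch` client are installed. The function names `default_es_model_id` and `import_from_hub` and the `models` directory are my own, the `es_index` argument is dropped because a model upload does not target an index, and the constructor and `save`/`import_model` arguments have changed between eland releases, so check them against the version you install.

```python
import re
from pathlib import Path


def default_es_model_id(hub_model_id: str) -> str:
    """Approximate the model ID eland derives from a Hub name.

    Assumption: eland lowercases the name and replaces '/' with '__',
    which is why the pipeline refers to
    'sentence-transformers__all-minilm-l6-v2'.
    """
    return re.sub(r"[\s\\/]", "__", hub_model_id).lower()[-64:]


def import_from_hub(es_url: str, hub_model_id: str,
                    task_type: str = "text_embedding") -> str:
    """Download a Hub model, trace it, upload it, and start a deployment."""
    # Imported lazily so this file still loads when eland is not installed.
    from elasticsearch import Elasticsearch
    from eland.ml.pytorch import PyTorchModel
    from eland.ml.pytorch.transformers import TransformerModel

    es = Elasticsearch(es_url)

    # Download the model from the Hub and save a TorchScript copy locally.
    tm = TransformerModel(model_id=hub_model_id, task_type=task_type)
    out_dir = Path("models")  # hypothetical scratch directory
    out_dir.mkdir(parents=True, exist_ok=True)
    model_path, config, vocab_path = tm.save(str(out_dir))

    # Upload the model under the ID eland derives from the Hub name.
    es_model_id = tm.elasticsearch_model_id()
    ptm = PyTorchModel(es, es_model_id)
    ptm.import_model(model_path=model_path, config_path=None,
                     vocab_path=vocab_path, config=config)

    # The model must be deployed before an inference processor can use it.
    es.ml.start_trained_model_deployment(model_id=es_model_id)
    return es_model_id
```

Calling `import_from_hub("http://localhost:9200", "sentence-transformers/all-MiniLM-L6-v2")` against a running 8.x cluster should return the ID that the pipeline's `model_id` expects. Note that the `my_sentence_model` name in the question would not match the pipeline's `model_id` anyway.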

You should really get 8.17 up and running. Embedding documents with the new semantic_text field type [1] and an inference endpoint [2] is the way to go; it is much cleaner.

[1] Semantic text field type | Elasticsearch Guide [8.17] | Elastic
[2] Create inference API | Elasticsearch Guide [8.17] | Elastic
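
To make that suggestion a bit more concrete, here is a rough sketch of the two requests involved, written as Python that only builds and prints the request bodies rather than sending them. It assumes the model is already uploaded under the ID `sentence-transformers__all-minilm-l6-v2` (the ID used in the pipeline above) and served through the `elasticsearch` inference service. The endpoint name `minilm-embeddings`, the field name `content`, the allocation settings, and the helper function names are placeholders of mine, so treat the exact bodies as something to verify against the 8.17 docs.

```python
import json


def inference_endpoint_request(inference_id: str, model_id: str):
    """Return (method, path, body) for creating a text_embedding endpoint."""
    body = {
        "service": "elasticsearch",
        "service_settings": {
            "model_id": model_id,
            "num_allocations": 1,  # assumed minimal sizing
            "num_threads": 1,
        },
    }
    return "PUT", f"/_inference/text_embedding/{inference_id}", body


def semantic_index_request(index: str, field: str, inference_id: str):
    """Return (method, path, body) for an index with a semantic_text field."""
    body = {
        "mappings": {
            "properties": {
                field: {"type": "semantic_text", "inference_id": inference_id}
            }
        }
    }
    return "PUT", f"/{index}", body


requests_to_send = [
    inference_endpoint_request("minilm-embeddings",
                               "sentence-transformers__all-minilm-l6-v2"),
    semantic_index_request("first_index", "content", "minilm-embeddings"),
]

# Print each request in a form that can be pasted into Kibana Dev Tools.
for method, path, body in requests_to_send:
    print(f"{method} {path}")
    print(json.dumps(body, indent=2))
```

With a mapping like this, text indexed into `content` is chunked and embedded at ingest time, so the `foreach` plus `inference` pipeline from the question should no longer be needed.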
