The E5 multilingual optimized model returns a Python error (self-hosted?)

Hi,

I am trying to run the E5 multilingual optimized model myself, as I suspect this might be faster than running it through Elasticsearch in a Docker container.

I am running into the following issue:

When loading the model with the following Python code:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('elastic/multilingual-e5-small-optimized')

I get the following error:

Loading model...
No sentence-transformers model found with name elastic/multilingual-e5-small-optimized. Creating a new one with MEAN pooling.
config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████| 672/672 [00:00<00:00, 2.24MB/s]
pytorch_model.bin: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 412M/412M [00:36<00:00, 11.2MB/s]
Traceback (most recent call last):
  File "/Users/chenko/Documents/ELSER/e5_sentence_run/.venv/lib/python3.11/site-packages/transformers/modeling_utils.py", line 532, in load_state_dict
    return torch.load(
           ^^^^^^^^^^^
  File "/Users/chenko/Documents/ELSER/e5_sentence_run/.venv/lib/python3.11/site-packages/torch/serialization.py", line 1025, in load
    raise pickle.UnpicklingError(UNSAFE_MESSAGE + str(e)) from None
_pickle.UnpicklingError: Weights only load failed. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution.Do it only if you get the file from a trusted source. WeightsUnpickler error: Unsupported class torch.qint8

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/chenko/Documents/ELSER/e5_sentence_run/.venv/lib/python3.11/site-packages/transformers/modeling_utils.py", line 541, in load_state_dict
    if f.read(7) == "version":
       ^^^^^^^^^
  File "<frozen codecs>", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 128: invalid start byte

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/chenko/Documents/ELSER/e5_sentence_run/main.py", line 54, in <module>
    main()
  File "/Users/chenko/Documents/ELSER/e5_sentence_run/main.py", line 24, in main
    model = SentenceTransformer('elastic/multilingual-e5-small-optimized')
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/chenko/Documents/ELSER/e5_sentence_run/.venv/lib/python3.11/site-packages/sentence_transformers/SentenceTransformer.py", line 199, in __init__
    modules = self._load_auto_model(
              ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/chenko/Documents/ELSER/e5_sentence_run/.venv/lib/python3.11/site-packages/sentence_transformers/SentenceTransformer.py", line 1134, in _load_auto_model
    transformer_model = Transformer(
                        ^^^^^^^^^^^^
  File "/Users/chenko/Documents/ELSER/e5_sentence_run/.venv/lib/python3.11/site-packages/sentence_transformers/models/Transformer.py", line 36, in __init__
    self._load_model(model_name_or_path, config, cache_dir, **model_args)
  File "/Users/chenko/Documents/ELSER/e5_sentence_run/.venv/lib/python3.11/site-packages/sentence_transformers/models/Transformer.py", line 65, in _load_model
    self.auto_model = AutoModel.from_pretrained(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/chenko/Documents/ELSER/e5_sentence_run/.venv/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
    return model_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/chenko/Documents/ELSER/e5_sentence_run/.venv/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3335, in from_pretrained
    state_dict = load_state_dict(resolved_archive_file)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/chenko/Documents/ELSER/e5_sentence_run/.venv/lib/python3.11/site-packages/transformers/modeling_utils.py", line 553, in load_state_dict
    raise OSError(
OSError: Unable to load weights from pytorch checkpoint file for '/Users/chenko/.cache/huggingface/hub/models--elastic--multilingual-e5-small-optimized/snapshots/2612fa238fbd4d6348b6aef85906b3d3d3a8fec3/pytorch_model.bin' at '/Users/chenko/.cache/huggingface/hub/models--elastic--multilingual-e5-small-optimized/snapshots/2612fa238fbd4d6348b6aef85906b3d3d3a8fec3/pytorch_model.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.

As can be seen, it does download the model; however, it reports the following:

No sentence-transformers model found with name elastic/multilingual-e5-small-optimized. Creating a new one with MEAN pooling.

However, when trying this with the original E5 model, intfloat/multilingual-e5-small, it works perfectly fine.
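
For comparison, this is roughly the snippet that works with the original model (a minimal sketch; the "query: "/"passage: " prefixes follow the E5 model card's usage notes):

from sentence_transformers import SentenceTransformer

# The original, non-quantized checkpoint loads without issues.
model = SentenceTransformer('intfloat/multilingual-e5-small')

# E5 models expect a "query: " or "passage: " prefix on the input text.
embeddings = model.encode(
    ['query: how do I run E5 locally?', 'passage: E5 is a multilingual text embedding model.'],
    normalize_embeddings=True,
)
print(embeddings.shape)  # (2, 384) for the small model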

Any help would be appreciated!

Kind regards, Chenko

Hello @Chenko,

Our optimized version of multilingual-e5-small is a quantized and traced model. It cannot be loaded with SentenceTransformer; it is intended to be used with Elasticsearch in an air-gapped environment.
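
For reference, the intended workflow is to import the model into an Elasticsearch cluster (for example with Eland's eland_import_hub_model tool) and then call it through the trained models inference API. A rough sketch with the official Python client, assuming placeholder connection details and whatever model ID the import assigned:

from elasticsearch import Elasticsearch

# Placeholder endpoint and credentials -- adjust to your deployment.
es = Elasticsearch("https://localhost:9200", api_key="<api-key>")

response = es.ml.infer_trained_model(
    # Model ID as assigned at import time; check GET _ml/trained_models if unsure.
    model_id="elastic__multilingual-e5-small-optimized",
    docs=[{"text_field": "passage: some text to embed"}],
)
# For a text_embedding task the result carries the embedding vector.
print(response["inference_results"][0]["predicted_value"])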

Regards,
Valeriy

Thanks @valeriy42,

How would it be possible to run this model outside of Elasticsearch, if at all?

Kind regards,
Chenko