I am trying to upload an ML model to Elasticsearch through the Eland client, but the import fails with a TypeError. The model I am trying to upload is listed as a compatible third-party model in the documentation here:
Model: sentence-transformers/paraphrase-multilingual-mpnet-base-v2 · Hugging Face
Eland command:
eland_import_hub_model \
  -u elastic \
  -p XXXXXXXX \
  --cloud-id XXXXXXXX \
  --hub-model-id sentence-transformers/paraphrase-multilingual-mpnet-base-v2 \
  --task-type text_embedding \
  --start
Error message:
2022-12-08 15:13:57,700 INFO : Loading HuggingFace transformer tokenizer and model 'sentence-transformers/paraphrase-multilingual-mpnet-base-v2'
Traceback (most recent call last):
  File "/usr/local/bin/eland_import_hub_model", line 197, in <module>
    tm = TransformerModel(args.hub_model_id, args.task_type, args.quantize)
  File "/usr/local/lib/python3.9/dist-packages/eland/ml/pytorch/transformers.py", line 547, in __init__
    raise TypeError(
TypeError: Tokenizer type PreTrainedTokenizer(name_or_path='sentence-transformers/paraphrase-multilingual-mpnet-base-v2', vocab_size=250002, model_max_len=512, is_fast=False, padding_side='right', truncation_side='right', special_tokens={'bos_token': '<s>', 'eos_token': '</s>', 'unk_token': '<unk>', 'sep_token': '</s>', 'pad_token': '<pad>', 'cls_token': '<s>', 'mask_token': AddedToken("<mask>", rstrip=False, lstrip=True, single_word=False, normalized=False)}) not supported, must be one of: <class 'transformers.models.bart.tokenization_bart.BartTokenizer'>, <class 'transformers.models.bert.tokenization_bert.BertTokenizer'>, <class 'transformers.models.distilbert.tokenization_distilbert.DistilBertTokenizer'>, <class 'transformers.models.dpr.tokenization_dpr.DPRContextEncoderTokenizer'>, <class 'transformers.models.dpr.tokenization_dpr.DPRQuestionEncoderTokenizer'>, <class 'transformers.models.electra.tokenization_electra.ElectraTokenizer'>, <class 'transformers.models.mobilebert.tokenization_mobilebert.MobileBertTokenizer'>, <class 'transformers.models.mpnet.tokenization_mpnet.MPNetTokenizer'>, <class 'transformers.models.retribert.tokenization_retribert.RetriBertTokenizer'>, <class 'transformers.models.roberta.tokenization_roberta.RobertaTokenizer'>, <class 'transformers.models.squeezebert.tokenization_squeezebert.SqueezeBertTokenizer'>
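My reading of the message is that the loaded tokenizer's class is checked against a fixed list of supported classes, and this model's tokenizer is not an instance of any of them. Below is a minimal sketch of that kind of check. It uses stand-in classes, not the real transformers or Eland code, and the names SUPPORTED_TOKENIZER_TYPES and check_tokenizer are hypothetical. The guess that this model loads an XLM-RoBERTa-style tokenizer is my own assumption (suggested by vocab_size=250002), not something the error states.

```python
# Stand-in classes: hypothetical placeholders, NOT the real transformers classes.
class PreTrainedTokenizer:
    """Stand-in for a generic tokenizer base class."""


class BertTokenizer(PreTrainedTokenizer):
    """Stand-in for a class that appears in the supported list."""


class MPNetTokenizer(PreTrainedTokenizer):
    """Stand-in for a class that appears in the supported list."""


class XLMRobertaTokenizer(PreTrainedTokenizer):
    """Stand-in for a class that is absent from the supported list (assumption)."""


# Hypothetical allowlist, mirroring the shape of the list in the error message.
SUPPORTED_TOKENIZER_TYPES = (BertTokenizer, MPNetTokenizer)


def check_tokenizer(tokenizer):
    """Return the tokenizer, or raise TypeError if its class is not supported."""
    if not isinstance(tokenizer, SUPPORTED_TOKENIZER_TYPES):
        names = ", ".join(t.__name__ for t in SUPPORTED_TOKENIZER_TYPES)
        raise TypeError(
            f"Tokenizer type {type(tokenizer).__name__} not supported, "
            f"must be one of: {names}"
        )
    return tokenizer


def is_supported(tokenizer):
    """Convenience wrapper that reports support as a boolean."""
    try:
        check_tokenizer(tokenizer)
    except TypeError:
        return False
    return True


if __name__ == "__main__":
    for tok in (MPNetTokenizer(), XLMRobertaTokenizer()):
        print(type(tok).__name__, "supported:", is_supported(tok))
```

If the real check works this way, the word "mpnet" in the model name would not matter; only the concrete tokenizer class that gets loaded would.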
Any ideas what's causing this? I am able to upload other models from the compatible model list without issues.
Thanks!