Error loading model to Elasticsearch

Hi,

I previously uploaded a NER model from Hugging Face to Elasticsearch (8.10) with Eland without any problems.
Now I'm trying to load a text classification model and I'm getting an error:

docker run -it --rm --network host \
    docker.elastic.co/eland/eland \
    eland_import_hub_model \
    --url <my elastic host and port>/ \
    --hub-model-id MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7 \
    --task-type zero_shot_classification \
    --clear-previous

The exception I'm getting:

Traceback (most recent call last):
  File "/usr/local/bin/eland_import_hub_model", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/dist-packages/eland/cli/eland_import_hub_model.py", line 235, in main
    tm = TransformerModel(
  File "/usr/local/lib/python3.9/dist-packages/eland/ml/pytorch/transformers.py", line 624, in __init__
    raise TypeError(
TypeError: Tokenizer type PreTrainedTokenizer(name_or_path='sileod/mdeberta-v3-base-tasksource-nli', vocab_size=250101, model_max_len=1000000000000000019884624838656, is_fast=False, padding_side='right', truncation_side='right', special_tokens={'bos_token': '[CLS]', 'eos_token': '[SEP]', 'unk_token': '[UNK]', 'sep_token': '[SEP]', 'pad_token': '[PAD]', 'cls_token': '[CLS]', 'mask_token': '[MASK]'}) not supported, must be one of: <class 'transformers.models.bart.tokenization_bart.BartTokenizer'>, <class 'transformers.models.bert.tokenization_bert.BertTokenizer'>, <class 'transformers.models.bert_japanese.tokenization_bert_japanese.BertJapaneseTokenizer'>, <class 'transformers.models.distilbert.tokenization_distilbert.DistilBertTokenizer'>, <class 'transformers.models.dpr.tokenization_dpr.DPRContextEncoderTokenizer'>, <class 'transformers.models.dpr.tokenization_dpr.DPRQuestionEncoderTokenizer'>, <class 'transformers.models.electra.tokenization_electra.ElectraTokenizer'>, <class 'transformers.models.mobilebert.tokenization_mobilebert.MobileBertTokenizer'>, <class 'transformers.models.mpnet.tokenization_mpnet.MPNetTokenizer'>, <class 'transformers.models.retribert.tokenization_retribert.RetriBertTokenizer'>, <class 'transformers.models.roberta.tokenization_roberta.RobertaTokenizer'>, <class 'transformers.models.squeezebert.tokenization_squeezebert.SqueezeBertTokenizer'>, <class 'transformers.models.xlm_roberta.tokenization_xlm_roberta.XLMRobertaTokenizer'>

Does anyone have any clues?

Thanks in advance.

Hi Rodrigo!
Welcome to the community!

The error message says that the tokenizer used by the model you are trying to import is not yet supported by Eland. You can see the list of supported third-party models on the "Compatible third party NLP models" page of the Machine Learning in the Elastic Stack [8.10] documentation.
Currently, these are listed for text classification:

Third party text classification models

Perhaps one of these would also work for the use case you have in mind. That would be the easiest option to implement with the setup you already have.
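
If it helps when picking a replacement, here is a minimal Python sketch of a pre-check against the tokenizer classes named in the traceback above. The helper name is_supported is hypothetical, the set is simply copied from the "must be one of" part of the error (so it only reflects that Eland version), and comparing class names is an approximation, since the error comes from a check on the tokenizer object itself.

```python
# Tokenizer class names copied from the "must be one of" list in the
# traceback above. Other Eland versions may accept a different set.
SUPPORTED_TOKENIZERS = frozenset({
    "BartTokenizer",
    "BertTokenizer",
    "BertJapaneseTokenizer",
    "DistilBertTokenizer",
    "DPRContextEncoderTokenizer",
    "DPRQuestionEncoderTokenizer",
    "ElectraTokenizer",
    "MobileBertTokenizer",
    "MPNetTokenizer",
    "RetriBertTokenizer",
    "RobertaTokenizer",
    "SqueezeBertTokenizer",
    "XLMRobertaTokenizer",
})


def is_supported(tokenizer_class_name: str) -> bool:
    """Hypothetical helper: True if the class name is in the list from the error."""
    return tokenizer_class_name in SUPPORTED_TOKENIZERS
```

To get the class name for a candidate model, something like type(AutoTokenizer.from_pretrained(model_id, use_fast=False)).__name__ from the transformers library should work, assuming transformers is installed locally. The use_fast=False argument is my assumption, based on the list above naming only the non-fast tokenizer classes.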
Hope this helps!

Cheers,
Iulia

Hi Iulia. Thank you very much.

You're welcome! Good luck with the project 🙂
