I've installed Eland and imported a model from Hugging Face as follows:
eland_import_hub_model --url http://localhost:9200/ \
--hub-model-id HooshvareLab/bert-fa-zwnj-base-ner \
--task-type ner
I can see it in http://localhost:5601/app/ml/trained_models, but when I try to start it, either with the start icon or with the API call POST _ml/trained_models/hooshvarelab__bert-fa-zwnj-base-ner/deployment/_start, I get this error:
{
"error" : {
"root_cause" : [
{
"type" : "status_exception",
"reason" : "Could not start trained model deployment, the following nodes failed with errors [{f5IiQjhhTs6-ewaKpQ8xFg=Validation Failed: 1: classification label [B_DAT] is not an entity I-O-B tag.;2: classification label [B_EVE] is not an entity I-O-B tag.;3: classification label [B_FAC] is not an entity I-O-B tag.;4: classification label [B_MON] is not an entity I-O-B tag.;5: classification label [B_PCT] is not an entity I-O-B tag.;6: classification label [B_PRO] is not an entity I-O-B tag.;7: classification label [B_TIM] is not an entity I-O-B tag.;8: classification label [I_DAT] is not an entity I-O-B tag.;9: classification label [I_EVE] is not an entity I-O-B tag.;10: classification label [I_FAC] is not an entity I-O-B tag.;11: classification label [I_MON] is not an entity I-O-B tag.;12: classification label [I_PCT] is not an entity I-O-B tag.;13: classification label [I_PRO] is not an entity I-O-B tag.;14: classification label [I_TIM] is not an entity I-O-B tag.;15: Valid entity I-O-B tags are [O, B_MISC, I_MISC, B_PER, I_PER, B_ORG, I_ORG, B_LOC, I_LOC];}]"
}
],
"type" : "status_exception",
"reason" : "Could not start trained model deployment, the following nodes failed with errors [{f5IiQjhhTs6-ewaKpQ8xFg=Validation Failed: 1: classification label [B_DAT] is not an entity I-O-B tag.;2: classification label [B_EVE] is not an entity I-O-B tag.;3: classification label [B_FAC] is not an entity I-O-B tag.;4: classification label [B_MON] is not an entity I-O-B tag.;5: classification label [B_PCT] is not an entity I-O-B tag.;6: classification label [B_PRO] is not an entity I-O-B tag.;7: classification label [B_TIM] is not an entity I-O-B tag.;8: classification label [I_DAT] is not an entity I-O-B tag.;9: classification label [I_EVE] is not an entity I-O-B tag.;10: classification label [I_FAC] is not an entity I-O-B tag.;11: classification label [I_MON] is not an entity I-O-B tag.;12: classification label [I_PCT] is not an entity I-O-B tag.;13: classification label [I_PRO] is not an entity I-O-B tag.;14: classification label [I_TIM] is not an entity I-O-B tag.;15: Valid entity I-O-B tags are [O, B_MISC, I_MISC, B_PER, I_PER, B_ORG, I_ORG, B_LOC, I_LOC];}]"
},
"status" : 500
}
Why do I get this error?
P.S.
Of the roughly 10 models from the Hugging Face Hub that I have tried to import and start, I could only get 3 working (two of Elastic's models and dslim/bert-base-NER-uncased). The others failed either at import time in Eland or at start time in Elasticsearch/Kibana.
I've noticed that the models must be BERT or one of its derivatives.
That page also recommends 3 models you can use for the Named Entity Recognition, I'm guessing those are the ones you had success with.
You see this error because the model's I-O-B tagging schema is not recognised. The schema is expected to consist of the tags listed at the end of the error message (O, B_MISC, I_MISC, B_PER, I_PER, B_ORG, I_ORG, B_LOC, I_LOC), as used by the BERT model dslim/bert-base-NER on Hugging Face.
The tags used by your model, such as DAT and EVE, are not recognised.
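To see up front which of a model's labels would be rejected, the check described in the error message can be approximated in a few lines of Python. This is only a sketch: the set of valid tags is copied from the error message above, the function name unsupported_labels is hypothetical, and the example label list is an illustrative subset rather than the model's full configuration.

```python
# Valid tags as reported at the end of the error message above.
VALID_TAGS = {
    "O",
    "B_MISC", "I_MISC",
    "B_PER", "I_PER",
    "B_ORG", "I_ORG",
    "B_LOC", "I_LOC",
}


def unsupported_labels(labels):
    """Return the labels that are not in the accepted I-O-B tag set, in input order."""
    return [label for label in labels if label not in VALID_TAGS]


# Illustrative subset of labels (an assumption, not the model's full label list).
example_labels = ["O", "B_DAT", "I_DAT", "B_PER", "I_PER", "B_EVE"]
print(unsupported_labels(example_labels))
```

Any label printed by this check would be expected to show up in the "is not an entity I-O-B tag" list when the deployment is started.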
There is an open issue in the Elasticsearch repo to support different tagging schemas for NER. Please comment on or +1 the issue so that it may be prioritised.
Have you had any success using models for different tasks such as text classification?
I've seen the Third party NLP models page, but I assumed it was sufficient for a model to be based on BERT or on one of the few other architectures mentioned on that page. If I understand you correctly, there is a fixed list of tags, and any compatible model must use only those tags, right?
That page also recommends 3 models you can use for the Named Entity Recognition, I'm guessing those are the ones you had success with.
Yes, but I arrived at these models almost by chance! Moreover, I got a Segmentation fault (core dumped) error when I tried to import dslim/bert-base-NER with Eland, but I could import and start dslim/bert-base-NER-uncased successfully. Unfortunately, none of these 3 models helps me. What I need is to deploy a NER model for the Persian language (also called Farsi, fa), but it seems that there is no compatible model on Hugging Face.
There is an open issue in the Elasticsearch repo to support different tagging schemas for NER. Please comment on or +1 the issue so that it may be prioritised.
+1'd.
Have you had any success using models for different tasks such as text classification?
No, at the moment NER is more important for us. We might use text classification later.
Oh, does this all mean that only those 3 NER models are usable for NER tasks in Elasticsearch?!
I got that error when I tried to import the model with Eland on WSL. To reproduce the error so that I could post it here, I ran the command below again, but this time the model imported successfully: eland_import_hub_model --url http://localhost:9200/ --hub-model-id dslim/bert-base-NER --task-type ner
We are trying to map the tags of the hooshvarelab/bert-fa-zwnj-base-ner model to ones that Elasticsearch accepts. For that, I need to know all the tags Elasticsearch accepts for NER tasks. Where can I find them?
I guess we can keep PER, ORG and LOC as they are and map all other tags to MISC. Some thoughts:
we will lose a lot of useful tags by mapping them to MISC, but we have no other choice
we gave it a try but have not completed the work yet
if supporting other tagging schemas only requires adding them to an existing list in the Elasticsearch source code, we might be able to do that and build from source. Searching the source code, I found only this and this one, but they are docs!
this work is a kind of PoC (proof of concept), and we might need to train our own model
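The mapping idea above can be sketched in Python as follows. This is a minimal illustration, not a tested workaround: the function name map_label is hypothetical, labels are assumed to use the underscore form shown in the error message (B_DAT, I_DAT, ...), and whether Elasticsearch accepts a label list in which several entries collapse to the same MISC tag is an assumption that has not been verified here.

```python
# Entity types that Elasticsearch already accepts and that we keep unchanged.
KEPT_ENTITIES = {"PER", "ORG", "LOC"}


def map_label(label):
    """Map one I-O-B label to the accepted tag set.

    PER, ORG and LOC are kept; every other entity type becomes MISC,
    preserving the B_/I_ prefix. "O" is returned unchanged.
    """
    if label == "O":
        return "O"
    prefix, _, entity = label.partition("_")
    if prefix not in ("B", "I") or not entity:
        raise ValueError(f"not an I-O-B label: {label!r}")
    if entity in KEPT_ENTITIES:
        return label
    return f"{prefix}_MISC"


# Illustrative subset of the model's labels (taken from the error message).
for original in ["O", "B_PER", "I_LOC", "B_DAT", "I_EVE"]:
    print(original, "->", map_label(original))
```

As noted in the thoughts above, this loses the distinction between dates, events, facilities and so on, since they all end up as MISC.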
A PR has just been merged in Elasticsearch that makes the entity tags configurable and will fix the problem you are having using the hooshvarelab/bert-fa-zwnj-base-ner model.
It will be in the 8.4 release so keep an eye out for that.