Dec 15th, 2022: [EN] NLP: Using a Question Answering model to talk to your favorite Christmas song

Natural language processing (NLP) is the branch of artificial intelligence (AI) that aims to understand human language as closely as possible to human interpretation by combining computational linguistics with statistical, machine learning and deep learning models.

One of the biggest challenges in NLP has been pre-training on text data in a way that accounts for the variety of language representations.

In 2018, Google open-sourced a new technique for NLP pre-training called BERT (Bidirectional Encoder Representations from Transformers). It no longer requires data to be processed in a fixed order, which allows training on larger amounts of data and improves the ability to understand context and ambiguity in language.

Like any other pre-training process, the more data the better. Therefore, unlabeled text datasets such as the entire English Wikipedia were used. Pre-training then serves as a layer of "knowledge" to build on.

To support models that use the same tokenizer as BERT, Elastic supports the PyTorch library, one of the most popular machine learning libraries, which supports neural networks such as the Transformer architecture that BERT uses. This enables NLP tasks in Elasticsearch and lets you incorporate them into your data pipeline.

In general, any trained model that has a supported architecture is deployable in Elasticsearch, including BERT and variants like:

  • RoBERTa
  • DistilBERT
  • RetriBERT
  • MobileBERT

These models are listed by NLP task.

Currently, these are the supported tasks:

  1. Named entity recognition
  2. Fill-mask
  3. Question answering
  4. Language identification
  5. Text classification
  6. Zero-shot text classification
  7. Text embedding
  8. Text similarity

As in the cases of classification and regression, when a trained model is imported you can use it to make predictions (inference).

For this demo we are going to use one of these tasks: question answering, an information extraction task.

This task allows us to obtain answers given a context (a text) and a question, extracting information from the provided text to answer the provided question.

This is not a chatbot with a conversational flow, but it is useful for automating responses to frequently asked questions, for example, or even within a conversational flow to handle open-ended questions that were not previously mapped.

What we are going to do now is import a QA model into your Elastic Stack using our eland library, a Python Elasticsearch client for exploring and analyzing data in Elasticsearch. Eland includes some simple methods and scripts that let you pull models down from the Hugging Face model hub, an AI community for building, training and deploying open source machine learning models.

In this case we will import the model deepset/minilm-uncased-squad2 which is available on Hugging Face.

After the PyTorch model is uploaded into your cluster, you'll be able to allocate this model to a specific machine learning node, loading it into memory and starting a native libtorch process.

Once model allocation is complete, we are ready for inference, using the inference processor to evaluate the model. For this demo we are going to use a Christmas lyric as our context, which will allow us to ask questions to our favorite Christmas song.

Let’s start!


To prepare for the demo, we will need an Elasticsearch cluster running at least version 8.3 with an ML node.


Eland can be installed from PyPI via pip.

Before you go any further, make sure you have Python installed.

You can check this by running:

python3 --version

You should get an output like:
Python 3.8.8

Additionally, you’ll need to make sure you have pip available.

You can check this by running:

python3 -m pip --version

You should get an output like:
pip 22.2.2 from …

If you installed Python from source, with an official installer, or via Homebrew, you should already have pip.

If you don't have Python and pip installed, install them first.

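With that, Eland can be installed from PyPI via pip. The eland_import_hub_model script used later in this post also needs the PyTorch extras, so install them together:

python3 -m pip install 'eland[pytorch]'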

Endpoint information

To interact with your cluster through the API, you need to use your Elasticsearch cluster endpoint information.

The endpoint looks like:

Open your deployment settings to find your endpoint information.

Copy your endpoint; you'll need it later.

QA model

In this demo we will use a random QA model but feel free to import the model you want to use. You can read more details about this model on the Hugging Face webpage.

Copy the model name as in the image below.

Now that we have all the necessary information (the Elasticsearch cluster endpoint and the name of the model we want to import), let's proceed by importing the model:

Open your terminal and update the following command with your endpoint and model name:

eland_import_hub_model --url https://<user>:<password>@<hostname>:<port> \
--hub-model-id <model_name> \
--task-type <task_type>

In this case we are importing the deepset/minilm-uncased-squad2 model to run the question_answering task.
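If you would rather script the import, the command can be assembled in Python. This is only a sketch: build_import_command and the credential and hostname values are made up for illustration, and only the three flags shown above are used. Percent-encoding the user and password is an assumption that matters when they contain characters such as @ or /.

```python
import shlex
from urllib.parse import quote


def build_import_command(user, password, hostname, port, hub_model_id, task_type):
    """Return the eland_import_hub_model command as an argument list."""
    # Percent-encode credentials so characters such as '@' or '/' do not break the URL.
    url = f"https://{quote(user, safe='')}:{quote(password, safe='')}@{hostname}:{port}"
    return [
        "eland_import_hub_model",
        "--url", url,
        "--hub-model-id", hub_model_id,
        "--task-type", task_type,
    ]


# Placeholder values; replace them with your own endpoint details.
command = build_import_command(
    "elastic", "p@ss/word", "my-deployment.example.com", 9243,
    "deepset/minilm-uncased-squad2", "question_answering",
)
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run(command, check=True) once the placeholders are replaced.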

You will see that the Hugging Face model will be loaded directly from the model hub and then your model will be imported into Elasticsearch.

Wait for the process to end.

Let's check if the model was imported.

Open your Kibana Menu and click Machine Learning.

Under model management click Trained Models.

Your model needs to be on this list as shown in the image above. If it is not on this list, check whether there was an error message in the previous process.

If your model is on this list it means it was imported but now you need to start the deployment. To do this, under Actions click Start deployment or click the play icon.

After deploying, the State column will have the value started and under Actions the Start deployment option will be disabled, which means that the deployment is complete.
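The same state can also be read outside the UI. The sketch below assumes that the trained model statistics response (GET _ml/trained_models/<model_id>/_stats in Dev Tools) contains a trained_model_stats list whose entries carry deployment_stats.state. The deployment_state helper is hypothetical and the sample response is hand-written and trimmed, not real cluster output.

```python
def deployment_state(stats_response: dict, model_id: str):
    """Return the deployment state for model_id, or None if it is not deployed."""
    for entry in stats_response.get("trained_model_stats", []):
        if entry.get("model_id") == model_id:
            return entry.get("deployment_stats", {}).get("state")
    return None


# Hand-written sample shaped like the assumed response; not real cluster output.
sample = {
    "trained_model_stats": [
        {
            "model_id": "deepset__minilm-uncased-squad2",
            "deployment_stats": {"state": "started"},
        }
    ]
}
print(deployment_state(sample, "deepset__minilm-uncased-squad2"))
```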

Let's test our model!

Copy your model ID deepset__minilm-uncased-squad2.
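Note how the model ID differs from the Hugging Face name: the slash has become a double underscore. A minimal sketch of that mapping, assuming this replacement is the only change (to_es_model_id is a hypothetical helper, not part of eland):

```python
def to_es_model_id(hub_model_id: str) -> str:
    """Map a Hugging Face model name to the ID shown in Kibana (assumed rule)."""
    # Assumption: '/' becomes '__', as seen in this demo's model ID.
    return hub_model_id.replace("/", "__")


print(to_es_model_id("deepset/minilm-uncased-squad2"))
# deepset__minilm-uncased-squad2
```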

In the Kibana menu, click Dev Tools.

In this UI you will have a console to interact with your data.

Let's run inference against the trained model.

Christmas song

I chose to ask questions for this song: Mariah Carey - All I Want For Christmas Is You

Well, 2022 and No.1 again 🎄

POST _ml/trained_models/<model_id>/deployment/_infer
{
  "docs": [{"text_field": "<input>"}],
  "inference_config": {"question_answering": {"question": "<question_to_be_answered>"}}
}

This POST request contains a docs array with a field matching your configured trained model input; typically the field name is text_field. The text_field value is the input you want to run inference on. A QA model also requires a question in addition to the input text, so inference_config holds the inference configuration, which for a QA model is the question to ask about your text_field.
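When the context is long or contains quotes, it can be easier to build the request body in code and paste the resulting JSON into the console. A small sketch, where build_qa_request is a hypothetical helper that only mirrors the two keys described above:

```python
import json


def build_qa_request(context: str, question: str) -> dict:
    """Build the body for a question answering inference request."""
    return {
        "docs": [{"text_field": context}],
        "inference_config": {"question_answering": {"question": question}},
    }


body = build_qa_request(
    "All I want for Christmas is you",
    "What do I want for Christmas?",
)
# json.dumps takes care of escaping quotes and other special characters.
print(json.dumps(body, indent=2))
```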

In my example adding the lyrics of the song, this will be:

QUESTION: Do I care about presents?

POST _ml/trained_models/deepset__minilm-uncased-squad2/deployment/_infer
{
    "docs": [{"text_field": "I don't want a lot for Christmas There's just one thing I need I don't care about presents Underneath the Christmas tree I just want you for my own More than you could ever know Make my wish come true All I want for Christmas is you. I don't want a lot for Christmas There is just one thing I need I don't care about presents Underneath the Christmas tree I don't need to hang my stocking There upon the fireplace Santa Claus won't make me happy With a toy on Christmas day I just want you for my own More than you could ever know Make my wish come true All I want for Christmas is you You baby I won't ask for much this Christmas I won't even wish for snow I'm just gonna keep on waiting Underneath the mistletoe I won't make a list and send it To the North Pole for Saint Nick I won't even stay awake to Hear those magic reindeer click 'Cause I just want you here tonight Holding on to me so tight What more can I do Baby all I want for Christmas is you All the lights are shining So brightly everywhere And the sound of children's Laughter fills the air And everyone is singing I hear those sleigh bells ringing Santa won't you bring me the one I really need Won't you please bring my baby to me Oh, I don't want a lot for Christmas This is all I'm asking for I just want to see my baby Standing right outside my door Oh I just want him for my own More than you could ever know Make my wish come true Baby all I want for Christmas is you All I want for Christmas is you baby All I want for Christmas is you baby."}],
    "inference_config": {"question_answering": {"question": "Do I care about presents?"}}
}

Click the play icon to send the request.

The answer is shown by the object below:

{
  "predicted_value": "I don't care about presents Underneath the Christmas tree",
  "start_offset": 63,
  "end_offset": 120,
  "prediction_probability": 0.12568269871332696
}

The predicted_value contains your answer:

I don't care about presents Underneath the Christmas tree.

In addition, start_offset and end_offset record the start and end character offsets of your predicted_value, and prediction_probability gives the probability of this prediction.
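You can confirm how the offsets relate to the input by slicing the context yourself. This sketch uses only the opening lines of the lyrics and the values from the response above, and assumes the end offset is exclusive, which is consistent with 120 - 63 matching the length of the answer:

```python
# Opening lines of the context used in the request above.
context = (
    "I don't want a lot for Christmas There's just one thing I need "
    "I don't care about presents Underneath the Christmas tree "
    "I just want you for my own"
)
result = {
    "predicted_value": "I don't care about presents Underneath the Christmas tree",
    "start_offset": 63,
    "end_offset": 120,
}

# Slice the context with the returned offsets and compare with the answer.
span = context[result["start_offset"]:result["end_offset"]]
print(span == result["predicted_value"])  # True
```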

Let me ask another question... What do I want for Christmas?


Great! Thanks, Mariah Carey.

That's it, the model is working.

You can ask more questions:

  • Will Santa Claus make me happy?
  • Are the lights shining?

Or try using another song.

I hope you enjoy using NLP with the Elastic Stack! Feedback is always welcome.
