My task is to return, at the user's request, the passages I need from my restaurant menu dataset.
For example, I ask the question “do you have pizza WITHOUT mushrooms?”
I need a semantic analysis of the data to happen, so that the passages returned to me are menu items that contain NO mushrooms!
(The semantic analysis should occur precisely at the retriever stage, so that the content passed to the AI in the next step is not polluted with useless information.)
I'm not sure if this is exactly what you are asking, but one option that you have is to use a self-query retriever, which takes a query in natural language and translates it into an actual Elasticsearch query using an LLM. We have a Python notebook with an example implementation in our Search Labs repository on GitHub. The example uses LangChain with our Elasticsearch integration.
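Roughly, the setup could look something like the sketch below. The index name, the metadata fields and the models are made-up placeholders here, not taken from the notebook:

```python
# Rough sketch of a self-query retriever over a restaurant menu index.
# The index name, metadata fields and models below are made-up examples.
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_elasticsearch import ElasticsearchStore
from langchain.chains.query_constructor.base import AttributeInfo
from langchain.retrievers.self_query.base import SelfQueryRetriever

vectorstore = ElasticsearchStore(
    index_name="menu-items",
    embedding=OpenAIEmbeddings(),
    es_url="http://localhost:9200",
)

# Describe the metadata so the LLM knows which fields it can filter on.
metadata_field_info = [
    AttributeInfo(name="ingredients", type="list[string]",
                  description="Ingredients used in the dish"),
    AttributeInfo(name="spicy", type="boolean",
                  description="Whether the dish is spicy"),
]

retriever = SelfQueryRetriever.from_llm(
    llm=ChatOpenAI(temperature=0),
    vectorstore=vectorstore,
    document_contents="Items from a restaurant menu",
    metadata_field_info=metadata_field_info,
)

# The LLM translates "without mushrooms" into a metadata filter,
# so only matching documents come back from Elasticsearch.
docs = retriever.invoke("do you have pizza without mushrooms?")
```

The interesting part is that the exclusion is applied at the retrieval stage, before anything reaches the generation step.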
Hi, @Miguel_Grinberg !
Thanks for your response. My project is a voice assistant and response time is critical, so I would not like to use rephrasing or any other extra requests to the LLM, due to their high time cost.
Wondering if there are any other directions I could dive into to achieve my goal.
Maybe I don't have all the context that you have, but what I'm proposing involves a single call to the LLM. Basically the sequence goes like this:
1. The LLM receives the query from the user, enhanced with a prompt that asks for a translation to an Elasticsearch query.
2. The generated query is sent to Elasticsearch, and you get your results back.
That is it, no extra LLM calls, just the one.
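If you prefer to skip LangChain, the same flow in plain Python could look roughly like this. The prompt, the model, the index name and its fields are just placeholders to illustrate the single LLM call:

```python
# Sketch of the single-LLM-call flow: question -> Elasticsearch query -> results.
# The prompt, model, index name and fields are illustrative placeholders.
import json

from elasticsearch import Elasticsearch
from openai import OpenAI

es = Elasticsearch("http://localhost:9200")
llm = OpenAI()

SYSTEM_PROMPT = (
    "Translate the user's question into an Elasticsearch query. "
    "The index 'menu-items' has the fields: name (text), description (text) "
    "and ingredients (keyword). Respond with the JSON query body only."
)

def search_menu(question: str) -> dict:
    # The one and only LLM call: turn the question into a query body.
    response = llm.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
    )
    query_body = json.loads(response.choices[0].message.content)

    # Run the generated query against Elasticsearch.
    return es.search(index="menu-items", body=query_body)

results = search_menu("do you have pizza without mushrooms?")
```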
If this does not work and you prefer not to incorporate a GenAI step in the process, then I guess you could build a query language for your domain, but this would have to be custom. You could write the query as "pizza without mushrooms", let's say, and the word "without" would be recognized as a keyword, with the left and right sides assumed to be items. I do not know of any off-the-shelf solution for building this type of query language, so you may need to roll your own.
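A very naive version of such a parser could look something like this (the index name and the "name"/"ingredients" fields are made-up):

```python
# Naive sketch of a tiny domain query language: "<item> without <ingredient>".
# The index name and the "name"/"ingredients" fields are made-up examples.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

def build_menu_query(text: str) -> dict:
    # Split on the keyword "without": the left side is the item to match,
    # the right side is the ingredient to exclude.
    item, _, excluded = text.partition(" without ")
    query = {"bool": {"must": [{"match": {"name": item}}]}}
    if excluded:
        query["bool"]["must_not"] = [{"match": {"ingredients": excluded}}]
    return query

# "pizza without mushrooms" -> match pizzas, drop anything with mushrooms.
results = es.search(index="menu-items",
                    query=build_menu_query("pizza without mushrooms"))
```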
Thanks @Miguel_Grinberg, I get your point.
Since I'm building a conversational bot, I want human-like answers generated by the LLM.
So currently I have a simple RAG flow: find docs, send them to the LLM, get a beautiful answer. And it works nicely for common questions like "Do you have pizza with pineapple?".
But for exclusion questions (not spicy, without some ingredient) or follow-up questions (is it good for children, suggest another option), the vector search results are not appropriate, which is to be expected.
For follow-up questions I understand that I can play with the context and use some tricky injection/filtering over what is passed to the LLM, but with exclusions I feel stuck...
Okay, so if you are going to use a RAG approach there are no magical solutions. First of all you need to make sure that your vector search turns up relevant results from which an answer can be summarized. If this does not happen then I guess you need to find better embeddings, or maybe go with a hybrid search approach that adds results from a standard BM25 search to those coming from vector similarity.
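Just to illustrate the shape of a hybrid request, here is a sketch that combines a BM25 match with a kNN vector search (the index name, fields and how you compute the question embedding are placeholders):

```python
# Sketch of a hybrid search: BM25 match combined with kNN vector similarity.
# The index name, fields and the embedding of the question are placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

def hybrid_search(question: str, question_vector: list[float]) -> dict:
    return es.search(
        index="menu-items",
        # Lexical side: a standard BM25 match on the item description.
        query={"match": {"description": {"query": question, "boost": 0.5}}},
        # Vector side: kNN search on the embedded question.
        knn={
            "field": "description_vector",
            "query_vector": question_vector,
            "k": 10,
            "num_candidates": 50,
            "boost": 0.5,
        },
    )
```

The scores of the two parts are combined, so documents that do well on both the lexical and the vector side rank highest.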
Maybe you have to ask your customers what they want to eat in general terms first (i.e. "pizza"), so that you can do a more generic retrieval, basically so that the LLM has more info to play with. Then get the details (i.e. "without mushrooms") as a follow-up that does not affect, or only complements, the initial retrieval.