I am wondering how RAG is used. I see it has a context window to which context is added, but what is this context?
Another question I have: does RAG work out of the box?
If so, how?
If not, what would need to be changed or programmed to make it work?
I'll go over a general overview of how it works and link some articles to get you started.
RAG stands for Retrieval-Augmented Generation.
It combines two concepts:
Generative AI - where a model (usually an LLM) takes in a question and generates an answer for it.
Information Retrieval - searching for relevant documents within your database/index based on a query.
Putting those together: when a user asks a question, you first retrieve relevant information that might help answer it, then you give this information (maybe data entries from your database, some recent articles or news, etc.) to the LLM as context.
So now the answer is generated not only from the initial question but also from the additional information you retrieved, which tends to give more accurate responses.
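To make "context" concrete, here is a minimal sketch of the retrieve-then-prompt step in Python. Everything in it is an assumption for illustration: the `DOCUMENTS` list stands in for your database or index, `retrieve` uses naive word overlap instead of a real search engine, and `build_prompt` is just one possible prompt layout. The resulting string is what you would send to whichever LLM you choose.

```python
import re

# Hypothetical toy corpus standing in for your database or search index.
DOCUMENTS = [
    "Elasticsearch is a distributed search and analytics engine.",
    "RAG retrieves relevant documents and passes them to an LLM as context.",
    "Kibana is used to visualize data stored in Elasticsearch.",
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, k=2):
    """Return the k documents sharing the most words with the question.

    This naive overlap score is only a stand-in for a real search engine.
    """
    q_words = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q_words & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_docs):
    """Place the retrieved documents in the prompt as context for the LLM."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )


question = "How does RAG use context?"
top_docs = retrieve(question, DOCUMENTS)
prompt = build_prompt(question, top_docs)
print(prompt)  # this text is what gets sent to the LLM
```

The "context" in the question is simply the retrieved text pasted into the prompt ahead of the user's question.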
As for whether it works out of the box: you need to make some choices for your setup, but you can use out-of-the-box components. You choose which LLM to use for the generative part, which search engine setup to use for the information retrieval, and how you're going to put it together and serve it to the user.
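One way to picture those choices is as pluggable parts. The sketch below is hypothetical: `RagPipeline`, `fake_retriever`, and `fake_llm` are made-up names, and the two stub functions would be replaced by calls to your actual search engine and LLM.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RagPipeline:
    """Glue code: any retriever and any generator with these shapes can be plugged in."""

    retriever: Callable[[str], List[str]]  # question -> relevant documents
    generator: Callable[[str], str]        # prompt -> generated answer

    def answer(self, question: str) -> str:
        docs = self.retriever(question)
        prompt = "Context:\n" + "\n".join(docs) + f"\n\nQuestion: {question}"
        return self.generator(prompt)


def fake_retriever(question: str) -> List[str]:
    # Stub: a real setup would query your search engine here.
    return ["RAG stands for Retrieval-Augmented Generation."]


def fake_llm(prompt: str) -> str:
    # Stub: a real setup would call your chosen LLM here.
    return f"(stub answer based on a prompt of {len(prompt)} characters)"


pipeline = RagPipeline(retriever=fake_retriever, generator=fake_llm)
result = pipeline.answer("What does RAG stand for?")
print(result)
```

Swapping the LLM or the search backend then only means passing a different function; the retrieve-then-generate flow stays the same.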
In essence, yes. Just to make it clear: RAG is a framework or concept, not an actual tool. So as long as you're implementing those steps in some way, you are indeed performing RAG.