RAG - how is it used? - Context window - does it work out of the box?

I am wondering how RAG is used. I see it has a context window to which it adds context, but what is this context?

Another question I have: does RAG work out of the box?
If so, how?
If not, what would need to be changed/programmed to make it work?

Hi!

I'll give a general overview of how it works and link some articles to get you started.

RAG stands for Retrieval Augmented Generation.
It combines two concepts:

  • Generative AI - where a model (usually an LLM) takes in a question and generates an answer for it.
  • Information Retrieval - searching for relevant documents within your database/index based on a query.

Putting those together: when a user asks a question, you first retrieve relevant information that might help answer it, then you give this information (data entries from your database, some recent articles or news, etc.) to the LLM as context. That retrieved text is the "context": it is placed into the prompt alongside the user's question.

So now you can generate an answer based not only on the initial question but also on the additional information you retrieved, which typically gives more accurate responses.
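To make "context" concrete, here is a minimal sketch of how retrieved documents might be combined with the question into a single prompt. The function name `build_prompt`, the prompt wording, and the sample documents are all made up for illustration; real setups vary in how they format this.

```python
def build_prompt(question: str, documents: list[str]) -> str:
    """Combine retrieved documents and the user's question into one prompt.

    Hypothetical helper: the instruction text and layout are assumptions,
    not a format required by any particular LLM.
    """
    # Number each document so the model (and you) can tell them apart.
    context = "\n\n".join(
        f"[{i}] {doc}" for i, doc in enumerate(documents, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Made-up documents standing in for whatever your search returned.
retrieved = [
    "Our support desk is open Monday to Friday.",
    "Refunds are processed within 14 days.",
]
prompt = build_prompt("How long do refunds take?", retrieved)
print(prompt)
```

The string in `prompt` is what gets sent to the LLM, so the size of the model's context window limits how much retrieved text you can include.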

Check out this blog for full definitions and breakdown: What is Retrieval Augmented Generation (RAG)? | A Comprehensive RAG Guide | Elastic

As for whether it works out of the box: you need to make some choices for your setup, but you can use out-of-the-box components. You choose which LLM to use for the generative part, which search engine setup to use for the information retrieval, and how you're going to put it together and serve it to the user.

Here is an example with Elasticsearch, Cohere, and Amazon Bedrock: Retrieval Augmented Generation using Cohere Command model through Amazon Bedrock and domain data in Elasticsearch — Elastic Search Labs


Thanks for the informational response!

I just have one more question, to be clear.
Does RAG just do the following?

  1. The user types in a query
  2. Elasticsearch call (with vector search)
  3. Use the top X documents as context for the LLM
  4. Send an API call to the LLM with the user's original question and the retrieved documents
  5. The LLM returns its response

Is that all?

In essence, yes. Just to be clear: RAG is a framework/concept, not an actual tool. So as long as you're implementing those steps in some way, you are indeed performing RAG.
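As a rough illustration of those five steps, here is a self-contained toy sketch. Everything in it is a stand-in: `embed` is a bag-of-words counter rather than a real embedding model, `retrieve` scans an in-memory list instead of calling Elasticsearch, and `fake_llm` is a placeholder where the API call to a real LLM would go. All names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Callable

# Made-up "index"; in practice these would live in your search engine.
DOCUMENTS = [
    "Elasticsearch supports vector search through dense vector fields.",
    "RAG combines information retrieval with a generative model.",
    "Bananas are a good source of potassium.",
]


def embed(text: str) -> Counter:
    """Toy embedding: word counts (assumption, replaces a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], top_x: int = 2) -> list[str]:
    """Steps 2-3: rank documents by similarity and keep the top X."""
    query_vec = embed(query)
    ranked = sorted(
        documents, key=lambda d: cosine(query_vec, embed(d)), reverse=True
    )
    return ranked[:top_x]


def fake_llm(prompt: str) -> str:
    """Step 5 placeholder: a real setup would call an LLM API here."""
    return f"(model answer based on a {len(prompt)}-character prompt)"


def rag_answer(
    query: str, llm: Callable[[str], str] = fake_llm, top_x: int = 2
) -> str:
    """Steps 1-5: take a query, retrieve context, prompt the LLM."""
    context_docs = retrieve(query, DOCUMENTS, top_x)
    context = "\n".join(f"- {doc}" for doc in context_docs)
    # Step 4: the question and the retrieved documents go in one prompt.
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return llm(prompt)


print(rag_answer("What does RAG combine?"))
```

Because `retrieve` and `llm` are ordinary functions, either one can be swapped for a real search call or a real model client without changing the overall flow, which is the sense in which RAG is a pattern rather than a tool.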


Perfect, thanks!
