RAG - how is it used? - Context window - does it work out of the box?

Chenko · March 8, 2024, 11:33am

I am wondering how RAG is used. I see it has a context window to which context is added, but what is this context?

Another question I have: does RAG work out of the box?
If so, how?
If not, what would need to be changed or programmed to make it work?

iulia · March 12, 2024, 10:20am

Hi!

I'll give a general overview of how it works and link some articles to get you started.

RAG stands for Retrieval Augmented Generation.
It combines two concepts:

Generative AI - where a model (usually an LLM) takes in a question and generates an answer for it.
Information Retrieval - searching for relevant documents within your database/index based on a query.
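To make the retrieval half concrete, here is a minimal sketch in plain Python. It ranks documents by how many words they share with the query, which is only a toy stand-in for a real search engine; the function names and sample documents are made up for illustration.

```python
def score(query: str, document: str) -> int:
    """Count how many distinct query words also appear in the document."""
    query_words = set(query.lower().split())
    document_words = set(document.lower().split())
    return len(query_words & document_words)


def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return up to top_k documents with the highest score, skipping non-matches."""
    # sorted() is stable, so documents with equal scores keep their original order.
    ranked = sorted(documents, key=lambda doc: score(query, doc), reverse=True)
    return [doc for doc in ranked[:top_k] if score(query, doc) > 0]


# Hypothetical sample data, not taken from any real index.
docs = [
    "Elasticsearch supports vector search",
    "Bananas are rich in potassium",
    "Vector search finds similar documents",
]
results = retrieve("how does vector search work", docs)
print(results)
```

A real setup would swap the word-overlap score for a proper ranking method such as BM25 or vector similarity, but the shape stays the same: a query goes in, the most relevant documents come out.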

Putting those together: when a user asks a question, you first retrieve relevant information that might help answer that question. Then you give this information (for example data entries from your database, or some recent articles or news) as context to the LLM.

So now you can generate an answer based not only on the initial question but also on the additional information you retrieved, which usually gives more accurate responses.
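As for what "the context" actually is: in practice it is usually just the retrieved text pasted into the prompt next to the question. A minimal sketch, assuming a simple prompt template of my own (the wording of the template and the name `build_prompt` are illustrative, not a standard):

```python
def build_prompt(question: str, retrieved_docs: list[str]) -> str:
    """Combine retrieved documents and the user's question into one prompt."""
    # Number each document so the model (and a human reader) can tell them apart.
    context = "\n".join(
        f"[{i}] {doc}" for i, doc in enumerate(retrieved_docs, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Hypothetical example documents.
prompt = build_prompt(
    "When was the service last updated?",
    ["The service was updated on Monday.", "Release notes are published weekly."],
)
print(prompt)
```

Everything in that string counts against the model's context window, which is why you normally pass only the top few documents rather than your whole index.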

Check out this blog for full definitions and breakdown: What is Retrieval Augmented Generation (RAG)? | A Comprehensive RAG Guide | Elastic

As for whether it works out of the box: you need to make some choices for your setup, but you can use out-of-the-box components. You can choose which LLM to use for the generative part, which search engine setup to use for the information retrieval, and how you're going to put it together and serve it to the user.
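One way to picture those choices is to treat the retriever and the LLM as two plug-in functions. The sketch below is only an illustration of that idea: `make_rag`, `fake_retriever`, and the two fake LLMs are made-up names, and the stand-ins return canned strings instead of calling a real service.

```python
from typing import Callable

Retriever = Callable[[str], list[str]]  # question -> relevant documents
Generator = Callable[[str], str]        # prompt -> generated answer


def make_rag(retrieve: Retriever, generate: Generator) -> Callable[[str], str]:
    """Wire any retriever and any generator into a single answer function."""
    def answer(question: str) -> str:
        docs = retrieve(question)
        prompt = "Context:\n" + "\n".join(docs) + f"\n\nQuestion: {question}"
        return generate(prompt)
    return answer


# Stand-ins for a real search engine and two different real LLMs.
def fake_retriever(question: str) -> list[str]:
    return ["RAG combines retrieval with generation."]


def fake_llm_a(prompt: str) -> str:
    return "A: " + prompt


def fake_llm_b(prompt: str) -> str:
    return "B: " + prompt


rag_a = make_rag(fake_retriever, fake_llm_a)
rag_b = make_rag(fake_retriever, fake_llm_b)
print(rag_a("What is RAG?"))
print(rag_b("What is RAG?"))
```

Swapping the LLM or the search backend then means passing a different function, while the glue code stays the same.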

Here is an example with Elasticsearch, Cohere, and Amazon Bedrock: Retrieval Augmented Generation using Cohere Command model through Amazon Bedrock and domain data in Elasticsearch — Elastic Search Labs

Chenko · March 12, 2024, 10:35am

Thanks for the informative response!

I just have one more question, to be sure I understand.
Does RAG just do the following?

1. The user types in a query.
2. A call is made to ES (with vector search).
3. The top X documents are used as context for the LLM.
4. An API call is sent to the LLM with the user's original question and the retrieved documents.
5. The LLM returns its response.

Is that all?

iulia · March 12, 2024, 11:18am

In essence, yes. Just to make it clear: RAG is a framework or concept, not an actual tool. So as long as you're implementing those steps in some way, you are indeed performing RAG.
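The five steps listed above can be sketched roughly as follows. Treat this as a sketch under assumptions, not a reference implementation: the field names `embedding` and `text` are hypothetical, `embed`, `search`, and `call_llm` are placeholders you would back with your embedding model, an Elasticsearch client, and your LLM provider, and the stand-ins at the bottom return canned data so the example runs without any services. Embedding the question is an extra step that vector search implies. The request body follows the shape of the Elasticsearch kNN search option.

```python
from typing import Callable


def build_knn_body(query_vector: list[float], top_x: int) -> dict:
    """Step 2: request body for an Elasticsearch kNN (vector) search."""
    return {
        "knn": {
            "field": "embedding",         # hypothetical dense_vector field name
            "query_vector": query_vector,
            "k": top_x,
            "num_candidates": top_x * 10,  # arbitrary choice for this sketch
        },
        "_source": ["text"],              # hypothetical field holding the text
    }


def rag_answer(
    question: str,                         # step 1: the user's query
    embed: Callable[[str], list[float]],
    search: Callable[[dict], dict],
    call_llm: Callable[[str], str],
    top_x: int = 3,
) -> str:
    query_vector = embed(question)
    response = search(build_knn_body(query_vector, top_x))   # step 2
    hits = response["hits"]["hits"][:top_x]                  # step 3
    context = "\n".join(hit["_source"]["text"] for hit in hits)
    prompt = f"Context:\n{context}\n\nQuestion: {question}"  # step 4
    return call_llm(prompt)                                  # step 5


# Canned stand-ins so the sketch runs offline.
def fake_embed(text: str) -> list[float]:
    return [0.1, 0.2, 0.3]


def fake_search(body: dict) -> dict:
    texts = ["RAG retrieves documents first.", "Then an LLM answers.", "Unused hit."]
    return {"hits": {"hits": [{"_source": {"text": t}} for t in texts]}}


def fake_llm(prompt: str) -> str:
    return prompt  # echoes the prompt instead of generating an answer


answer = rag_answer("What is RAG?", fake_embed, fake_search, fake_llm, top_x=2)
print(answer)
```

Which model embeds the question, how many documents you pass along, and how the prompt is worded are all choices left to your setup.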

Chenko · March 12, 2024, 11:30am

Perfect, thanks!

system · April 9, 2024, 11:30am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
