
How to create your own RAG applications in R

Thursday, July 17, 2025, 11:00, by InfoWorld
One of the handiest tasks large language models can do for us is answer questions about a specific collection of information. This is often done using a technique called RAG, or retrieval augmented generation. Instead of relying on what the model knows from its training data, a RAG application searches for the most relevant parts of a document collection, then uses only those text chunks as context for the LLM’s response.

Now, thanks to some relatively new R packages, it’s easy to create your own RAG applications in R. You can even combine RAG with conventional dplyr-like filtering to make responses more relevant, although that requires additional setup and code.

This tutorial gets you started creating RAG applications in R. First, we’ll cover how to prepare, chunk, store, and query a document with basic RAG, using information about Workshops for Ukraine for our demo. You’ll quickly be able to ask general questions like “Tell me three workshops that would help me improve my R data visualization skills” and get a relevant response. Next, we’ll layer on some pre-filtering to answer slightly more specific questions like “What R-related workshops are happening next month?”

More about ragnar
See my introduction to 3 of the best LLM integration tools for R for an overview of RAG for R.

The 5 steps of building a RAG app

There are five basic steps for building a RAG application with the ragnar and ellmer R packages:

Turn documents into a markdown format that ragnar can process.

Split the markdown text into chunks, optionally adding any metadata you might want to filter on (we won’t do the optional part yet).

Create a ragnar data store and insert your markdown chunks into the store. That insertion process automatically adds an embedding for each chunk (an embedding is a long vector of numbers that represents a text chunk’s semantic meaning).

Embed a query and retrieve text chunks that are most relevant to that query.

Send those chunks along with the original query to an LLM and ask the model to generate a response.
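In ragnar and ellmer terms, the five steps above map onto a handful of function calls. Here is a minimal end-to-end sketch; the URL, store path, and model names are placeholders, and while the function names follow ragnar’s and ellmer’s documented APIs, double-check them against your installed versions, since both packages are evolving quickly:

```r
library(ragnar)
library(ellmer)

md <- read_as_markdown("https://example.com/page.html")  # 1. document -> markdown
chunks <- markdown_chunk(md)                             # 2. markdown -> chunks

store <- ragnar_store_create(                            # 3. store + embeddings
  "my_store.duckdb",
  embed = \(x) embed_openai(x, model = "text-embedding-3-small")
)
ragnar_store_insert(store, chunks)
ragnar_store_build_index(store)

question <- "Tell me three workshops about data visualization"
relevant <- ragnar_retrieve(store, question)             # 4. retrieve relevant chunks

chat <- chat_openai(model = "gpt-4.1-mini")              # 5. generate an answer
chat$chat(paste0(
  "Answer using only this context:\n\n",
  paste(relevant$text, collapse = "\n\n"),
  "\n\nQuestion: ", question
))
```

The rest of this tutorial walks through each of these calls with real arguments.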

Let’s get started!

Set up your development environment

To start, you will need to install the ellmer and ragnar packages if you want to follow the examples. ellmer is the main tidyverse R package for using large language models in R. ragnar is specifically designed for RAG and works with ellmer.

I suggest installing the latest development versions of both—especially ragnar, since useful new features are being added somewhat frequently. You can do that with pak::pak('tidyverse/ragnar') and pak::pak('tidyverse/ellmer'). I’m also using the dplyr, purrr, stringr, and rio R packages, which can all be installed from CRAN with install.packages().

I’ll be using OpenAI models both to generate embeddings and ask questions, so you’ll need an OpenAI API key to use the example code. If you want to use an Anthropic or Google Gemini model to generate the final answers, you’ll also need an API key from that provider. While it’s possible to run the example with a local LLM using ollama, your results may not be as good.
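ellmer and ragnar read your OpenAI key from the OPENAI_API_KEY environment variable. One way to set it (the key value below is a placeholder) is in your ~/.Renviron file, which you can open with usethis::edit_r_environ():

```r
# In ~/.Renviron (restart R afterward):
# OPENAI_API_KEY=sk-your-key-here

# Or, for the current session only:
Sys.setenv(OPENAI_API_KEY = "sk-your-key-here")  # placeholder value
```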

ragnar updates
ragnar added a new data store architecture just prior to publication in July 2025, to support more sophisticated text chunking and retrieval. Thanks to package creator Tomasz Kalinowski at Posit for his help updating some of the code in this article.

Steps 1 and 2: Wrangle the ‘Workshops for Ukraine’ data

Workshops for Ukraine is a two-hour data science webinar series where volunteers teach a specific topic or skill, often R-related. The goal is to raise money for Ukraine, so participants donate at least $20 or €20 to one of several charities. Participants can attend live or get access to past recordings and materials.

The workshops are listed on a single web page hosted on Google Sites. Our first task is to import the web page using ragnar, which includes several functions for importing web pages and other document formats such as PDFs, Word, and Excel.

In the code below, read_as_markdown() converts the web page into markdown, then markdown_chunk() splits that into chunks. The segment_by_heading_levels = 3 argument splits the text using the original HTML H3 headers, so that each new row is a workshop.

library(ragnar)
library(dplyr, warn.conflicts = FALSE)
library(stringr)

workshop_url <- "..."  # the Workshops for Ukraine page URL (truncated in the original)

chunks <- read_as_markdown(workshop_url) |>
  markdown_chunk(
    target_size = NA,
    segment_by_heading_levels = 3
  ) |>
  filter(str_starts(text, '### '))

Why did I use H3s to split the HTML text? Because I examined the workshop page’s HTML structure, and each workshop had its own H3 header. Always check a page’s structure first, since other web pages may be organized differently.

The final filter deletes any rows without a level-3 heading, because those aren’t workshops.

Figure 1: Data frame generated by the read_as_markdown() and markdown_chunk() functions.

The resulting data frame has columns for text, context (header and potentially other information), and start and end locations. The start and end locations help ragnar handle chunk overlapping, which can help retain semantic meaning across text segments.
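Before building a store, it’s worth sanity-checking the chunking. Assuming you saved the result of the pipeline above in a variable — here called chunks, a name I’m introducing — a quick inspection might look like:

```r
nrow(chunks)                    # roughly one row per workshop
names(chunks)                   # should include text, context, start, and end
substr(chunks$text[1], 1, 100)  # peek at the first chunk's text
```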

Step 3: Create a data store and insert chunks

Now I’m ready to create a data store and add my chunks. The code below creates a simple ragnar data store that is set up to use OpenAI’s text-embedding-3-small model when creating embeddings for each chunk. (If you’d rather run everything locally, embed_ollama() tells ragnar to use a local Ollama embedding model instead, assuming one is installed on your system.) ragnar uses DuckDB for its data store database.

# Reconstructed from the surrounding description; the original listing was truncated
store_file_location <- "..."  # path for the DuckDB store file (value truncated in the original)

store <- ragnar_store_create(
  store_file_location,
  embed = \(x) embed_openai(x, model = "text-embedding-3-small")
)
ragnar_store_insert(store, chunks)  # `chunks` is the chunked data frame created earlier
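With the store populated, steps 4 and 5 are retrieval and generation. Here is a hedged sketch, assuming a store object created as above; ragnar_retrieve() and chat_openai() are documented ragnar and ellmer functions, but the top_k value and model name are placeholders you should tune:

```r
library(ragnar)
library(ellmer)

# Build the search index after inserting chunks
ragnar_store_build_index(store)

# Step 4: embed the question and pull back the closest chunks
question <- "Tell me three workshops that would help me improve my R data visualization skills"
relevant <- ragnar_retrieve(store, question, top_k = 6)

# Step 5: hand the retrieved chunks plus the question to an LLM
chat <- chat_openai(model = "gpt-4.1-mini")  # placeholder model name
chat$chat(paste0(
  "Use only the context below to answer.\n\nContext:\n",
  paste(relevant$text, collapse = "\n\n---\n\n"),
  "\n\nQuestion: ", question
))
```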
Source: https://www.infoworld.com/article/4020484/generative-ai-rag-comes-to-the-r-tidyverse.html

