
Hey there
To my friends from the DotNet Users Group of Orlando (ONETUG), I really enjoyed our meetup this week. I always enjoy making new friends, growing community, and exploring the edge of technology. As promised, I wanted to share my code samples for Microsoft Semantic Kernel, publishing dockerized workloads to Google Cloud Run, and the related slides.
I also want to offer my thanks to the following friends.
– DotNet Users Group of Orlando (ONETUG). We appreciate the opportunity to co-host events together with the Google Developer Group of Central Florida.
– Thank you to Isabella and Employers for Change (E4C) for kindly hosting our meetup groups. If you’re looking for a technical internship, please make sure to connect with Isabella. In many key moments of my career, early-stage devs or interns have influenced positive outcomes on my projects. E4C tries to make this happen every day by connecting great company cultures with talented minds and hearts. She also offers thoughtful consulting services.
– We appreciate all the fine work of Tech Hub Orlando and InnovateOrlando. Make sure to check out their programs to grow the Orlando Startup community. They have a great event calendar!
Resources
Before we get into technical details, here are the resources I mentioned during the talk:
– Presentation Slides
– Rag and Chat examples with Blazor
– Join our GDG Central Florida Discord
– Join us for DevFest GemJam Hackathon – Oct 25
– Introducing Microsoft Agent Framework – https://azure.microsoft.com/en-us/blog/introducing-microsoft-agent-framework/
– Exploring Cloud Run and LangChain
– https://martendb.io/ – This is great for hackathons and innovation projects
– Chris Locurto Podcast – My favorite podcast on business leadership
If you’re just getting started with AI development and wondering what all the fuss is about RAG (Retrieval-Augmented Generation), you’re in the right place. Today we’re going to break down a real-world .NET project that shows you exactly how to build an AI chat system that can answer questions about your own documents.
Don’t worry if terms like “embeddings” or “vector databases” sound scary – by the end of this post, you’ll understand exactly what they are and how to use them in your .NET applications.
What Problem Are We Solving?
Picture this: You have a bunch of text files (maybe documentation, articles, or transcripts), and you want to build a chatbot that can answer questions about them.
The naive approach might be to just dump all your text into ChatGPT’s context window and hope for the best. But there are problems:
- ChatGPT has token limits (you can’t send huge amounts of text)
- It’s expensive to send lots of text every time
- The AI might get confused with too much information at once
RAG solves this by being smart about what information to show the AI. It’s like having a really good librarian who finds the relevant books before you start researching.
The Two-Step Dance: Ingestion + Retrieval
Our solution has two main parts:
- Ingestion (ContentIngestion/Program.cs) – Prepare your documents for AI consumption
- Retrieval & Generation (RagChatArea.razor) – Find relevant info and let AI answer questions
Let’s dive into each part!
Part 1: Document Ingestion – The Setup Phase
Understanding the Basic .NET Structure
Let’s start with ContentIngestion/Program.cs. If you’re familiar with .NET console applications, this should look pretty standard:
static async Task Main(string[] args)
{
    // Create configuration
    IConfigurationRoot config = new ConfigurationBuilder()
        .AddEnvironmentVariables()
        .AddUserSecrets<Program>(optional: true)
        .Build();

    // Create service collection
    var services = new ServiceCollection();
    ConfigureServices(services, config);

    // Build service provider
    using ServiceProvider serviceProvider = services.BuildServiceProvider();

    // Get the application instance from the service provider
    var app = serviceProvider.GetRequiredService<ConsoleApplication>();

    // Run the application
    await app.Run();
}
This is the standard pattern for a console app using Dependency Injection (DI). We’re:
- Setting up configuration (reading API keys, connection strings, etc.)
- Registering services in the DI container
- Building the container and running our app
The Key Services We’re Registering
In ConfigureServices, we register some important services:
// Register text embedding generation service and Postgres vector store.
string textEmbeddingModel = "text-embedding-3-small";
string openAiApiKey = configuration["OPENAI_API_KEY"];
string postgresConnectionString = configuration["DB_CONNECTION"];
services.AddOpenAITextEmbeddingGeneration(textEmbeddingModel, openAiApiKey);
services.AddPostgresVectorStore(postgresConnectionString);
What’s happening here?
– Text Embedding Service: This is our connection to OpenAI’s API that converts text into mathematical vectors
– Vector Store: A special database (PostgreSQL with pgvector extension) that can store and search through these vectors
For beginners: Think of embeddings as a way to convert text into numbers that capture the “meaning” of the text. Similar concepts end up with similar numbers.
Processing Files: The ContentFragmentMaker
Now let’s look at how we actually process text files. The ContentFragmentMaker class does something really important – it breaks big text files into smaller, manageable pieces:
public List<string> GetChunks(string text, int chunkSize, int overlapSize)
{
    List<string> chunks = [];
    int start = 0;

    while (start < text.Length)
    {
        int length = Math.Min(chunkSize, text.Length - start);
        chunks.Add(text.Substring(start, length));
        start += chunkSize - overlapSize;
    }

    return chunks;
}
Why do we chunk text?
- AI models have limits: You can’t send infinite text to AI models
- Better search: Smaller chunks make it easier to find specific information
- Overlap prevents lost context: The overlap ensures we don’t accidentally split important information
For beginners: Imagine trying to find a recipe in a cookbook. It’s easier to search through individual recipes than trying to scan the entire book at once.
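To make the overlap concrete, here is a minimal sketch of the same sliding-window idea as GetChunks, written in Python so it stays language-neutral (the function name and toy input are mine, not from the project):

```python
def get_chunks(text: str, chunk_size: int, overlap_size: int) -> list[str]:
    """Split text into chunk_size pieces; consecutive chunks share
    overlap_size characters so nothing is lost at a boundary."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Step forward by less than a full chunk to create the overlap.
        start += chunk_size - overlap_size
    return chunks

print(get_chunks("abcdefghij", chunk_size=4, overlap_size=2))
# ['abcd', 'cdef', 'efgh', 'ghij', 'ij'] – each chunk repeats
# the last two characters of the previous one
```

Note that overlap_size must stay smaller than chunk_size or the loop never advances – the same constraint applies to the C# version above.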
Text Cleaning
Before chunking, we clean up the text:
public string RemoveNonAlphanumeric(string input)
{
    return System.Text.RegularExpressions.Regex.Replace(input, @"[^a-zA-Z0-9\s]", "");
}

public string RemoveNewLines(string input)
{
    return input.Replace("\n", " ").Replace("\r", " ");
}
This removes special characters and normalizes whitespace. Think of it like preparing ingredients before cooking – you want clean, consistent input.
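To see the two cleaners working together, here is a quick Python equivalent (the function names are mine; the character class is the same one the C# regex uses):

```python
import re

def remove_non_alphanumeric(text: str) -> str:
    # Keep letters, digits, and whitespace; drop punctuation and symbols.
    return re.sub(r"[^a-zA-Z0-9\s]", "", text)

def remove_new_lines(text: str) -> str:
    # Normalize line endings into plain spaces.
    return text.replace("\n", " ").replace("\r", " ")

print(remove_non_alphanumeric(remove_new_lines("Hello,\nworld!")))
# Hello world
```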
The DataUploader: Where the Magic Happens
The DataUploader class is where we convert text into searchable vectors:
public async Task GenerateEmbeddingsAndUpload(
    string collectionName,
    IEnumerable<ContentItemFragment> fragments)
{
    var collection = vectorStore.GetCollection<string, ContentItemFragment>(collectionName);

    foreach (var fragment in fragments)
    {
        // Generate the text embedding.
        Console.WriteLine($"Generating embedding for fragment: {fragment.Id}");
        fragment.Embedding = await textEmbeddingGenerationService.GenerateEmbeddingAsync(fragment.Content);

        // Upload
        Console.WriteLine($"Upserting fragment: {fragment.Id}");
        await collection.UpsertAsync(fragment);
    }
}
What’s happening step by step:
- For each text chunk, call OpenAI’s API to get an embedding (array of numbers)
- Store both the original text AND the embedding in our vector database
- The database can now find similar chunks by comparing these number arrays
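"Comparing these number arrays" usually means cosine similarity – the model below declares cosine distance, which is just one minus this value. Here is a toy Python sketch with made-up three-dimensional vectors (real embeddings from text-embedding-3-small have 1536 dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """1.0 means identical direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Invented toy "embeddings" – similar concepts get similar vectors.
dog = [0.9, 0.1, 0.0]
puppy = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

print(cosine_similarity(dog, puppy) > cosine_similarity(dog, invoice))  # True
```

This is exactly why a question about "dogs" can surface a chunk that only mentions "puppies" – the vectors point in nearly the same direction even though the words differ.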
The ContentItemFragment Model
Let’s look at our data model:
public class ContentItemFragment
{
    [VectorStoreRecordKey(StoragePropertyName = "id")]
    public string Id { get; set; }

    [VectorStoreRecordData(StoragePropertyName = "content_item_id")]
    public Guid ContentItemId { get; set; }

    // Dimensions must match the embedding model's output size
    // (1536 for text-embedding-3-small).
    [VectorStoreRecordVector(Dimensions: 1536, DistanceFunction.CosineDistance, StoragePropertyName = "embedding")]
    public ReadOnlyMemory<float>? Embedding { get; set; }

    [VectorStoreRecordData(StoragePropertyName = "content")]
    [TextSearchResultValue]
    public string Content { get; set; } = string.Empty;

    [VectorStoreRecordData(StoragePropertyName = "source")]
    [TextSearchResultName]
    public string Source { get; set; } = string.Empty;
}
For beginners: These attributes tell the system:
- VectorStoreRecordKey: This is our primary key
- VectorStoreRecordVector: This field stores the embedding (the array of numbers)
- TextSearchResultValue: This is the actual text content we’ll show users
- TextSearchResultName: This is like a title or source reference
Part 2: The RAG Chat Interface – Where Users Interact
Now let’s look at RagChatArea.razor – this is a Blazor component that creates our chat interface.
Setting Up the Chat Brain
When the component initializes, it sets up the “search brain”:
protected override async Task OnInitializedAsync()
{
    string openAiApiKey = Configuration["OPENAI_API_KEY"];
    string modelId = "gpt-4o-mini";

    // Create a kernel with OpenAI chat completion
    var builder = Kernel.CreateBuilder();
    builder.Services.AddOpenAIChatCompletion(modelId, openAiApiKey);

    // Wrap the vector store collection in a text search service
    var vectorStoreRecordCollection = vectorStore.GetCollection<string, ContentItemFragment>("content_item_fragment");
    textSearch = new VectorStoreTextSearch<ContentItemFragment>(vectorStoreRecordCollection, textEmbeddingGeneration);
    kernel = builder.Build();

    // Build a text search plugin with vector store search and add to the kernel
    var searchPlugin = textSearch.CreateWithGetTextSearchResults("SearchPlugin");
    kernel.Plugins.Add(searchPlugin);
}
What’s happening here?
- Semantic Kernel Setup: Microsoft’s Semantic Kernel is like a Swiss Army knife for AI development
- Chat Completion: This connects to OpenAI’s GPT models for generating responses
- Search Plugin: This creates a search tool that can find relevant documents
- Plugin Registration: We add the search tool to our AI “kernel” so it can use it
For beginners: Think of Semantic Kernel as a framework that makes it easy to combine AI models with other tools (like search).
The Smart Prompt Template
Here’s where the real magic happens. Instead of just asking ChatGPT a question, we use a template that first searches our documents:
As an AI assistant named Chris, provide a concise and accurate answer to the user's question based on the information retrieved from the text search results below.
You should play the role of a leadership and business coach.
If the information is insufficient, respond with 'I don't know'.
{{#with (SearchPlugin-GetTextSearchResults query)}}
{{#each this}}
Name: {{Name}}
Value: {{Value}}
Link: {{Link}}
-----------------
{{/each}}
{{/with}}
{{query}}
Include citations to the relevant information where it is referenced in the response.
What’s this template doing?
- {{#with (SearchPlugin-GetTextSearchResults query)}} – This automatically searches our documents
- {{#each this}} – Loop through each relevant document found
- Show the AI the relevant content BEFORE asking it to answer
- Ask for citations so users know where information came from
For beginners: This is using Handlebars templating. The curly braces {{ }} are placeholders that get filled in with actual data.
The Chat Flow – Step by Step
When a user sends a message, here’s exactly what happens:
private async Task SendMessage()
{
    string message = userInput.Trim();
    userInput = string.Empty;

    // Add user message to history
    chatHistory.AddUserMessage(message);
    ...

    try
    {
        // Get response from AI
        await GetAssistantResponse(message);
    }
    catch (Exception ex)
    {
        chatHistory.AddAssistantMessage($"I encountered an error: {ex.Message}");
    }
    finally
    {
        isLoading = false;
        StateHasChanged();
    }
}
And in GetAssistantResponse:
private async Task GetAssistantResponse(string message)
{
    string promptTemplate = GetPromptTemplate();
    KernelArguments arguments = new() { { "query", message } };

    var result = await kernel.InvokePromptAsync(
        promptTemplate,
        arguments,
        templateFormat: HandlebarsPromptTemplateFactory.HandlebarsTemplateFormat,
        promptTemplateFactory: promptTemplateFactory
    );

    var chatResult = result.ToString();
    chatHistory.AddMessage(AuthorRole.Assistant, chatResult ?? string.Empty);
}
The step-by-step process:
- User types a question
- The question gets sent to our search plugin
- Search plugin converts the question to an embedding
- Database finds similar document chunks
- Relevant chunks get inserted into our prompt template
- The full prompt (with relevant docs) gets sent to ChatGPT
- ChatGPT responds based on the found documents
- User sees the response with citations
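The steps above can be compressed into a few lines of sketch code. This is a Python outline, not the actual Semantic Kernel API – every function passed in is an invented stand-in for the real pieces (embedding service, vector store, Handlebars template, chat completion):

```python
def answer(question, embed, search_similar, build_prompt, ask_llm):
    """Mirror the RAG flow: embed the question, retrieve similar
    fragments, build a grounded prompt, then ask the model."""
    query_vector = embed(question)              # steps 2-3
    fragments = search_similar(query_vector)    # step 4
    prompt = build_prompt(question, fragments)  # steps 5-6
    return ask_llm(prompt)                      # steps 7-8

# Toy stand-ins so the flow is runnable end to end.
result = answer(
    "What is RAG?",
    embed=lambda q: [float(len(q))],  # fake one-number "embedding"
    search_similar=lambda v: ["RAG = retrieval + generation"],
    build_prompt=lambda q, frags: f"Context: {frags}\nQuestion: {q}",
    ask_llm=lambda p: f"Answer based on: {p}",
)
print(result)
```

The key point the sketch makes: the model only ever sees the prompt built in the middle step, so its answer is grounded in whatever the search returned.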
Key .NET Concepts You Should Understand
Dependency Injection
services.AddOpenAITextEmbeddingGeneration(textEmbeddingModel, openAiApiKey);
services.AddPostgresVectorStore(postgresConnectionString);
We register services in the DI container so they can be injected where needed.
Async/Await Pattern
fragment.Embedding = await textEmbeddingGenerationService.GenerateEmbeddingAsync(fragment.Content);
AI operations take time, so we use async programming to avoid blocking the UI.
Configuration System
string openAiApiKey = Configuration["OPENAI_API_KEY"];
.NET’s configuration system lets us read settings from various sources (environment variables, user secrets, etc.).
Blazor Component Lifecycle
protected override async Task OnInitializedAsync()
Blazor components have lifecycle methods where we can set up our services.
Why This Architecture Works Well
Separation of Concerns: Ingestion and chat are separate – you could run ingestion as a batch job and chat as a web service.
Scalability: Vector search is fast even with thousands of documents.
Flexibility: Want to add new documents? Just run the ingestion process again.
Accuracy: The AI can only answer based on your documents, reducing hallucinations.
Common Gotchas for .NET Developers
- Don’t forget to install pgvector extension in your PostgreSQL database
- API costs add up – each embedding call costs money
- Chunk size matters – too small and you lose context, too big and search becomes less precise
- Always handle exceptions when calling external APIs
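For that last gotcha, a common pattern is retry with exponential backoff. Here is a small language-neutral sketch in Python – both the helper and the flaky call are invented for illustration, not part of the project:

```python
import time

def with_retries(call, attempts=3, base_delay=0.01):
    """Retry a flaky external call, doubling the delay each time."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries; surface the error to the caller
            time.sleep(base_delay * (2 ** attempt))

# Toy flaky API call: fails twice, then succeeds.
calls = {"count": 0}
def flaky_embedding_call():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("transient API hiccup")
    return [0.1, 0.2, 0.3]

print(with_retries(flaky_embedding_call))  # [0.1, 0.2, 0.3]
```

In production you would also want to retry only on transient errors (timeouts, rate limits) rather than every exception.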
Next Steps for Learning
If you want to build on this:
1. Experiment with chunk sizes – try different values and see how it affects search quality
2. Add metadata filtering – filter by document type, date, etc.
3. Implement hybrid search – combine vector search with traditional keyword search
4. Add document upload – let users upload their own files through the web interface
5. Improve error handling – add retry logic, better user feedback
Wrapping Up
RAG might sound complicated, but it’s really just three steps:
- Prepare: Break documents into chunks and convert to vectors
- Search: Find relevant chunks when users ask questions
- Generate: Let AI answer based on found information
The .NET ecosystem makes this surprisingly straightforward with libraries like Semantic Kernel and good database support. You don’t need to be an AI expert – you just need to understand how to connect the pieces together.
The key insight is that modern AI works best when you give it relevant, focused information rather than everything at once. RAG is just a systematic way to do that.
Happy coding!
Related Posts
– Building Intelligent Content Workflows with Google’s Agent Development Kit
– Building Simple Agents with .NET and CSharp
– Microsoft Build 2025 – Welcome To Open Agent Enabled Web
Learn more at DevFestFlorida.com