RAGSharp 1.0.0
dotnet add package RAGSharp --version 1.0.0
NuGet\Install-Package RAGSharp -Version 1.0.0
<PackageReference Include="RAGSharp" Version="1.0.0" />
<PackageVersion Include="RAGSharp" Version="1.0.0" />
<PackageReference Include="RAGSharp" />
paket add RAGSharp --version 1.0.0
#r "nuget: RAGSharp, 1.0.0"
#:package RAGSharp@1.0.0
#addin nuget:?package=RAGSharp&version=1.0.0
#tool nuget:?package=RAGSharp&version=1.0.0
RAGSharp: Lightweight RAG Pipeline for .NET
RAGSharp is a lightweight, extensible Retrieval-Augmented Generation (RAG) library built entirely in C#. It provides clean, minimal, and predictable components for building RAG pipelines: document loading, token-aware text splitting, vector storage, and semantic search.
π Why RAGSharp?
Choose RAGSharp if you want:
- Just the RAG essentials - load, chunk, embed, search
- Local-first - Works with any OpenAI-compatible API (OpenAI, LM Studio, Ollama, vLLM, etc.) out of the box.
- No database required - File-based storage out of the box, extensible to Postgres/Qdrant/etc.
- Simple and readable - Straightforward interfaces you can understand quickly
RAGSharp doesnβt aim to replace frameworks like Semantic Kernel or Kernel Memory β instead, it is a minimal, local-friendly pipeline you can understand in one sitting and drop into an existing app.
π¦ Installation
dotnet add package RAGSharp
Dependencies:
OpenAI- Official SDK for embeddingsSharpToken- GPT tokenization for accurate chunkingHtmlAgilityPack- HTML parsing for web loadersMicrosoft.Extensions.Logging- Logging abstractions
β¨ Quick Start
Load documents, index them, and search - in under 15 lines:
using RAGSharp.Embeddings.Providers;
using RAGSharp.IO;
using RAGSharp.RAG;
using RAGSharp.Stores;
//load doc
var docs = await new FileLoader().LoadAsync("sample.txt");
var retriever = new RagRetriever(
embeddings: new OpenAIEmbeddingClient(
baseUrl: "http://127.0.0.1:1234/v1",
apiKey: "lmstudio",
defaultModel: "text-embedding-3-small"
),
store: new InMemoryVectorStore()
);
//ingest
await retriever.AddDocumentsAsync(docs);
//search
var results = await retriever.Search("quantum mechanics", topK: 3);
foreach (var r in results)
Console.WriteLine($"{r.Score:F2}: {r.Content}");
This example uses a local embedding model hosted by (LM Studio) and an in-memory vector store.
βοΈ Architecture.
Every RAG pipeline in RAGSharp follows this flow. Each part is pluggable:
[ DocumentLoader ] β [ TextSplitter ] β [ Embeddings ] β [ VectorStore ] β [ Retriever ]
Thatβs it β the essential building blocks for RAG, without the noise.
π Extensibility
Built on simple interfaces.
For Embeddings:
public interface IEmbeddingClient
{
Task<float[]> GetEmbeddingAsync(string text);
Task<IReadOnlyList<float[]>> GetEmbeddingsAsync(IEnumerable<string> texts);
}
For Vector Store:
public interface IVectorStore
{
Task AddAsync(VectorRecord item);
Task AddBatchAsync(IEnumerable<VectorRecord> items);
Task<IReadOnlyList<SearchResult>> SearchAsync(float[] query, int topK);
bool Contains(string id);
}
No vendor lock-in. Want Claude or Gemini embeddings? Cohere? A Postgres or Qdrant backend? Just implement the interface and drop it in.
π Core Components
1. Document Loading (RAGSharp.IO)
Loaders fetch raw data and convert it into a list of Document objects.
| Loader | Description |
|---|---|
FileLoader |
Load a single text file. |
DirectoryLoader |
Load all files matching a pattern from a directory. |
UrlLoader |
Scrapes a web page (cleans HTML, removes noise) and extracts plain text. |
WebSearchLoader |
Searches and loads content directly from Wikipedia articles. |
Custom loaders: Implement IDocumentLoader for PDFs, Word docs, databases, etc.
2. Embeddings & Tokenization (RAGSharp.Embeddings)
| Component | Description |
|---|---|
OpenAIEmbeddingClient |
Included. Works with any OpenAI-compatible API |
IEmbeddingClient |
2-method interface for custom providers (Claude, Gemini, Cohere, etc.) |
SharpTokenTokenizer |
Included. GPT-family token counting |
ITokenizer |
Interface for custom tokenizers (Uses SharpToken) |
3. Text Splitting (RAGSharp.Text)
RecursiveTextSplitter - Token-aware recursive splitter that respects semantic boundaries:
var tokenizer = new SharpTokenTokenizer("gpt-4");
var splitter = new RecursiveTextSplitter(
tokenizer,
chunkSize: 512, // tokens
chunkOverlap: 50
);
Hereβs how it works:
- Input Text
β
Split by paragraphs (\n\n)
β
For each paragraph:
ββ Fits in chunk size? β Yield whole paragraph
ββ Too large? β Split by sentences
β
For each sentence:
ββ Buffer + sentence fits? β Add to buffer
ββ Too large?
ββ Yield buffer
ββ Single sentence too large? β Token window split
ββ Sliding window with overlap
- Splits by paragraphs first
- Falls back to sentences if chunks too large
- Final fallback: sliding token window with overlap
- Uses
SharpTokenTokenizerfor token counting
Implement ITextSplitter for custom splitting strategies.
4. Vector Stores (RAGSharp.Stores)
| Store | Use Case |
|---|---|
IVectorStore |
4-method interface for any database (Postgres, Qdrant, Redis, etc.) |
InMemoryVectorStore |
Fast, ephemeral storage for demos, testing, and temporary searches. |
FileVectorStore |
Persistent, file-backed storage (uses JSON), perfect for retaining your indexed knowledge base between runs. |
Why no built-in database support?
- Many use cases don't need a database (small knowledge bases, temporary searches)
- Database choices vary widely (Postgres, Qdrant, Redis, SQLite etc)
- Implement IVectorStore in ~50 lines for your preferred database
- Keeps core library lightweight
5. Retriever (RAGSharp.RAG)
The RagRetriever orchestrates the entire pipeline:
- Accepts new documents.
- Chunks them using
ITextSplitter. - Sends chunks to the
IEmbeddingClient. - Stores the resulting vectors in the
IVectorStore. - Performs semantic search (dot product/cosine similarity) against the store based on a query.
- Optional: Pass an
ILoggerto track ingestion progress, by defualt it usesConsoleLogger
π Examples & Documentation
Clone the repo and run the sample app.
git clone https://github.com/mrrazor22/ragsharp
cd ragsharp/SampleApp
dotnet run
| Example | Description |
|---|---|
Example1_QuickStart |
The basic 4-step process: Load File β Init Retriever β Add Docs β Search. |
Example2_FilesSearch |
Loading multiple documents from a directory with persistent storage. |
Example3_WebDocSearch |
Loading content from the web (UrlLoader and WebSearchLoader). |
Example4_Barebones |
Using the low-level API: manual embedding and cosine similarity. |
Example5_Advanced |
Full pipeline customization, injecting custom splitter, logger, and persistent store. |
π Project Structure
RAGSharp/
βββ Embeddings/ (IEmbeddingClient, ITokenizer, providers)
βββ IO/ (FileLoader, DirectoryLoader, UrlLoader, WebSearchLoader)
βββ RAG/ (RagRetriever, IRagRetriever)
βββ Stores/ (InMemoryVectorStore, FileVectorStore)
βββ Text/ (RecursiveTextSplitter, ITextSplitter)
βββ Logging/ (ConsoleLogger)
βββ Utils/ (HashingHelper, VectorExtensions)
SampleApp/
βββ data/ (sample.txt, documents/)
βββ Examples/ (Example1..Example5)
π License
MIT License.
π‘ Status
Early stage, suitable for prototypes and production apps where simplicity matters.
| Product | Versions Compatible and additional computed target framework versions. |
|---|---|
| .NET | net5.0 was computed. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
| .NET Core | netcoreapp2.0 was computed. netcoreapp2.1 was computed. netcoreapp2.2 was computed. netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
| .NET Standard | netstandard2.0 is compatible. netstandard2.1 was computed. |
| .NET Framework | net461 was computed. net462 was computed. net463 was computed. net47 was computed. net471 was computed. net472 was computed. net48 was computed. net481 was computed. |
| MonoAndroid | monoandroid was computed. |
| MonoMac | monomac was computed. |
| MonoTouch | monotouch was computed. |
| Tizen | tizen40 was computed. tizen60 was computed. |
| Xamarin.iOS | xamarinios was computed. |
| Xamarin.Mac | xamarinmac was computed. |
| Xamarin.TVOS | xamarintvos was computed. |
| Xamarin.WatchOS | xamarinwatchos was computed. |
-
.NETStandard 2.0
- HtmlAgilityPack (>= 1.12.4)
- Microsoft.Extensions.Logging (>= 9.0.9)
- OpenAI (>= 2.5.0)
- SharpToken (>= 2.0.4)
NuGet packages
This package is not used by any NuGet packages.
GitHub repositories
This package is not used by any popular GitHub repositories.
| Version | Downloads | Last Updated |
|---|---|---|
| 1.0.0 | 177 | 10/5/2025 |