Mythosia.AI.Rag 7.2.0

dotnet add package Mythosia.AI.Rag --version 7.2.0

Mythosia.AI.Rag

Package Summary

Mythosia.AI.Rag provides RAG (Retrieval-Augmented Generation) as an optional extension for Mythosia.AI.
Install this package to add .WithRag() to any IAIService — no changes to the AI core required.

Abstractions Compatibility: Implements Mythosia.AI.Rag.Abstractions v5.x

Installation

dotnet add package Mythosia.AI.Rag

Quick Start

using Mythosia.AI.Rag;

var service = new AnthropicService(apiKey, httpClient)
    .WithRag(rag => rag
        .AddDocument("manual.txt")
        .AddDocument("policy.txt")
    );

var response = await service.GetCompletionAsync("What is the refund policy?");

That's it. Documents are automatically loaded, chunked, embedded, and indexed on the first query (lazy initialization).

Document Sources

.WithRag(rag => rag
    // Single file
    .AddDocument("docs/manual.txt")

    // All files in a directory (recursive)
    .AddDocuments("./knowledge-base/")

    // Per-extension routing in a directory
    .AddDocuments("./knowledge-base/", src => src
        .WithExtension(".pdf")
        .WithLoader(new PdfDocumentLoader())
        .WithTextSplitter(new CharacterTextSplitter(800, 80))
    )
    .AddDocuments("./knowledge-base/", src => src
        .WithExtension(".docx")
        .WithLoader(new WordDocumentLoader())
        .WithTextSplitter(new TokenTextSplitter(600, 60))
    )

    // Inline text
    .AddText("Product price is $99.", id: "price-info")

    // URL (fetched via HTTP GET)
    .AddUrl("https://example.com/faq.txt")

    // Custom loader
    .AddDocuments(new MyPdfLoader(), "docs/manual.pdf")
)

Search Settings

.WithRag(rag => rag
    .AddDocument("docs.txt")
    .WithTopK(5)              // Number of results to retrieve (default: 3)
    .WithChunkSize(500)       // Characters per chunk (default: 300)
    .WithChunkOverlap(50)     // Overlap between chunks (default: 30)
    .WithScoreThreshold(0.5)  // Minimum similarity score (default: none)
)
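Chunk size and overlap interact in a simple way: each chunk starts chunk-size-minus-overlap characters after the previous one, so consecutive chunks share their boundary text. A language-agnostic sketch of the idea in Python (the function `split_chars` is illustrative, not part of the library):

```python
def split_chars(text: str, chunk_size: int = 300, overlap: int = 30) -> list[str]:
    """Fixed-size character chunking: each chunk starts (chunk_size - overlap)
    characters after the previous one, so neighbours share `overlap` characters."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

text = "".join(str(i % 10) for i in range(600))
chunks = split_chars(text)
# With 600 characters, chunk_size=300, overlap=30: chunks start at 0, 270, 540.
```

The overlap is what keeps a sentence that straddles a chunk boundary retrievable from at least one chunk.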

Hybrid Search

Combine dense vector similarity with BM25 keyword matching using Reciprocal Rank Fusion (RRF). Documents that rank highly in both keyword and semantic search are boosted to the top.

For stores that support native hybrid storage/search, the recommended model is:

  • store both dense and sparse/keyword-searchable data at write time
  • choose retrieval mode at query time
    • SearchAsync for vector-only retrieval
    • HybridSearchAsync for hybrid retrieval

If a store does not support native hybrid retrieval, the RAG layer falls back to application-level fusion automatically.
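RRF itself is small: each result list contributes 1/(k + rank) to a document's score, so items ranked well in both lists float to the top. A toy Python sketch of application-level fusion (k = 60 is the conventional default from the RRF literature, not a confirmed library value):

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked result lists with Reciprocal Rank Fusion:
    score(doc) = sum over lists of 1 / (k + rank_in_list)."""
    scores: dict[str, float] = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits  = ["d3", "d1", "d2"]   # dense-similarity order
keyword_hits = ["d1", "d4", "d3"]   # BM25 order
fused = rrf_fuse([vector_hits, keyword_hits])
# d1 and d3 appear in both lists, so they outrank the single-list hits.
```

Because only ranks matter, RRF needs no score normalization between the vector and keyword retrievers, which is why it is a common default for fusion.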

.WithRag(rag => rag
    .AddDocument("docs.txt")
    .UseHybridSearch()            // Enable hybrid search (default weight: 0.5)
)

Adjust the balance between vector and keyword search:

.UseHybridSearch(vectorWeight: 0.7f)  // 70% vector, 30% keyword

How It Works

Store Type            Behavior
InMemoryVectorStore   Application-level BM25 index + vector search, merged via RRF
PostgresStore         Native parallel tsvector full-text + pgvector similarity, merged via RRF
QdrantStore           Native sparse-dense prefetch + Qdrant's built-in RRF fusion
PineconeStore         Native dense + sparse server-side fusion on dotproduct indexes

The strategy is selected automatically based on the store — no configuration needed.

To revert to pure vector search:

.UseVectorSearch()  // Explicit pure vector mode (same as default)

Re-ranking

Re-rank search results after retrieval for improved relevance. Works with both pure vector and hybrid search.

When a reranker is configured, the pipeline automatically fetches a wider candidate pool (TopK × TopKMultiplier) and then the reranker selects the best TopK results. This ensures the reranker has enough diversity to work with.

// Default: retrieves TopK × 3 candidates, reranks down to TopK
.WithRag(rag => rag
    .AddDocument("docs.txt")
    .WithReranker(new CohereReranker(cohereApiKey))
)

// Custom multiplier via RagStore.UpdateOptions
store.UpdateOptions(opt => opt.DefaultQuery.RetrievalDerivation.TopKMultiplier = 5);
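The candidate-pool widening is a two-stage pipeline: an over-fetch followed by a rerank-and-truncate. A hedged Python sketch (`retrieve` and `rerank_score` stand in for the store search and the reranker; they are not library APIs):

```python
def rerank_pipeline(query, retrieve, rerank_score, top_k=3, multiplier=3):
    """Stage 1: over-fetch top_k * multiplier candidates from the store.
    Stage 2: let the reranker order them and keep the best top_k."""
    candidates = retrieve(query, top_k * multiplier)
    candidates.sort(key=lambda doc: rerank_score(query, doc), reverse=True)
    return candidates[:top_k]

# Toy stand-ins: the "store" returns doc ids 0..n-1, the "reranker" prefers high ids.
hits = rerank_pipeline("q", lambda q, n: list(range(n)), lambda q, d: d)
# -> [8, 7, 6]: nine candidates fetched, reranked, top three kept.
```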

Cohere Reranker

using Mythosia.AI.Rag.Reranking;

.WithRag(rag => rag
    .AddDocument("docs.txt")
    .WithReranker(new CohereReranker(cohereApiKey))
)

LLM-based Reranker

Use any existing AIService to score and reorder results:

using Mythosia.AI.Rag.Reranking;

var scorer = new OpenAIService(apiKey, httpClient, AIModel.OpenAI_Gpt4oMini);

.WithRag(rag => rag
    .AddDocument("docs.txt")
    .WithReranker(new LlmReranker(scorer))
)

vLLM Reranker

Use a vLLM-served reranker model (e.g., Qwen3-Reranker):

using Mythosia.AI.Rag.Reranking;

.WithRag(rag => rag
    .AddDocument("docs.txt")
    .WithReranker(new VllmReranker(
        model: "Qwen/Qwen3-Reranker-0.6B",
        baseUrl: "http://localhost:8003"))
)

Final Selection Policy

By default, the pipeline trusts the reranker's scores for final result selection (RerankerOnly). Use WithFinalSelectionPolicy to blend retrieval and reranker scores instead:

.WithRag(rag => rag
    .AddDocument("docs.txt")
    .WithReranker(new CohereReranker(cohereApiKey))
    .WithFinalSelectionPolicy(RagFinalSelectionMode.WeightedBlend, retrievalWeight: 0.65)
)
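Assuming WeightedBlend is a plain convex combination of the two scores (the exact formula is not documented here, so treat this as an assumption), the selection score would look like:

```python
def blended_score(retrieval_score: float, rerank_score: float,
                  retrieval_weight: float = 0.65) -> float:
    # Assumed convex combination; the library's exact blend formula may differ.
    return retrieval_weight * retrieval_score + (1 - retrieval_weight) * rerank_score

# With retrieval_weight = 0.65, retrieval dominates and the reranker nudges the order.
```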

Combined: Hybrid Search + Re-ranking

.WithRag(rag => rag
    .AddDocument("docs.txt")
    .UseHybridSearch(vectorWeight: 0.6f)
    .WithReranker(new CohereReranker(cohereApiKey))
)

Embedding Providers

// Local feature-hashing (default, no API key required)
.UseLocalEmbedding(dimensions: 1024)

// OpenAI embedding API
.UseOpenAIEmbedding(apiKey, model: "text-embedding-3-small", dimensions: 1536)

// vLLM-served embedding model
.UseEmbedding(new VllmEmbeddingProvider(
    httpClient,
    model: "Qwen/Qwen3-Embedding-0.6B",
    dimensions: 1024,
    baseUrl: "http://localhost:8002"))

// Custom provider
.UseEmbedding(new MyCustomEmbeddingProvider())
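Feature hashing is why the local provider needs no API key: each token is hashed into one of `dimensions` buckets and the resulting count vector is normalized. A toy Python sketch of the idea (not the library's actual implementation):

```python
import hashlib
import math

def hash_embed(text: str, dimensions: int = 1024) -> list[float]:
    """Toy feature-hashing embedder: hash each token into a bucket,
    count hits, then L2-normalize. Deterministic and fully offline."""
    vec = [0.0] * dimensions
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % dimensions
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]
```

Because the hash is deterministic, the same text always maps to the same vector with no network call; the trade-off is that hashed embeddings capture word overlap, not semantics.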

Vector Stores

// In-memory (default, data lost on process exit)
.UseInMemoryStore()

// Custom store (e.g., Qdrant, Chroma, Pinecone)
.UseStore(new MyQdrantVectorStore())
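An in-memory vector store reduces to a dictionary plus brute-force cosine similarity; this toy sketch shows the shape a custom store has to fill in (method names are illustrative, not the IVectorStore signatures):

```python
import math

def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class ToyInMemoryStore:
    def __init__(self):
        self.records = {}                      # id -> (vector, metadata)

    def upsert(self, rec_id, vector, metadata=None):
        self.records[rec_id] = (vector, metadata or {})

    def search(self, query, top_k=3):
        """Brute-force scan: score every record, return the best top_k."""
        scored = [(cosine(query, vec), rid)
                  for rid, (vec, _) in self.records.items()]
        return sorted(scored, reverse=True)[:top_k]
```

A real store replaces the brute-force scan with an ANN index; the upsert/search contract stays the same.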

Prompt Templates

.WithPromptTemplate(@"
[Reference Documents]
{context}

[Question]
{question}

Answer based only on the provided documents.
")

Use {context} and {question} placeholders. If no template is specified, a default numbered-reference format is used.
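The rendering step is plain placeholder substitution: retrieved chunks are joined into {context} (the default numbers them) and the user query fills {question}. A Python sketch of the assumed behavior:

```python
TEMPLATE = """[Reference Documents]
{context}

[Question]
{question}

Answer based only on the provided documents."""

def render_prompt(template: str, chunks: list[str], question: str) -> str:
    # Numbered-reference context, mirroring the described default format.
    context = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, 1))
    return template.format(context=context, question=question)
```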

Multi-Turn Conversations (Query Rewriting)

By default, follow-up questions like "Tell me more about that" fail in RAG because the search query lacks context from previous turns. WithQueryRewriter() solves this by automatically rewriting follow-up queries into retrieval-ready form before vector search, and can also derive keyword terms for hybrid/text retrieval.

var service = new OpenAIService(apiKey, httpClient)
    .WithRag(rag => rag
        .AddDocument("manual.txt")
        .WithQueryRewriter()   // Enables automatic query rewriting and retrieval keyword derivation
    );

// Turn 1: "Do you know about OPM?" → RAG finds OPM documents ✓
var r1 = await service.GetCompletionAsync("Do you know about OPM?");

// Turn 2: "Tell me more about that" → rewritten to "Tell me more about OPM" → RAG finds OPM documents ✓
var r2 = await service.GetCompletionAsync("Tell me more about that");

Use a cheaper/smaller LLM for rewriting and retrieval keyword derivation to reduce cost:

var rewriterService = new OpenAIService(apiKey, httpClient, AIModel.OpenAI_Gpt4oMini);

var service = new OpenAIService(apiKey, httpClient, AIModel.OpenAI_Gpt4o)
    .WithRag(rag => rag
        .AddDocument("manual.txt")
        .WithQueryRewriter(new LlmQueryRewriter(rewriterService))
    );

You can also provide a fully custom IQueryRewriter implementation:

.WithRag(rag => rag
    .AddDocument("manual.txt")
    .WithQueryRewriter(new MyCustomRewriter())
)

Inspect the rewritten query via RagProcessedQuery.RewrittenQuery:

var result = await service.RetrieveAsync("Tell me more about that");
Console.WriteLine(result.RewrittenQuery);  // "Tell me more about OPM"

Streaming

var ragService = new OpenAIService(apiKey, httpClient)
    .WithRag(rag => rag.AddDocument("manual.txt"));

await foreach (var chunk in ragService.StreamAsync("How do I use this product?"))
{
    Console.Write(chunk);
}

Document Indexing Callback

BuildAsync accepts an optional onDocumentEmbedded callback invoked after each document's embedding is complete. When omitted, the pipeline automatically calls ReplaceByFilterAsync(Where("document_id", docId), records) — which deletes all existing chunks for that document and inserts the new ones atomically. When provided, the callback replaces this default behavior entirely — you decide how to persist the records.

On PostgresStore, ReplaceByFilterAsync wraps DELETE + INSERT in a single transaction — queries always see either the old data or the new data, never an empty gap. Other stores (InMemory, Qdrant, Pinecone) perform sequential delete + insert via the default interface method.
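The replace-by-filter semantics amount to "delete everything matching the filter, then insert the new chunk set", which is what prevents stale chunks from surviving a re-index. A toy Python sketch (field names are illustrative):

```python
class ToyStore:
    def __init__(self):
        self.chunks = []    # each chunk: {"document_id": ..., "content": ...}

    def replace_by_filter(self, key, value, new_records):
        """Delete every chunk where chunk[key] == value, then insert new_records.
        Re-indexing a shrunken document can never leave stale chunks behind."""
        self.chunks = [c for c in self.chunks if c.get(key) != value]
        self.chunks.extend(new_records)
```

A plain upsert, by contrast, only overwrites chunks whose ids collide; if a document shrinks from five chunks to three, two stale ones survive.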

Custom Processing

Use the callback for logging, validation, or routing to different stores:

var store = await RagStore.BuildAsync(config => config
    .AddDocuments("./docs/")
    .UseOpenAIEmbedding(apiKey)
    .UseStore(vectorStore),
    onDocumentEmbedded: async records =>
    {
        Console.WriteLine($"Indexed {records.Count} chunks");
        await vectorStore.UpsertBatchAsync(records);
    }
);

Agentic RAG

In standard RAG the pipeline runs once per user message. In Agentic RAG the agent decides when to search, what to search for, and whether to search again if the first result is insufficient — all autonomously inside a ReAct loop.

Register the RagStore as a search tool with WithAgenticRag, then run RunAgentAsync:

// Build the index once
var ragStore = await RagStore.BuildAsync(cfg => cfg
    .AddDocument("manual.pdf")
    .AddDocument("policy.docx")
    .UseOpenAIEmbedding(apiKey));

// Register RAG as a tool and run the agent
var service = new AnthropicService(apiKey, http);
service.WithAgenticRag(ragStore);

var answer = await service.RunAgentAsync("Summarise the refund policy.");
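Stripped of the LLM, the agent loop is: ask the model for an action, run the search tool if requested, feed the observation back, and stop when the model answers. A toy ReAct-style sketch in Python (`llm_step` and `search` are stand-ins, not library APIs):

```python
def run_agent(question, llm_step, search, max_steps=5):
    """Minimal ReAct-style loop: the model returns ("search", query)
    or ("answer", text); observations accumulate between steps."""
    observations = []
    for _ in range(max_steps):
        action, arg = llm_step(question, observations)
        if action == "search":
            observations.append(search(arg))
        else:
            return arg
    return "Gave up after max_steps."

# Deterministic stand-in policy: search once, then answer from the observation.
def fake_llm(question, observations):
    if not observations:
        return ("search", "refund policy")
    return ("answer", f"Based on: {observations[0]}")

answer = run_agent("Am I eligible for a refund?", fake_llm,
                   lambda q: "Refunds are accepted within 30 days.")
# -> "Based on: Refunds are accepted within 30 days."
```

The max-steps bound is the usual guard against a model that keeps searching without converging.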

Combining with Other Tools

service.WithAgenticRag(ragStore)
       .WithFunctionAsync("get_order_status", "Look up an order status by order ID.",
           ("order_id", "The order ID to look up.", required: true),
           async id => await orderApi.GetStatusAsync(id));

// The agent searches documents for policy AND calls the API for live order data
var answer = await service.RunAgentAsync(
    "Order #12345 — am I eligible for a refund based on the current policy?");

Custom Tool Description

The tool description controls when the agent decides to call RAG. Tailor it to your domain:

service.WithAgenticRag(ragStore,
    toolDescription:
        "Search internal HR policies, product manuals, and compliance documents. " +
        "Call this tool whenever company-specific policy or product information is needed.");

How It Differs from Standard RAG

                     Standard RAG       Agentic RAG
Search timing        Every message      Agent decides
Query formulation    QueryRewriter      Agent itself
Number of searches   Once per turn      One or more as needed
Tool combination     Not applicable     Any registered tool
Setup                .WithRag()         .WithAgenticRag() + RunAgentAsync

QueryRewriter is intentionally bypassed in Agentic RAG. The agent formulates its own self-contained search query, so a separate rewriting step is redundant and could distort the agent's intent.

Shared RagStore (Multiple Services)

Build the index once, share across multiple AI services:

var ragStore = await RagStore.BuildAsync(config => config
    .AddDocuments("./knowledge-base/")
    .UseOpenAIEmbedding(embeddingApiKey)
    .WithTopK(5)
);

var claude = new AnthropicService(claudeKey, http).WithRag(ragStore);
var gpt = new OpenAIService(gptKey, http).WithRag(ragStore);

// Both use the same pre-built index
var resp1 = await claude.GetCompletionAsync("What is the refund policy?");
var resp2 = await gpt.GetCompletionAsync("How long does shipping take?");

Runtime Options Update

Update pipeline options at runtime without rebuilding the index:

ragStore.UpdateOptions(opt =>
{
    opt.DefaultQuery.FinalFilter.TopK = 8;
    opt.DefaultQuery.FinalFilter.MinScore = 0.4;
    opt.DefaultQuery.RetrievalDerivation.TopKMultiplier = 3;
    opt.PromptTemplate = @"
[Reference Documents]
{context}

[Question]
{question}

Answer based only on the provided documents.
";
});

Disable RAG Per-Request

var ragService = service.WithRag(rag => rag.AddDocument("doc.txt"));

// Use RAG
var withRag = await ragService.GetCompletionAsync("question with context");

// Temporarily bypass RAG
var withoutRag = await ragService.WithoutRag().GetCompletionAsync("general question");

Retrieve Without LLM Call

Inspect the request message content and references before sending to the LLM:

var result = await ragService.RetrieveAsync("What is the refund policy?");

if (result.HasReferences)
{
    Console.WriteLine(result.RequestMessageContent);  // Context + query
    Console.WriteLine(result.References.Count);        // Number of matched chunks
    Console.WriteLine($"FinalTopK={result.Diagnostics.FinalTopK}, RetrievalTopK={result.Diagnostics.RetrievalTopK}, FinalMinScore={result.Diagnostics.AppliedFinalMinScore}, Namespace={result.Diagnostics.AppliedNamespace}, Elapsed={result.Diagnostics.ElapsedMs}ms");
    foreach (var r in result.References)
    {
        Console.WriteLine($"Score: {r.Score:F4} | {r.Record.Content}");
    }
}
else
{
    // No references found — RequestMessageContent contains the original query unchanged
    Console.WriteLine(result.RequestMessageContent);
}

Per-Request Query Overrides

Keep global defaults in RagBuilder, then override per request when needed:

var ragStore = await RagStore.BuildAsync(config => config
    .AddDocuments("./knowledge-base/")
    .WithTopK(3)
    .WithScoreThreshold(0.5)
);

var normal = await ragStore.QueryAsync("refund policy?");

var highRecall = await ragStore.QueryAsync(
    "refund policy?",
    new RagQueryOptions
    {
        FinalFilter = new RagFilter { TopK = 15, MinScore = 0.2 }
    }
);

Store-Level Metadata Filtering (StoreFilter)

Use RagQueryOptions.StoreFilter to pass a VectorFilter directly to IVectorStore.SearchAsync / HybridSearchAsync on every retrieval call. This allows scoping retrieval by tenant, user, category, time range, or any metadata key without wrapping the store in a custom decorator.

// Single condition
var options = new RagQueryOptions();
options.StoreFilter = new VectorFilter().Where("storage_id", storageId);
var result = await ragStore.QueryAsync("question", options, cancellationToken);

// Multiple conditions — storage_id AND folder_path (AND logic, fluent chaining)
options.StoreFilter = new VectorFilter()
    .Where("storage_id", storageId)
    .Where("folder_path", "/docs/private");

// Multi-value filter — only documents from specific tenants
options.StoreFilter = new VectorFilter()
    .WhereIn("storage_id", tenantId1, tenantId2, tenantId3);

// Namespace + metadata simultaneously
options.Namespace = "tenant-A";
options.StoreFilter = new VectorFilter().Where("user_id", currentUserId);
// → retrieval uses: namespace = "tenant-A" AND metadata->>'user_id' = '...'

For this to work, the metadata keys must be stored on VectorRecord.Metadata at index time:

var doc = new RagDocument
{
    Content = "Document content...",
    Metadata = new Dictionary<string, string>
    {
        ["storage_id"] = storageId,
        ["user_id"] = userId
    }
};
await ragPipeline.IndexDocumentAsync(doc, cancellationToken: ct);

StoreFilter = null (the default) preserves the existing behavior with no filtering.
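The Where/WhereIn combination is plain AND logic over the metadata map: a record passes only if every condition holds. A Python sketch of the assumed matching semantics:

```python
def matches(metadata: dict, where: dict = None, where_in: dict = None) -> bool:
    """AND semantics: every Where equality and every WhereIn membership
    must hold for the record to pass the filter."""
    for key, value in (where or {}).items():
        if metadata.get(key) != value:
            return False
    for key, allowed in (where_in or {}).items():
        if metadata.get(key) not in allowed:
            return False
    return True
```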

Progress Reporting

Track pipeline stage progress with an async callback:

var result = await ragStore.QueryAsync("refund policy?",
    new RagQueryOptions
    {
        ProgressAsync = stage =>
        {
            Console.WriteLine($"Stage: {stage}");
            return Task.CompletedTask;
        }
    });

Architecture

Mythosia.AI.Abstractions              <- IAIService interface
    |
Mythosia.AI.Rag.Abstractions         <- interfaces (IRagPipeline, ITextSplitter, etc.), RagDocument
    |
Mythosia.AI.Rag                      <- fluent API, pipeline, builders, extensions
Mythosia.VectorDb.InMemory (optional) <- InMemoryVectorStore
Mythosia.Documents.Abstractions      <- IDocumentLoader, DoclingDocument

The AI core has zero knowledge of RAG. Everything is wired through the IRagPipeline interface and C# extension methods.

Custom Implementations

Custom Embedding Provider

public class MyEmbeddingProvider : IEmbeddingProvider
{
    public int Dimensions => 768;

    public Task<float[]> GetEmbeddingAsync(string text, CancellationToken ct = default)
    {
        // Your embedding logic
    }

    public Task<IReadOnlyList<float[]>> GetEmbeddingsAsync(IEnumerable<string> texts, CancellationToken ct = default)
    {
        // Batch embedding logic
    }
}

Custom Vector Store

public class MyVectorStore : IVectorStore
{
    // Implement: CreateCollectionAsync, UpsertAsync, SearchAsync, DeleteAsync, etc.
}

Custom Document Loader

public class MyPdfLoader : IDocumentLoader
{
    public Task<IReadOnlyList<DoclingDocument>> LoadAsync(string source, CancellationToken ct = default)
    {
        // Parse PDF and return documents
    }
}
Compatibility

This package targets .NET Standard 2.1. All newer frameworks (.NET Core 3.0/3.1, .NET 5 through .NET 10, and their platform-specific targets such as Android, iOS, macOS, and Windows) are computed as compatible by NuGet.


Version  Downloads  Last Updated  Notes
7.2.0    29         4/3/2026
7.1.0    35         4/2/2026
7.0.1    61         4/1/2026
7.0.0    90         3/30/2026
6.2.0    86         3/29/2026
6.1.0    85         3/28/2026
6.0.1    83         3/24/2026
6.0.0    87         3/22/2026
5.0.1    129        3/15/2026
5.0.0    98         3/15/2026     Deprecated (critical bugs)
4.0.0    102        3/11/2026     Deprecated (critical bugs)
3.1.0    100        3/7/2026      Deprecated (critical bugs)
3.0.0    85         3/6/2026
2.0.0    85         3/5/2026
1.2.0    95         3/2/2026
1.1.0    92         2/28/2026
1.0.0    97         2/25/2026

v7.2.0: Internal namespace/scope handling migrated to Metadata (follows VectorDb.Abstractions v4.0.0). Default indexing now uses ReplaceByFilterAsync instead of UpsertBatchAsync, fixing the stale-chunk problem on re-indexing. The user-facing API is unchanged.