AiGeekSquad.AIContext 1.0.21

There is a newer version of this package available.
See the version list below for details.

dotnet add package AiGeekSquad.AIContext --version 1.0.21

NuGet\Install-Package AiGeekSquad.AIContext -Version 1.0.21

This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.

<PackageReference Include="AiGeekSquad.AIContext" Version="1.0.21" />

For projects that support PackageReference, copy this XML node into the project file to reference the package.

<PackageVersion Include="AiGeekSquad.AIContext" Version="1.0.21" />

Directory.Packages.props

<PackageReference Include="AiGeekSquad.AIContext" />

Project file

For projects that support Central Package Management (CPM), copy this XML node into the solution Directory.Packages.props file to version the package.

paket add AiGeekSquad.AIContext --version 1.0.21

The NuGet Team does not provide support for this client. Please contact its maintainers for support.

#r "nuget: AiGeekSquad.AIContext, 1.0.21"

#r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.

#:package AiGeekSquad.AIContext@1.0.21

#:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package.

#addin nuget:?package=AiGeekSquad.AIContext&version=1.0.21

Install as a Cake Addin

#tool nuget:?package=AiGeekSquad.AIContext&version=1.0.21

Install as a Cake Tool

The NuGet Team does not provide support for this client. Please contact its maintainers for support.

AiGeekSquad.AIContext

A comprehensive C# library for AI-powered context management, providing intelligent text processing capabilities for modern AI applications. This library combines semantic text chunking and Maximum Marginal Relevance (MMR) algorithms to help you build better RAG systems, search engines, and content recommendation platforms.

✨ Key Features

🧠 Semantic Text Chunking: Intelligent text splitting based on semantic similarity analysis
🎯 Maximum Marginal Relevance (MMR): High-performance algorithm for relevance-diversity balance
🛠️ Extensible Architecture: Dependency injection ready with clean interfaces
📊 High Performance: Benchmarked and optimized; the package targets .NET Standard 2.1

🚀 Quick Start

Installation

dotnet add package AiGeekSquad.AIContext

Basic Usage

Semantic Text Chunking

using System;
using System.Collections.Generic;
using AiGeekSquad.AIContext.Chunking;

// Create a chunker with your embedding provider
var tokenCounter = new MLTokenCounter();
var embeddingGenerator = new YourEmbeddingProvider(); // Implement IEmbeddingGenerator

var chunker = SemanticTextChunker.Create(tokenCounter, embeddingGenerator);

// Configure chunking for your use case
var options = new SemanticChunkingOptions
{
    MaxTokensPerChunk = 512,
    MinTokensPerChunk = 10,
    BreakpointPercentileThreshold = 0.75 // Higher = fewer, more conservative breaks
};

// Process a document with metadata
var text = @"
Artificial intelligence is transforming how we work and live. Machine learning
algorithms can process vast amounts of data to find patterns humans might miss.

In the business world, companies adopt AI for customer service, fraud detection,
and process automation. Chatbots handle routine inquiries while algorithms
detect suspicious transactions in real-time.";

var metadata = new Dictionary<string, object>
{
    ["Source"] = "AI Technology Overview",
    ["DocumentId"] = "doc-123"
};

await foreach (var chunk in chunker.ChunkAsync(text, metadata, options))
{
    Console.WriteLine($"Chunk {chunk.StartIndex}-{chunk.EndIndex}:");
    Console.WriteLine($"  Text: {chunk.Text.Trim()}");
    Console.WriteLine($"  Tokens: {chunk.Metadata["TokenCount"]}");
    Console.WriteLine($"  Segments: {chunk.Metadata["SegmentCount"]}");
    Console.WriteLine();
}

Maximum Marginal Relevance for Diverse Results

using System;
using System.Collections.Generic;
using MathNet.Numerics.LinearAlgebra;
using AiGeekSquad.AIContext.Ranking;

// Simulate document embeddings (from your vector database)
var documents = new List<Vector<double>>
{
    Vector<double>.Build.DenseOfArray(new double[] { 0.9, 0.1, 0.0 }), // ML intro
    Vector<double>.Build.DenseOfArray(new double[] { 0.85, 0.15, 0.0 }), // Advanced ML (similar!)
    Vector<double>.Build.DenseOfArray(new double[] { 0.1, 0.8, 0.1 }), // Sports content
    Vector<double>.Build.DenseOfArray(new double[] { 0.0, 0.1, 0.9 }) // Cooking content
};

var documentTitles = new[]
{
    "Introduction to Machine Learning",
    "Advanced Machine Learning Techniques", // Very similar to first
    "Basketball Training Guide",
    "Italian Cooking Recipes"
};

// User query: interested in machine learning
var query = Vector<double>.Build.DenseOfArray(new double[] { 0.9, 0.1, 0.0 });

// Compare pure relevance vs MMR
Console.WriteLine("Pure Relevance (λ = 1.0):");
var pureRelevance = MaximumMarginalRelevance.ComputeMMR(
    vectors: documents, query: query, lambda: 1.0, topK: 3);

foreach (var (index, score) in pureRelevance)
    Console.WriteLine($"  {documentTitles[index]} (score: {score:F3})");

Console.WriteLine("\nMMR Balanced (λ = 0.7):");
var mmrResults = MaximumMarginalRelevance.ComputeMMR(
    vectors: documents, query: query, lambda: 0.7, topK: 3);

foreach (var (index, score) in mmrResults)
    Console.WriteLine($"  {documentTitles[index]} (score: {score:F3})");

// With λ < 1, MMR penalizes candidates that resemble already-selected results,
// such as the second, near-duplicate ML document.
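To make the relevance-diversity trade-off concrete, here is a minimal sketch of the textbook greedy MMR rule, where each step picks the candidate maximizing λ · sim(candidate, query) − (1 − λ) · max sim(candidate, selected). It is not the package's implementation, whose internals are not shown here, so the library's scores and selections may differ. The names `Cosine` and `GreedyMmr` are hypothetical, and plain `double[]` arrays replace `Vector<double>` so the snippet runs without any package.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cosine similarity between two equal-length vectors.
static double Cosine(double[] a, double[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Greedy MMR (hypothetical helper): repeatedly pick the candidate with the best
// lambda * relevance - (1 - lambda) * redundancy score.
static List<int> GreedyMmr(IReadOnlyList<double[]> vectors, double[] query, double lambda, int topK)
{
    var selected = new List<int>();
    var remaining = Enumerable.Range(0, vectors.Count).ToList();

    while (selected.Count < topK && remaining.Count > 0)
    {
        int best = remaining[0];
        double bestScore = double.NegativeInfinity;
        foreach (int i in remaining)
        {
            double relevance = Cosine(vectors[i], query);
            // Redundancy is the highest similarity to anything already selected.
            double redundancy = selected.Count == 0
                ? 0.0
                : selected.Max(j => Cosine(vectors[i], vectors[j]));
            double score = lambda * relevance - (1 - lambda) * redundancy;
            if (score > bestScore) { bestScore = score; best = i; }
        }
        selected.Add(best);
        remaining.Remove(best); // removes by value, not by position
    }
    return selected;
}

// Same four embeddings and query as in the example above.
var docs = new List<double[]>
{
    new[] { 0.9, 0.1, 0.0 },   // ML intro
    new[] { 0.85, 0.15, 0.0 }, // Advanced ML
    new[] { 0.1, 0.8, 0.1 },   // Sports
    new[] { 0.0, 0.1, 0.9 }    // Cooking
};
var q = new[] { 0.9, 0.1, 0.0 };

var relevanceOnly = GreedyMmr(docs, q, lambda: 1.0, topK: 2);
var diverse = GreedyMmr(docs, q, lambda: 0.3, topK: 2);

Console.WriteLine(string.Join(", ", relevanceOnly)); // 0, 1
Console.WriteLine(string.Join(", ", diverse));       // 0, 3
```

In this textbook form, because the query coincides with the first document, every remaining candidate scores (2λ − 1) times its similarity to that document. The near-duplicate is therefore displaced only once λ drops below 0.5. With real queries that differ from every document, the crossover point depends on the data.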

🎯 Real-World Examples

Complete RAG System Pipeline

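using System.Collections.Generic;
using System.Linq;
using AiGeekSquad.AIContext.Chunking;
using AiGeekSquad.AIContext.Ranking;

// Assumes chunker, embeddingGenerator, and metadata from the chunking example above;
// vectorDb and llm stand in for your own vector store and LLM client.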

// 1. INDEXING: Chunk documents for vector storage
var documents = new[] { "AI research paper content...", "ML tutorial content..." };
var allChunks = new List<TextChunk>();

foreach (var doc in documents)
{
    await foreach (var chunk in chunker.ChunkAsync(doc, metadata))
    {
        allChunks.Add(chunk);
        // Store chunk.Text and embedding in your vector database
    }
}

// 2. RETRIEVAL: User asks a question
var userQuestion = "What are the applications of machine learning?";
var queryEmbedding = await embeddingGenerator.GenerateEmbeddingAsync(userQuestion);

// Get candidate chunks from vector database (similarity search)
var candidates = await vectorDb.SearchSimilarAsync(queryEmbedding, topK: 20);

// 3. CONTEXT SELECTION: Use MMR for diverse, relevant context
var selectedContext = MaximumMarginalRelevance.ComputeMMR(
    vectors: candidates.Select(c => c.Embedding).ToList(),
    query: queryEmbedding,
    lambda: 0.8,  // Prioritize relevance but ensure diversity
    topK: 5       // Limit context for LLM token limits
);

// 4. GENERATION: Send to LLM with selected context
var contextText = string.Join("\n\n",
    selectedContext.Select(s => candidates[s.Index].Text));

var prompt = $"Context:\n{contextText}\n\nQuestion: {userQuestion}\nAnswer:";
var response = await llm.GenerateAsync(prompt);

Smart Document Processing

// Custom splitter for legal documents: breaks at whitespace that follows
// a section number such as "12." and precedes a capital letter
var legalSplitter = SentenceTextSplitter.WithPattern(
    @"(?<=\d+\.)\s+(?=[A-Z])");

var chunker = SemanticTextChunker.Create(tokenCounter, embeddingGenerator, legalSplitter);

// Process with domain-specific options
var options = new SemanticChunkingOptions
{
    MaxTokensPerChunk = 1024,  // Larger chunks for legal context
    BreakpointPercentileThreshold = 0.8  // More conservative splitting
};

await foreach (var chunk in chunker.ChunkAsync(legalDocument, metadata, options))
{
    // Each chunk maintains legal context integrity
    await indexService.AddChunkAsync(chunk);
}

Content Recommendation with Diversity

// User has read these articles (represented as embeddings)
var userHistory = new List<Vector<double>> { /* user's read articles */ };

// Available articles to recommend
var availableArticles = new List<(string title, Vector<double> embedding)>
{
    ("Machine Learning Basics", mlBasicsEmbedding),
    ("Advanced ML Techniques", advancedMlEmbedding),  // Similar to above
    ("Data Science Career Guide", dataScienceEmbedding),
    ("Python Programming Tips", pythonEmbedding)
};

// User's interests (derived from their history)
var userInterestVector = ComputeUserInterestVector(userHistory);

// Get diverse recommendations (avoid recommending similar content)
var recommendations = MaximumMarginalRelevance.ComputeMMR(
    vectors: availableArticles.Select(a => a.embedding).ToList(),
    query: userInterestVector,
    lambda: 0.6,  // Balance relevance with diversity
    topK: 3
);

foreach (var (index, score) in recommendations)
{
    Console.WriteLine($"Recommended: {availableArticles[index].title}");
}
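The example above calls `ComputeUserInterestVector` without defining it. One plausible definition, an assumption rather than something the package is known to provide, is the element-wise mean of the embeddings in the user's history. The sketch below uses plain `double[]` arrays instead of `Vector<double>` so it runs without any package.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: element-wise mean of the embeddings a user has read.
static double[] ComputeUserInterestVector(IReadOnlyList<double[]> history)
{
    if (history.Count == 0)
        throw new ArgumentException("History must contain at least one embedding.", nameof(history));

    int dimensions = history[0].Length;
    var mean = new double[dimensions];
    foreach (var embedding in history)
    {
        if (embedding.Length != dimensions)
            throw new ArgumentException("All embeddings must have the same length.", nameof(history));
        for (int i = 0; i < dimensions; i++)
            mean[i] += embedding[i] / history.Count;
    }
    return mean;
}

// Two made-up article embeddings pointing in different directions.
var sampleHistory = new List<double[]>
{
    new[] { 1.0, 0.0, 0.0 },
    new[] { 0.0, 1.0, 0.0 }
};

double[] interest = ComputeUserInterestVector(sampleHistory);
Console.WriteLine(string.Join(" ", interest)); // the mean of the two embeddings
```

A plain mean weights every article equally. Weighting recent reads more heavily is a common variation, but that choice depends on the application.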

⚙️ Configuration

Chunking Options

Option	Default	Description
`MaxTokensPerChunk`	512	Maximum tokens per chunk
`MinTokensPerChunk`	10	Minimum tokens per chunk
`BreakpointPercentileThreshold`	0.75	Semantic breakpoint sensitivity
`BufferSize`	1	Context window for embedding generation
`EnableEmbeddingCaching`	true	Cache embeddings for performance
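
The table does not say how `BreakpointPercentileThreshold` is applied. A common approach in semantic chunking, assumed here and not confirmed as this library's behavior, is to measure the distance between adjacent segment embeddings and break wherever the distance exceeds the given percentile of all distances. The helpers `Percentile` and `FindBreakpoints` and the sample distances are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Value at the given percentile (0..1), using linear interpolation between ranks.
static double Percentile(IReadOnlyList<double> values, double percentile)
{
    var sorted = values.OrderBy(v => v).ToArray();
    double rank = percentile * (sorted.Length - 1);
    int lower = (int)Math.Floor(rank);
    int upper = (int)Math.Ceiling(rank);
    return sorted[lower] + (rank - lower) * (sorted[upper] - sorted[lower]);
}

// Indices i where a break falls between segment i and segment i + 1.
static List<int> FindBreakpoints(IReadOnlyList<double> adjacentDistances, double percentileThreshold)
{
    if (adjacentDistances.Count == 0)
        return new List<int>();

    double cutoff = Percentile(adjacentDistances, percentileThreshold);
    return Enumerable.Range(0, adjacentDistances.Count)
        .Where(i => adjacentDistances[i] > cutoff)
        .ToList();
}

// Made-up distances between five consecutive sentence embeddings.
var distances = new[] { 0.10, 0.15, 0.80, 0.12 };

var breaksAt75 = FindBreakpoints(distances, 0.75);
var breaksAt50 = FindBreakpoints(distances, 0.50);

Console.WriteLine(string.Join(", ", breaksAt75)); // 2
Console.WriteLine(string.Join(", ", breaksAt50)); // 1, 2
```

Under this assumed scheme, raising the threshold leaves fewer breakpoints, which matches the "more conservative splitting" comment on the 0.8 setting in the legal-document example.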

Custom Text Splitters

The SentenceTextSplitter class provides intelligent sentence boundary detection:

Default Pattern: Optimized for English text with built-in handling of common titles and abbreviations
Handled Abbreviations: Mr., Mrs., Ms., Dr., Prof., Sr., Jr.
Custom Patterns: Create domain-specific splitters for specialized content

// Default splitter - handles English titles automatically
var defaultSplitter = SentenceTextSplitter.Default;

// Custom pattern for numbered sections (e.g., legal documents)
var customSplitter = SentenceTextSplitter.WithPattern(@"(?<=\.)\s+(?=\d+\.)");

// Use with semantic chunker
var chunker = SemanticTextChunker.Create(tokenCounter, embeddingGenerator, customSplitter);

Note: The default pattern prevents incorrect sentence breaks after common English titles like "Dr. Smith" or "Mrs. Johnson", ensuring better semantic coherence in your chunks.
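
To see what the custom pattern matches, the sketch below applies it with `Regex.Split` from `System.Text.RegularExpressions`. It only illustrates the regular expression itself. How `SentenceTextSplitter` uses the pattern internally is not shown here, and the sample text is made up.

```csharp
using System;
using System.Text.RegularExpressions;

// Pattern from the example above: split on whitespace that follows a period
// and precedes a section number such as "2.".
var pattern = @"(?<=\.)\s+(?=\d+\.)";

// Made-up sample text for illustration.
var text = "1. Definitions apply throughout. 2. Payment is due in 30 days. 3. Either party may terminate.";

string[] sections = Regex.Split(text, pattern);
foreach (var section in sections)
    Console.WriteLine(section);
// 1. Definitions apply throughout.
// 2. Payment is due in 30 days.
// 3. Either party may terminate.
```

The whitespace after "1." is not a split point because a letter follows it, so each section number stays attached to its text.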

🏗️ Core Interfaces

Implement these interfaces to integrate with your AI infrastructure:

// Implement for your embedding provider
public interface IEmbeddingGenerator
{
    IAsyncEnumerable<Vector<double>> GenerateBatchEmbeddingsAsync(
        IEnumerable<string> texts, 
        CancellationToken cancellationToken = default);
}

// Implement for custom text splitting
public interface ITextSplitter
{
    IAsyncEnumerable<TextSegment> SplitAsync(
        string text, 
        CancellationToken cancellationToken = default);
}

// Implement for accurate token counting with your tokenizer
public interface ITokenCounter
{
    Task<int> CountTokensAsync(string text, CancellationToken cancellationToken = default);
}

📊 Performance

Semantic Chunking: Streaming processing with IAsyncEnumerable for large documents
MMR Algorithm: ~2ms for 1,000 vectors, ~120KB memory allocation
Token Counting: Real GPT-4 compatible tokenizer using Microsoft.ML.Tokenizers

📦 Dependencies

MathNet.Numerics (>= 5.0.0): Vector operations and similarity calculations
Microsoft.ML.Tokenizers (>= 1.0.2): Real tokenization for accurate token counting
Microsoft.ML.Tokenizers.Data.Cl100kBase (>= 1.0.2): Vocabulary data for the cl100k_base encoding
.NET Standard 2.1: Target framework of the package

📖 Additional Resources

Repository: Source code and development information
MMR Documentation: Detailed MMR algorithm documentation
Examples: Sample implementations and use cases
API Reference: Complete API documentation

🌟 Support

Issues: GitHub Issues
Discussions: GitHub Discussions
Documentation: Wiki

Built with ❤️ for the AI community by AiGeekSquad

Product	Compatible and additional computed target framework versions.
.NET	net5.0 was computed. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed.
.NET Core	netcoreapp3.0 was computed. netcoreapp3.1 was computed.
.NET Standard	netstandard2.1 is compatible.
MonoAndroid	monoandroid was computed.
MonoMac	monomac was computed.
MonoTouch	monotouch was computed.
Tizen	tizen60 was computed.
Xamarin.iOS	xamarinios was computed.
Xamarin.Mac	xamarinmac was computed.
Xamarin.TVOS	xamarintvos was computed.
Xamarin.WatchOS	xamarinwatchos was computed.


.NETStandard 2.1
- MathNet.Numerics (>= 5.0.0)
- Microsoft.ML.Tokenizers (>= 1.0.2)
- Microsoft.ML.Tokenizers.Data.Cl100kBase (>= 1.0.2)

NuGet packages (1)

Showing the top 1 NuGet packages that depend on AiGeekSquad.AIContext:

Package	Downloads
AiGeekSquad.AIContext.MEAI Microsoft Extensions AI Abstractions adapter for AiGeekSquad.AIContext semantic chunking library. Enables seamless integration between Microsoft's AI abstractions and AIContext's semantic text chunking capabilities by providing an adapter that converts between Microsoft's IEmbeddingGenerator interface and AIContext's embedding requirements.	2.7K

GitHub repositories

This package is not used by any popular GitHub repositories.

Version	Downloads	Last Updated
1.0.32	0	7/29/2025
1.0.31	0	7/29/2025
1.0.30	0	7/29/2025
1.0.27	0	7/29/2025
1.0.26	469	7/22/2025
1.0.25	468	7/22/2025
1.0.24	469	7/22/2025
1.0.21	469	7/22/2025
1.0.20	467	7/21/2025
1.0.19	69	7/11/2025
1.0.18	65	7/11/2025
1.0.17	74	7/11/2025
1.0.16	71	7/11/2025
1.0.15	73	7/11/2025
1.0.14	82	7/11/2025