AiGeekSquad.AIContext 1.0.14

AiGeekSquad.AIContext


A high-performance C# implementation of the Maximum Marginal Relevance (MMR) algorithm for intelligent document selection. This library provides efficient algorithms for balancing relevance and diversity in search results, making it ideal for AI applications, information retrieval systems, and content recommendation engines.

What is Maximum Marginal Relevance (MMR)?

Maximum Marginal Relevance (MMR) is a re-ranking algorithm designed to reduce redundancy while maintaining query relevance in information retrieval systems. Originally proposed by Carbonell and Goldstein (1998), MMR addresses the fundamental problem of selecting a diverse subset of documents from a larger set of relevant documents.

Mathematical Foundation

The MMR algorithm works by iteratively selecting documents that maximize a linear combination of two criteria:

  • Relevance: Similarity to the query vector (measured using cosine similarity)
  • Diversity: Dissimilarity to already selected documents (promoting variety in results)

The mathematical formulation is:

MMR = argmax[Di ∈ R \ S] { λ × Sim(Di, Q) − (1 − λ) × max[Dj ∈ S] Sim(Di, Dj) }

Where:

  • Di is a candidate document from the candidate set R that has not yet been selected
  • Q is the query vector
  • Dj ranges over the set S of already selected documents
  • λ controls the relevance-diversity trade-off (0.0 to 1.0)
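
To make the greedy selection loop concrete, here is a minimal language-agnostic reference implementation of the formula above, written in Python. This is an illustrative sketch only — it is not part of this library and only loosely mirrors its API:

```python
import math

def cosine_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def mmr(vectors, query, lam=0.5, top_k=None):
    """Greedy MMR: returns (index, vector) pairs in selection order."""
    top_k = len(vectors) if top_k is None else min(top_k, len(vectors))
    # Precompute query similarities once, as the formula reuses Sim(Di, Q)
    # on every iteration.
    rel = [cosine_sim(v, query) for v in vectors]
    selected, remaining = [], set(range(len(vectors)))
    while remaining and len(selected) < top_k:
        def score(i):
            # max similarity to any already-selected document (0 if none yet)
            max_sim = max((cosine_sim(vectors[i], vectors[j])
                           for j, _ in selected), default=0.0)
            return lam * rel[i] - (1 - lam) * max_sim
        best = max(remaining, key=score)
        selected.append((best, vectors[best]))
        remaining.remove(best)
    return selected
```

Note how the first pick is always the most query-relevant document (no documents are selected yet, so the diversity penalty is zero), and every later pick trades relevance against similarity to what has already been chosen.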

Why MMR is Useful

MMR solves several critical problems in information retrieval:

  1. Redundancy Reduction: Prevents selecting multiple similar documents
  2. Diversity Enhancement: Ensures variety in search results
  3. Relevance Preservation: Maintains connection to the original query
  4. Configurable Balance: Allows tuning between relevance and diversity

Installation

Install the package via NuGet Package Manager:

dotnet add package AiGeekSquad.AIContext

Or via Package Manager Console in Visual Studio:

Install-Package AiGeekSquad.AIContext

Or add directly to your .csproj file:

<PackageReference Include="AiGeekSquad.AIContext" Version="1.0.14" />

Quick Start

using MathNet.Numerics.LinearAlgebra;
using AiGeekSquad.AIContext;

// Create document embeddings (typically from a neural network)
var documents = new List<Vector<double>>
{
    Vector<double>.Build.DenseOfArray(new double[] { 0.8, 0.2, 0.1 }),  // Tech article
    Vector<double>.Build.DenseOfArray(new double[] { 0.7, 0.3, 0.2 }),  // Similar tech article
    Vector<double>.Build.DenseOfArray(new double[] { 0.1, 0.8, 0.3 }),  // Sports article
    Vector<double>.Build.DenseOfArray(new double[] { 0.2, 0.1, 0.9 })   // Arts article
};

// Query vector representing user's interest
var query = Vector<double>.Build.DenseOfArray(new double[] { 0.9, 0.1, 0.0 });

// Select top 3 diverse and relevant documents
var results = MaximumMarginalRelevance.ComputeMMR(
    vectors: documents,
    query: query,
    lambda: 0.7,  // Prefer relevance but include some diversity
    topK: 3
);

// Process results
foreach (var (index, vector) in results)
{
    Console.WriteLine($"Selected document {index} with embedding {vector}");
}

Usage Examples

1. Recommendation System

// For a recommendation system with user preference vector
var userPreferences = Vector<double>.Build.DenseOfArray(new double[] { 0.6, 0.3, 0.1 });
var itemEmbeddings = GetItemEmbeddings(); // Your method to get item vectors

// Get diverse recommendations
var recommendations = MaximumMarginalRelevance.ComputeMMR(
    vectors: itemEmbeddings,
    query: userPreferences,
    lambda: 0.5,  // Balanced relevance and diversity
    topK: 10      // Top 10 recommendations
);

var recommendedItems = recommendations.Select(r => GetItemById(r.index)).ToList();

2. RAG System Context Selection

// For Retrieval Augmented Generation (RAG) systems
var contextCandidates = await GetSemanticSearchResults(userQuery);
var queryEmbedding = await GetQueryEmbedding(userQuery);

// Select diverse context chunks to avoid redundancy
var contextForLLM = MaximumMarginalRelevance.ComputeMMR(
    vectors: contextCandidates.Select(c => c.Embedding).ToList(),
    query: queryEmbedding,
    lambda: 0.8,  // Prioritize relevance for accuracy
    topK: 5       // Limit context size for LLM
);

var finalContext = contextForLLM
    .Select(result => contextCandidates[result.index].Text)
    .ToList();

3. Different Lambda Values

var vectors = GetDocumentVectors();
var query = GetQueryVector();

// Pure relevance (λ = 1.0) - ignores diversity
var relevantResults = MaximumMarginalRelevance.ComputeMMR(vectors, query, lambda: 1.0, topK: 5);

// Balanced approach (λ = 0.5) - equal weight to relevance and diversity
var balancedResults = MaximumMarginalRelevance.ComputeMMR(vectors, query, lambda: 0.5, topK: 5);

// Pure diversity (λ = 0.0) - ignores query relevance
var diverseResults = MaximumMarginalRelevance.ComputeMMR(vectors, query, lambda: 0.0, topK: 5);

Performance Considerations

Time and Space Complexity

  • Time Complexity: O(n²k) where n is the number of input vectors and k is topK
  • Space Complexity: O(n) for similarity caching
  • Query similarities are precomputed once for efficiency

Benchmark Results

Based on comprehensive benchmarks using BenchmarkDotNet on .NET 9.0 with AVX-512 optimizations:

Real Performance Data (from actual benchmark runs):

  • 1,000 vectors, 10 dimensions, topK=5:
    • Pure Relevance (λ=1.0): 1.87ms ± 0.01ms
    • Pure Diversity (λ=0.0): 2.07ms ± 0.05ms
    • Balanced (λ=0.5): 1.89ms ± 0.04ms
  • Memory allocation: ~120KB per 1,000 vectors
  • GC pressure: Minimal (Gen 0/1/2: 0/0/0)

Performance Guidelines

| Vector Count | Dimensions | Recommended topK | Expected Performance | Memory Usage |
| --- | --- | --- | --- | --- |
| < 100 | Any | Any | < 0.5ms | < 10KB |
| 100-1,000 | < 100 | < 20 | 0.5-5ms | 10-200KB |
| 1,000-5,000 | < 500 | < 20 | 2-50ms | 200KB-2MB |
| > 5,000 | Any | < 20 | Consider pre-filtering | > 2MB |

Performance Notes:

  • Lambda values have minimal impact on performance (< 10% difference)
  • Higher dimensions increase memory usage linearly
  • TopK has minimal impact on performance for reasonable values (< 50)

Benchmark Methodology

The performance data above comes from comprehensive benchmarks using:

  • BenchmarkDotNet v0.13.12 for accurate measurements
  • .NET 9.0 runtime with AVX-512 optimizations
  • Multiple GC configurations (Workstation and Server GC)
  • Statistical analysis with confidence intervals
  • Memory diagnostics tracking allocations and GC pressure
  • Reproducible test data using fixed seed (42)

Benchmarks test various parameter combinations:

  • Vector counts: 100, 1,000, 5,000
  • Dimensions: 10, 100, 500
  • TopK values: 5, 10, 20
  • Lambda values: 0.0, 0.5, 1.0

Optimization Tips

  1. Pre-filtering: For large collections (>1000 vectors), consider pre-filtering with approximate similarity search
  2. Vector Normalization: Normalize input vectors to unit length for consistent cosine similarity behavior
  3. Appropriate topK: Set reasonable topK values to balance result quality and computational cost
  4. Lambda Tuning: Use lambda values between 0.3-0.7 for most practical applications
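
Tip 1 above can be sketched as a simple two-stage pipeline: first keep only the most query-similar candidates, then run MMR on that smaller pool. The Python helper below is a hypothetical illustration (not part of the library); `prefilter` returns original indices alongside the vectors so MMR results can still be mapped back to the full collection:

```python
import math

def cosine_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def prefilter(vectors, query, pool_size=100):
    """Return (original_index, vector) pairs for the pool_size
    most query-similar vectors, ranked by descending similarity."""
    ranked = sorted(range(len(vectors)),
                    key=lambda i: cosine_sim(vectors[i], query),
                    reverse=True)
    return [(i, vectors[i]) for i in ranked[:pool_size]]
```

In a production system the ranking stage would typically be an approximate nearest-neighbor index rather than this exact O(n) scan; the point is only that MMR's cost then depends on the pool size, not the full collection size.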

API Documentation

ComputeMMR Method

public static List<(int index, Vector<double> embedding)> ComputeMMR(
    List<Vector<double>> vectors, 
    Vector<double> query, 
    double lambda = 0.5, 
    int? topK = null)
Parameters
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| vectors | List<Vector<double>> | Required | Collection of vectors to select from. All vectors must have the same dimensions as the query vector. |
| query | Vector<double> | Required | Query vector representing the information need. Must have the same dimensionality as all input vectors. |
| lambda | double | 0.5 | Controls the relevance-diversity trade-off (0.0 to 1.0). Higher values prioritize relevance; lower values prioritize diversity. |
| topK | int? | null | Maximum number of vectors to select. If null, selects all vectors in MMR order. |
Returns

List<(int index, Vector<double> embedding)> - List of tuples containing:

  • index: Zero-based index of the vector in the original input collection
  • embedding: Reference to the original vector object

Results are ordered by selection priority according to the MMR algorithm.

Exceptions
| Exception | Condition |
| --- | --- |
| ArgumentNullException | When the query vector is null |
| ArgumentException | When lambda is outside the [0.0, 1.0] range or vectors have inconsistent dimensions |

Lambda Parameter Guide

| Lambda Value | Behavior | Use Case |
| --- | --- | --- |
| 1.0 | Pure relevance - selects vectors most similar to the query | Precision-focused search |
| 0.7-0.9 | Relevance-focused with some diversity | Most search applications |
| 0.5 | Balanced approach (recommended default) | General-purpose usage |
| 0.1-0.3 | Diversity-focused with some relevance | Content discovery |
| 0.0 | Pure diversity - ignores query relevance | Maximum variety |

Best Practices

Vector Preparation

  1. Normalize vectors to unit length for consistent cosine similarity behavior
  2. Ensure consistent dimensions across all vectors and query
  3. Use appropriate vector representations (embeddings from neural networks, TF-IDF, etc.)
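
As a language-agnostic illustration of point 1, normalizing every vector to unit length means cosine similarity reduces to a plain dot product, keeping scores in a predictable [-1, 1] range. The helper below is a sketch for illustration, not a library API:

```python
import math

def normalize(v):
    """Scale a vector to unit length; leaves the zero vector unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)

# For unit vectors, cosine similarity is just the dot product.
a = normalize([3.0, 4.0])
self_sim = sum(x * y for x, y in zip(a, a))  # close to 1.0 for any unit vector
```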

Parameter Selection

  1. Start with λ = 0.5 for balanced results
  2. Adjust λ based on use case:
    • Information retrieval: 0.6-0.8
    • Recommendation systems: 0.4-0.6
    • Content discovery: 0.2-0.4
  3. Set reasonable topK values to balance quality and performance

Integration Patterns

  1. Combine with approximate search for large datasets
  2. Cache query embeddings when processing multiple similar queries
  3. Monitor performance and adjust parameters based on user feedback

Common Use Cases

  • Search Result Diversification: Improve search engines by reducing redundant results
  • Recommendation Systems: Provide diverse product or content recommendations
  • Document Summarization: Select diverse sentences or paragraphs for summaries
  • Content Curation: Avoid redundant information in curated content feeds
  • RAG Systems: Select diverse context chunks for language model prompts
  • Research Paper Recommendation: Ensure topical diversity in academic recommendations

Dependencies

  • MathNet.Numerics (v5.0.0): Used for vector operations and cosine distance calculations
  • .NET 9.0: Target framework

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

We welcome contributions! Please follow these guidelines:

  1. Fork the repository and create a feature branch
  2. Write tests for new functionality
  3. Follow C# coding conventions and maintain code quality
  4. Update documentation for API changes
  5. Submit a pull request with a clear description

Development Setup

# Clone the repository
git clone https://github.com/AiGeekSquad/AIContext.git

# Restore dependencies
dotnet restore

# Build the project
dotnet build

# Run tests
dotnet test

# Run benchmarks (optional)
dotnet run --project src/AiGeekSquad.AIContext.Benchmarks/ --configuration Release

Testing

The project includes comprehensive unit tests covering:

  • Basic functionality with various lambda values
  • Edge cases (empty collections, invalid parameters)
  • Performance tests with larger datasets
  • Dimension validation and error handling

Run tests with:

dotnet test --verbosity normal

Benchmarks

Performance benchmarks are available in the AiGeekSquad.AIContext.Benchmarks project. These benchmarks test various parameter combinations to help you understand performance characteristics for your specific use case.

See Benchmarks README for detailed information.

Acknowledgments

  • Carbonell, J. and Goldstein, J. (1998) - Original MMR algorithm paper: "The Use of MMR, Diversity-Based Reranking for Reordering Documents and Producing Summaries"
  • MathNet.Numerics - Excellent numerical library for .NET
  • Community contributors - Thank you for your contributions and feedback

Support


Made with ❤️ by AiGeekSquad
