LocalEmbedder 0.3.1


LocalEmbedder


A simple, high-performance .NET library for local text embeddings with automatic model downloading from HuggingFace.

await using var model = await LocalEmbedder.LoadAsync("all-MiniLM-L6-v2");
var embedding = await model.EmbedAsync("Hello, world!");

Features

  • Zero Configuration - Just specify a model name, auto-downloads from HuggingFace
  • High Performance - SIMD-optimized operations with TensorPrimitives
  • Minimal Dependencies - Only ONNX Runtime and System.Numerics.Tensors
  • GPU Acceleration - CUDA, DirectML, CoreML support
  • Pre-configured Models - Popular embedding models ready to use
  • Resume Downloads - Automatically resumes interrupted downloads

Installation

dotnet add package LocalEmbedder

Quick Start

using LocalEmbedder;

// Load a model (auto-downloads if not cached)
await using var model = await LocalEmbedder.LoadAsync("all-MiniLM-L6-v2");

// Generate embedding
float[] embedding = await model.EmbedAsync("Your text here");

// Batch processing
float[][] embeddings = await model.EmbedAsync([
    "First sentence",
    "Second sentence",
    "Third sentence"
]);

Note: Both await using (recommended) and using patterns are supported for resource disposal.
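
As a sketch of the two disposal patterns mentioned above (assuming, per the note, that the model type supports both synchronous and asynchronous disposal):

```csharp
// Asynchronous disposal (recommended): releases native ONNX resources without blocking
await using (var model = await LocalEmbedder.LoadAsync("all-MiniLM-L6-v2"))
{
    var vector = await model.EmbedAsync("hello");
}

// Synchronous disposal: also supported, e.g. where async disposal is unavailable
using (var model2 = await LocalEmbedder.LoadAsync("all-MiniLM-L6-v2"))
{
    var vector2 = await model2.EmbedAsync("hello");
}
```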

Available Models

Model ID               Dimensions  Description
all-MiniLM-L6-v2       384         Fast, good-quality general-purpose
all-mpnet-base-v2      768         High-quality general-purpose
bge-small-en-v1.5      384         BAAI's efficient model
bge-base-en-v1.5       768         BAAI's high-quality model
multilingual-e5-small  384         Multilingual support
multilingual-e5-base   768         Multilingual, high quality

// List all available models
foreach (var modelId in LocalEmbedder.GetAvailableModels())
{
    Console.WriteLine(modelId);
}

Configuration

var model = await LocalEmbedder.LoadAsync("all-MiniLM-L6-v2", new EmbedderOptions
{
    CacheDirectory = "./models",           // Model cache location
    MaxSequenceLength = 512,               // Max tokens
    NormalizeEmbeddings = true,            // L2 normalization
    Provider = ExecutionProvider.Auto      // Auto-select best (default)
});

Execution Providers

ExecutionProvider.Auto      // Default - automatically selects best available
ExecutionProvider.Cpu       // CPU only
ExecutionProvider.Cuda      // NVIDIA GPU
ExecutionProvider.DirectML  // Windows GPU (AMD, Intel, NVIDIA)
ExecutionProvider.CoreML    // Apple Silicon

The default Auto provider automatically selects:

  1. CUDA if NVIDIA GPU is available (highest performance)
  2. DirectML on Windows or CoreML on macOS if supported
  3. CPU as fallback (always available)
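
Auto already performs this fallback internally. If you want explicit control instead, one option is to attempt a specific provider and fall back manually (a sketch; the exact exception thrown when a provider is unavailable is an assumption, so a broad catch is used):

```csharp
IEmbeddingModel model;
try
{
    // Prefer CUDA; expected to fail on machines without an NVIDIA GPU
    model = await LocalEmbedder.LoadAsync("all-MiniLM-L6-v2",
        new EmbedderOptions { Provider = ExecutionProvider.Cuda });
}
catch (Exception)
{
    // Fall back to CPU, which is always available
    model = await LocalEmbedder.LoadAsync("all-MiniLM-L6-v2",
        new EmbedderOptions { Provider = ExecutionProvider.Cpu });
}
```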

Advanced Usage

Custom HuggingFace Model

// Load any ONNX model from HuggingFace
var model = await LocalEmbedder.LoadAsync("intfloat/multilingual-e5-large");

Local Model File

// Load from local path
var model = await LocalEmbedder.LoadAsync("/path/to/model.onnx");

Dependency Injection

// Register as singleton (note: this blocks the calling thread once at startup)
services.AddSingleton<IEmbeddingModel>(sp =>
    LocalEmbedder.LoadAsync("all-MiniLM-L6-v2").GetAwaiter().GetResult());

// Or use factory
services.AddSingleton<IEmbedderFactory, EmbedderFactory>();
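
If blocking during registration is undesirable, an alternative sketch (using only standard Microsoft.Extensions.DependencyInjection and Lazy<T>; this pattern is not part of the library itself) is to register a lazy, shared load:

```csharp
// Register a lazy, shared load; nothing blocks at startup and the model loads once
services.AddSingleton(_ => new Lazy<Task<IEmbeddingModel>>(
    async () => await LocalEmbedder.LoadAsync("all-MiniLM-L6-v2")));

// In a consumer: var model = await lazy.Value;
```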
Semantic Search

using var model = await LocalEmbedder.LoadAsync("all-MiniLM-L6-v2");

var query = await model.EmbedAsync("What is machine learning?");
var documents = await model.EmbedAsync([
    "Machine learning is a subset of AI",
    "The weather is nice today",
    "Deep learning uses neural networks"
]);

// Find most similar
var similarities = documents.Select(doc => 
    LocalEmbedder.CosineSimilarity(query, doc));
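
To turn those similarities into a ranking, a small extension of the example above (plain LINQ; nothing here beyond the documented CosineSimilarity call):

```csharp
var texts = new[]
{
    "Machine learning is a subset of AI",
    "The weather is nice today",
    "Deep learning uses neural networks"
};

// Pair each document with its similarity to the query, best match first
var ranked = documents
    .Select((doc, i) => (Text: texts[i], Score: LocalEmbedder.CosineSimilarity(query, doc)))
    .OrderByDescending(x => x.Score)
    .ToList();

Console.WriteLine($"Best match: {ranked[0].Text} ({ranked[0].Score:F3})");
```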

API Reference

IEmbeddingModel

public interface IEmbeddingModel : IDisposable
{
    string ModelId { get; }
    int Dimensions { get; }
    
    ValueTask<float[]> EmbedAsync(string text, CancellationToken ct = default);
    ValueTask<float[][]> EmbedAsync(IReadOnlyList<string> texts, CancellationToken ct = default);
}

Utility Methods

// Cosine similarity between two vectors
float similarity = LocalEmbedder.CosineSimilarity(vec1, vec2);

// Euclidean distance
float distance = LocalEmbedder.EuclideanDistance(vec1, vec2);

// Dot product
float dot = LocalEmbedder.DotProduct(vec1, vec2);
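
These metrics follow the standard definitions, and since System.Numerics.Tensors is already a dependency, equivalent results can be computed directly with TensorPrimitives (a sketch; whether the library delegates to these exact calls internally is an assumption):

```csharp
using System.Numerics.Tensors;

float[] a = { 1f, 0f, 0f };
float[] b = { 0.6f, 0.8f, 0f };

// cosine(a, b) = dot(a, b) / (|a| * |b|)  → 0.6 here, since both vectors are unit length
float cosine = TensorPrimitives.CosineSimilarity(a, b);

// Euclidean distance = sqrt(sum((a_i - b_i)^2))  → sqrt(0.8) ≈ 0.894 here
float distance = TensorPrimitives.Distance(a, b);

// dot(a, b) = sum(a_i * b_i)  → 0.6 here
float dot = TensorPrimitives.Dot(a, b);
```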

Performance Tips

  1. Reuse model instances - Loading is expensive, embed calls are cheap
  2. Batch your requests - EmbedAsync(string[]) is more efficient than multiple single calls
  3. Use GPU - Significant speedup for large batches
  4. Choose the right model - Smaller models (MiniLM) are much faster than large ones (E5-large)
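
Tip 2 in practice, sketched with the documented overloads: the batched call amortizes tokenization and inference overhead across all inputs instead of paying it per sentence.

```csharp
var sentences = new[] { "alpha", "beta", "gamma" };

// Slower: one inference pass per sentence
var oneByOne = new List<float[]>();
foreach (var sentence in sentences)
    oneByOne.Add(await model.EmbedAsync(sentence));

// Faster: a single batched pass over the same sentences
float[][] batched = await model.EmbedAsync(sentences);
```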

Requirements

  • .NET 10.0 or later
  • ONNX Runtime native libraries (included via NuGet)

License

MIT License - see LICENSE for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.


NuGet packages (2)

Showing the top 2 NuGet packages that depend on LocalEmbedder:

FileFlux

Complete document processing SDK optimized for RAG systems. Transform PDF, DOCX, Excel, PowerPoint, Markdown and other formats into high-quality chunks with intelligent semantic boundary detection. Includes advanced chunking strategies, metadata extraction, and performance optimization.

FluxCurator

Text preprocessing library for RAG pipelines: PII masking, content filtering, and intelligent chunking including semantic chunking with Korean language support.


Version  Downloads  Last Updated
0.3.1    321        11/26/2025
0.3.0    170        11/26/2025
0.2.0    163        11/26/2025
0.1.1    166        11/24/2025
0.1.0    176        11/24/2025