LocalAI.Embedder 0.6.0

There is a newer version of this package available.
See the version list below for details.
.NET CLI
dotnet add package LocalAI.Embedder --version 0.6.0

Package Manager
NuGet\Install-Package LocalAI.Embedder -Version 0.6.0
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.

PackageReference
<PackageReference Include="LocalAI.Embedder" Version="0.6.0" />
For projects that support PackageReference, copy this XML node into the project file to reference the package.

Central Package Management (CPM)
Directory.Packages.props
<PackageVersion Include="LocalAI.Embedder" Version="0.6.0" />
Project file
<PackageReference Include="LocalAI.Embedder" />
For projects that support Central Package Management (CPM), copy the PackageVersion node into the solution's Directory.Packages.props file and the PackageReference node into the project file.

Paket CLI
paket add LocalAI.Embedder --version 0.6.0

Script & Interactive
#r "nuget: LocalAI.Embedder, 0.6.0"
The #r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.

#:package LocalAI.Embedder@0.6.0
The #:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package.

Cake
#addin nuget:?package=LocalAI.Embedder&version=0.6.0
Install as a Cake Addin

#tool nuget:?package=LocalAI.Embedder&version=0.6.0
Install as a Cake Tool

LocalAI


Philosophy

Start small. Download what you need. Run locally.

// This is all you need. No setup. No configuration. No API keys.
await using var model = await LocalEmbedder.LoadAsync("default");
float[] embedding = await model.EmbedAsync("Hello, world!");

LocalAI is designed around three core principles:

🪶 Minimal Footprint

Your application ships with zero bundled models. The base package is tiny. Models, tokenizers, and runtime components are downloaded only when first requested and cached for reuse.

⚡ Lazy Everything

First run:  LoadAsync("default") → Downloads model → Caches → Runs inference
Next runs:  LoadAsync("default") → Uses cached model → Runs inference instantly

No pre-download scripts. No model management. Just use it.
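
The cache-hit behaviour is easy to observe; here is a minimal sketch (timings are illustrative and depend on network speed and hardware):

using System.Diagnostics;
using LocalAI.Embedder;

var sw = Stopwatch.StartNew();
await using (var first = await LocalEmbedder.LoadAsync("default"))
{
    Console.WriteLine($"First load (downloads and caches the model): {sw.Elapsed}");
}

sw.Restart();
await using (var second = await LocalEmbedder.LoadAsync("default"))
{
    Console.WriteLine($"Second load (served from the local cache): {sw.Elapsed}");
}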

🎯 Zero Boilerplate

Traditional approach:

// ❌ Without LocalAI: 50+ lines of setup
var tokenizer = LoadTokenizer(modelPath);
var session = new InferenceSession(modelPath, sessionOptions);
var inputIds = tokenizer.Encode(text);
var attentionMask = CreateAttentionMask(inputIds);
var inputs = new List<NamedOnnxValue> { ... };
var outputs = session.Run(inputs);
var embeddings = PostProcess(outputs);
// ... error handling, pooling, normalization, cleanup ...

// ✅ With LocalAI: 2 lines
await using var model = await LocalEmbedder.LoadAsync("default");
float[] embedding = await model.EmbedAsync("Hello, world!");

Packages

| Package | Description | Status |
|---|---|---|
| LocalAI.Embedder | Text → Vector embeddings | NuGet |
| LocalAI.Reranker | Semantic reranking for search | NuGet |
| LocalAI.Generator | Text generation & chat | NuGet |
| LocalAI.Ocr | Document OCR | 📋 Planned |
| LocalAI.Captioner | Image → Text | 📋 Planned |
| LocalAI.Detector | Object detection | 📋 Planned |
| LocalAI.Translator | Neural machine translation | 📋 Planned |
| LocalAI.Segmenter | Image segmentation | 📋 Planned |
| LocalAI.Transcriber | Speech → Text (Whisper) | 📋 Planned |
| LocalAI.Synthesizer | Text → Speech | 📋 Planned |

Quick Start

Text Embeddings

using LocalAI.Embedder;

await using var model = await LocalEmbedder.LoadAsync("default");

// Single text
float[] embedding = await model.EmbedAsync("Hello, world!");

// Batch processing
float[][] embeddings = await model.EmbedBatchAsync(new[]
{
    "First document",
    "Second document",
    "Third document"
});

// Similarity
float similarity = model.CosineSimilarity(embeddings[0], embeddings[1]);
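
The batch API and CosineSimilarity compose into a tiny in-memory semantic search. A minimal sketch, using only the calls shown above (the corpus and query are made up for illustration):

using System.Linq;
using LocalAI.Embedder;

await using var model = await LocalEmbedder.LoadAsync("default");

string[] corpus =
{
    "Machine learning is a subset of artificial intelligence.",
    "The weather today is sunny and warm.",
    "Neural networks are trained with gradient descent."
};

// Embed the corpus once, then embed each incoming query.
float[][] corpusVectors = await model.EmbedBatchAsync(corpus);
float[] queryVector = await model.EmbedAsync("What is machine learning?");

// Rank documents by cosine similarity to the query.
var ranked = corpus
    .Select((text, i) => (Text: text, Score: model.CosineSimilarity(queryVector, corpusVectors[i])))
    .OrderByDescending(r => r.Score);

foreach (var (text, score) in ranked)
    Console.WriteLine($"[{score:F4}] {text}");

For larger corpora a vector index is the better fit, but for a few thousand documents a brute-force scan like this is often fast enough.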

Semantic Reranking

using LocalAI.Reranker;

await using var reranker = await LocalReranker.LoadAsync("default");

var results = await reranker.RerankAsync(
    query: "What is machine learning?",
    documents: new[]
    {
        "Machine learning is a subset of artificial intelligence...",
        "The weather today is sunny and warm...",
        "Deep learning uses neural networks..."
    },
    topK: 2
);

foreach (var result in results)
{
    Console.WriteLine($"[{result.Score:F4}] {result.Document}");
}

Text Generation

using LocalAI.Generator;

// Simple generation
var generator = await TextGeneratorBuilder.Create()
    .WithDefaultModel()  // default model alias (see Available Models below)
    .BuildAsync();

string response = await generator.GenerateCompleteAsync("What is machine learning?");
Console.WriteLine(response);

// Chat format
var messages = new[]
{
    new ChatMessage(ChatRole.System, "You are a helpful assistant."),
    new ChatMessage(ChatRole.User, "Explain quantum computing simply.")
};

string chatResponse = await generator.GenerateChatCompleteAsync(messages);

// Streaming
await foreach (var token in generator.GenerateAsync("Write a story:"))
{
    Console.Write(token);
}
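
The three released packages compose into a simple retrieve-rerank-generate (RAG-style) pipeline. The sketch below is illustrative only and uses just the calls shown above; the document corpus and prompt format are made up:

using System.Linq;
using LocalAI.Embedder;
using LocalAI.Generator;
using LocalAI.Reranker;

string question = "What is machine learning?";
string[] docs =
{
    "Machine learning is a subset of artificial intelligence...",
    "The weather today is sunny and warm...",
    "Deep learning uses neural networks..."
};

// 1. Retrieve: embed the corpus and keep the documents closest to the question.
await using var embedder = await LocalEmbedder.LoadAsync("default");
float[] queryVec = await embedder.EmbedAsync(question);
float[][] docVecs = await embedder.EmbedBatchAsync(docs);
string[] candidates = docs
    .Select((text, i) => (Text: text, Score: embedder.CosineSimilarity(queryVec, docVecs[i])))
    .OrderByDescending(c => c.Score)
    .Take(2)
    .Select(c => c.Text)
    .ToArray();

// 2. Rerank: re-score the candidates with the cross-encoder reranker.
await using var reranker = await LocalReranker.LoadAsync("default");
var reranked = await reranker.RerankAsync(question, candidates, topK: 1);

// 3. Generate: answer the question grounded in the best document.
var generator = await TextGeneratorBuilder.Create()
    .WithDefaultModel()
    .BuildAsync();

string prompt = $"Context: {reranked.First().Document}\n\nQuestion: {question}";
Console.WriteLine(await generator.GenerateCompleteAsync(prompt));

In a real application the candidates would typically come from a vector store and more than one passage would go into the prompt, but the shape of the pipeline stays the same.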

Available Models

Updated: 2025-01 based on MTEB leaderboard and community benchmarks

Embedder

| Alias | Model | Dims | Params | Context | Best For |
|---|---|---|---|---|---|
| default | bge-small-en-v1.5 | 384 | 33M | 512 | Balanced speed/quality |
| fast | all-MiniLM-L6-v2 | 384 | 22M | 256 | Ultra-low latency |
| quality | bge-base-en-v1.5 | 768 | 110M | 512 | Higher accuracy |
| large | nomic-embed-text-v1.5 | 768 | 137M | 8192 | Long context RAG |
| multilingual | multilingual-e5-base | 768 | 278M | 512 | 100+ languages |

Reranker

| Alias | Model | Params | Context | Best For |
|---|---|---|---|---|
| default | ms-marco-MiniLM-L-6-v2 | 22M | 512 | Balanced speed/quality |
| fast | ms-marco-TinyBERT-L-2-v2 | 4.4M | 512 | Ultra-low latency |
| quality | bge-reranker-base | 278M | 512 | Higher accuracy |
| large | bge-reranker-large | 560M | 512 | Best accuracy |
| multilingual | bge-reranker-v2-m3 | 568M | 8192 | Long docs, 100+ languages |

Generator

| Alias | Model | Params | Context | License | Best For |
|---|---|---|---|---|---|
| default | Phi-4-mini-instruct | 3.8B | 16K | MIT | Balanced reasoning |
| fast | Llama-3.2-1B-Instruct | 1B | 8K | Llama 3.2 | Ultra-fast inference |
| quality | phi-4 | 14B | 16K | MIT | Best reasoning |
| medium | Phi-3.5-mini-instruct | 3.8B | 128K | MIT | Long context |
| multilingual | gemma-2-2b-it | 2B | 8K | Gemma ToU | Multi-language |
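
The alias strings in these tables are what you pass when loading a model, so switching models is a one-string change. A small sketch (aliases taken from the tables above):

using LocalAI.Embedder;
using LocalAI.Reranker;

// Trade latency for accuracy by picking a different alias.
await using var embedder = await LocalEmbedder.LoadAsync("quality");      // bge-base-en-v1.5
await using var reranker = await LocalReranker.LoadAsync("multilingual"); // bge-reranker-v2-m3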

GPU Acceleration

GPU acceleration is automatic when detected:

// Auto-detect (default) - uses GPU if available, falls back to CPU
var options = new EmbedderOptions { Provider = ExecutionProvider.Auto };

// Or force a specific provider (each needs its own options instance)
var cudaOptions     = new EmbedderOptions { Provider = ExecutionProvider.Cuda };     // NVIDIA
var directMlOptions = new EmbedderOptions { Provider = ExecutionProvider.DirectML }; // Windows GPU
var coreMlOptions   = new EmbedderOptions { Provider = ExecutionProvider.CoreML };   // macOS

Install the appropriate ONNX Runtime package for GPU support:

dotnet add package Microsoft.ML.OnnxRuntime.Gpu       # NVIDIA CUDA
dotnet add package Microsoft.ML.OnnxRuntime.DirectML  # Windows (AMD, Intel, NVIDIA)
dotnet add package Microsoft.ML.OnnxRuntime.CoreML    # macOS
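
Once the runtime package is installed, the options object is passed when the model is loaded. A sketch that assumes LoadAsync accepts an options overload (check the package API for the exact signature):

using LocalAI.Embedder;

// Assumed overload: LoadAsync(alias, options).
var options = new EmbedderOptions { Provider = ExecutionProvider.Cuda };
await using var model = await LocalEmbedder.LoadAsync("default", options);
float[] embedding = await model.EmbedAsync("Hello, world!");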

Model Caching

Models are cached following HuggingFace Hub conventions:

  • Default: ~/.cache/huggingface/hub
  • Environment variables: HF_HUB_CACHE, HF_HOME, or XDG_CACHE_HOME
  • Manual override: new EmbedderOptions { CacheDirectory = "/path/to/cache" } (see the sketch below)
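
For example, to keep model files in a project-local folder rather than the shared HuggingFace cache; a sketch that assumes the same options overload as in the GPU section (the paths are illustrative):

using System;
using LocalAI.Embedder;

// Option 1: redirect the standard HuggingFace cache for this process (must run before the first load).
Environment.SetEnvironmentVariable("HF_HUB_CACHE", "/mnt/big-disk/hf-cache");

// Option 2: override per instance via options.
var options = new EmbedderOptions { CacheDirectory = "./model-cache" };
await using var model = await LocalEmbedder.LoadAsync("default", options);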

Requirements

  • .NET 10.0+
  • Windows, Linux, or macOS

Documentation


License

MIT License - see LICENSE for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Release Process

Releases are automated via GitHub Actions when Directory.Build.props is updated:

  1. Update the <Version> in Directory.Build.props
  2. Commit and push to main
  3. CI automatically publishes all packages to NuGet and creates a GitHub release

Requires NUGET_API_KEY secret configured in GitHub repository settings.
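
For reference, a minimal sketch of the relevant property in Directory.Build.props (the real file may contain other shared build settings):

<Project>
  <PropertyGroup>
    <!-- Bumping this value and pushing to main triggers the automated release. -->
    <Version>0.6.0</Version>
  </PropertyGroup>
</Project>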

Compatible and additional computed target framework versions:
.NET: net10.0 is compatible. net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, and net10.0-windows were computed.

NuGet packages

This package is not used by any NuGet packages.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version Downloads Last Updated
0.7.2 160 12/15/2025
0.7.1 90 12/15/2025
0.7.0 152 12/14/2025
0.6.0 102 12/13/2025
0.5.0 96 12/13/2025
0.4.0 104 12/13/2025