LocalAI.Embedder
0.5.0
dotnet add package LocalAI.Embedder --version 0.5.0
NuGet\Install-Package LocalAI.Embedder -Version 0.5.0
<PackageReference Include="LocalAI.Embedder" Version="0.5.0" />
<PackageVersion Include="LocalAI.Embedder" Version="0.5.0" />
<PackageReference Include="LocalAI.Embedder" />
paket add LocalAI.Embedder --version 0.5.0
#r "nuget: LocalAI.Embedder, 0.5.0"
#:package LocalAI.Embedder@0.5.0
#addin nuget:?package=LocalAI.Embedder&version=0.5.0
#tool nuget:?package=LocalAI.Embedder&version=0.5.0
LocalAI
Philosophy
Start small. Download what you need. Run locally.
// This is all you need. No setup. No configuration. No API keys.
await using var model = await LocalEmbedder.LoadAsync("default");
float[] embedding = await model.EmbedAsync("Hello, world!");
LocalAI is designed around three core principles:
Minimal Footprint
Your application ships with zero bundled models. The base package is tiny. Models, tokenizers, and runtime components are downloaded only when first requested and cached for reuse.
Lazy Everything
First run: LoadAsync("default") → Downloads model → Caches → Runs inference
Next runs: LoadAsync("default") → Uses cached model → Runs inference instantly
No pre-download scripts. No model management. Just use it.
Zero Boilerplate
Traditional approach:
// ❌ Without LocalAI: 50+ lines of setup
var tokenizer = LoadTokenizer(modelPath);
var session = new InferenceSession(modelPath, sessionOptions);
var inputIds = tokenizer.Encode(text);
var attentionMask = CreateAttentionMask(inputIds);
var inputs = new List<NamedOnnxValue> { ... };
var outputs = session.Run(inputs);
var embeddings = PostProcess(outputs);
// ... error handling, pooling, normalization, cleanup ...
// ✅ With LocalAI: 2 lines
await using var model = await LocalEmbedder.LoadAsync("default");
float[] embedding = await model.EmbedAsync("Hello, world!");
Packages
| Package | Description | Status |
|---|---|---|
| LocalAI.Embedder | Text → Vector embeddings | Available |
| LocalAI.Reranker | Semantic reranking for search | Available |
| LocalAI.Generator | Text generation & chat | Available |
| LocalAI.Transcriber | Speech → Text (Whisper) | Planned |
| LocalAI.Synthesizer | Text → Speech | Planned |
| LocalAI.Translator | Neural machine translation | Planned |
| LocalAI.Detector | Object detection | Planned |
| LocalAI.Segmenter | Image segmentation | Planned |
| LocalAI.Ocr | Document OCR | Planned |
| LocalAI.Captioner | Image → Text | Planned |
Quick Start
Text Embeddings
using LocalAI.Embedder;
await using var model = await LocalEmbedder.LoadAsync("default");
// Single text
float[] embedding = await model.EmbedAsync("Hello, world!");
// Batch processing
float[][] embeddings = await model.EmbedBatchAsync(new[]
{
"First document",
"Second document",
"Third document"
});
// Similarity
float similarity = model.CosineSimilarity(embeddings[0], embeddings[1]);
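The three calls above are enough for a small in-memory semantic search: embed the corpus once, embed each query, and rank documents by cosine similarity. A minimal sketch using only the API shown in this README (the documents and query are illustrative):

```csharp
using System;
using System.Linq;
using LocalAI.Embedder;

await using var model = await LocalEmbedder.LoadAsync("default");

// Embed the corpus once; these vectors can be reused across queries.
string[] docs =
{
    "Machine learning is a subset of artificial intelligence.",
    "The weather today is sunny and warm.",
    "Neural networks are trained with gradient descent."
};
float[][] docVectors = await model.EmbedBatchAsync(docs);

// Embed the query and rank documents by cosine similarity (higher = closer).
float[] query = await model.EmbedAsync("How do computers learn?");
var ranked = docs
    .Select((doc, i) => (Doc: doc, Score: model.CosineSimilarity(query, docVectors[i])))
    .OrderByDescending(r => r.Score);

foreach (var (doc, score) in ranked)
    Console.WriteLine($"{score:F4}  {doc}");
```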
Semantic Reranking
using LocalAI.Reranker;
await using var reranker = await LocalReranker.LoadAsync("default");
var results = await reranker.RerankAsync(
query: "What is machine learning?",
documents: new[]
{
"Machine learning is a subset of artificial intelligence...",
"The weather today is sunny and warm...",
"Deep learning uses neural networks..."
},
topK: 2
);
foreach (var result in results)
{
Console.WriteLine($"[{result.Score:F4}] {result.Document}");
}
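The embedder and reranker compose naturally into a two-stage retrieval pipeline: a cheap embedding pass shortlists candidates, then the cross-encoder reranker scores only the shortlist. A sketch using the APIs shown above (the corpus source, shortlist size, and `LoadCorpus` helper are illustrative assumptions):

```csharp
using System.Linq;
using LocalAI.Embedder;
using LocalAI.Reranker;

await using var embedder = await LocalEmbedder.LoadAsync("default");
await using var reranker = await LocalReranker.LoadAsync("default");

string query = "What is machine learning?";
string[] corpus = LoadCorpus(); // hypothetical helper: your document source

// Stage 1: fast vector shortlist (top 20 by cosine similarity).
float[] queryVec = await embedder.EmbedAsync(query);
float[][] corpusVecs = await embedder.EmbedBatchAsync(corpus);
string[] shortlist = corpus
    .Select((doc, i) => (Doc: doc, Score: embedder.CosineSimilarity(queryVec, corpusVecs[i])))
    .OrderByDescending(r => r.Score)
    .Take(20)
    .Select(r => r.Doc)
    .ToArray();

// Stage 2: accurate cross-encoder rerank over the shortlist only.
var results = await reranker.RerankAsync(query, shortlist, topK: 5);
```

This keeps the expensive cross-encoder off the full corpus while still getting its quality on the final ranking.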
Text Generation
using LocalAI.Generator;
// Simple generation
var generator = await TextGeneratorBuilder.Create()
.WithDefaultModel() // Phi-3.5 Mini
.BuildAsync();
string response = await generator.GenerateCompleteAsync("What is machine learning?");
Console.WriteLine(response);
// Chat format
var messages = new[]
{
new ChatMessage(ChatRole.System, "You are a helpful assistant."),
new ChatMessage(ChatRole.User, "Explain quantum computing simply.")
};
string chatResponse = await generator.GenerateChatCompleteAsync(messages);
// Streaming
await foreach (var token in generator.GenerateAsync("Write a story:"))
{
Console.Write(token);
}
Available Models
Embedder
| Alias | Model | Dimensions | Size |
|---|---|---|---|
| default | all-MiniLM-L6-v2 | 384 | ~90MB |
| large | all-mpnet-base-v2 | 768 | ~420MB |
| multilingual | paraphrase-multilingual-MiniLM-L12-v2 | 384 | ~470MB |
Reranker
| Alias | Model | Max Tokens | Size |
|---|---|---|---|
| default | ms-marco-MiniLM-L-6-v2 | 512 | ~90MB |
| quality | ms-marco-MiniLM-L-12-v2 | 512 | ~134MB |
| fast | ms-marco-TinyBERT-L-2-v2 | 512 | ~18MB |
| multilingual | bge-reranker-v2-m3 | 8192 | ~1.1GB |
Generator
| Alias | Model | Parameters | License |
|---|---|---|---|
| default | Phi-3.5-mini-instruct | 3.8B | MIT |
| fast | Llama-3.2-1B-Instruct | 1B | Llama 3.2 |
| quality | phi-4 | 14B | MIT |
| small | Llama-3.2-1B-Instruct | 1B | Llama 3.2 |
GPU Acceleration
GPU acceleration is automatic when detected:
// Auto-detect (default) - uses GPU if available, falls back to CPU
var options = new EmbedderOptions { Provider = ExecutionProvider.Auto };
// Force specific provider
var options = new EmbedderOptions { Provider = ExecutionProvider.Cuda }; // NVIDIA
var options = new EmbedderOptions { Provider = ExecutionProvider.DirectML }; // Windows GPU
var options = new EmbedderOptions { Provider = ExecutionProvider.CoreML }; // macOS
Install the appropriate ONNX Runtime package for GPU support:
dotnet add package Microsoft.ML.OnnxRuntime.Gpu # NVIDIA CUDA
dotnet add package Microsoft.ML.OnnxRuntime.DirectML # Windows (AMD, Intel, NVIDIA)
dotnet add package Microsoft.ML.OnnxRuntime.CoreML # macOS
Model Caching
Models are cached following HuggingFace Hub conventions:
- Default: ~/.cache/huggingface/hub
- Environment variables: HF_HUB_CACHE, HF_HOME, or XDG_CACHE_HOME
- Manual override: new EmbedderOptions { CacheDirectory = "/path/to/cache" }
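The GPU and caching sections construct EmbedderOptions but do not show it being passed to anything; presumably LoadAsync has an overload that accepts it. A hedged sketch, assuming such an overload exists (the exact parameter shape is an assumption, not confirmed by this README):

```csharp
using LocalAI.Embedder;

// Assumption: LoadAsync accepts an EmbedderOptions overload.
var options = new EmbedderOptions
{
    Provider = ExecutionProvider.Auto,  // GPU if available, else CPU
    CacheDirectory = "/path/to/cache"   // overrides the HuggingFace hub default
};
await using var model = await LocalEmbedder.LoadAsync("default", options);
```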
Requirements
- .NET 10.0+
- Windows, Linux, or macOS
Documentation
License
MIT License - see LICENSE for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Release Process
Releases are automated via GitHub Actions when Directory.Build.props is updated:
- Update the <Version> in Directory.Build.props
- Commit and push to main
- CI automatically publishes all packages to NuGet and creates a GitHub release
Requires NUGET_API_KEY secret configured in GitHub repository settings.
| Product | Compatible and additional computed target framework versions |
|---|---|
| .NET | net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
Dependencies (net10.0):
- LocalAI.Core (>= 0.5.0)
- System.Numerics.Tensors (>= 10.0.1)