Retrievo.AzureOpenAI 0.3.0-preview.1

This is a prerelease version of Retrievo.AzureOpenAI.

There is a newer prerelease version of this package available.
See the version list below for details.

dotnet add package Retrievo.AzureOpenAI --version 0.3.0-preview.1

NuGet\Install-Package Retrievo.AzureOpenAI -Version 0.3.0-preview.1

This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.

<PackageReference Include="Retrievo.AzureOpenAI" Version="0.3.0-preview.1" />

For projects that support PackageReference, copy this XML node into the project file to reference the package.

Directory.Packages.props:

<PackageVersion Include="Retrievo.AzureOpenAI" Version="0.3.0-preview.1" />

Project file:

<PackageReference Include="Retrievo.AzureOpenAI" />

For projects that support Central Package Management (CPM), copy the PackageVersion node into the solution's Directory.Packages.props file to version the package, and the version-less PackageReference node into the project file.

paket add Retrievo.AzureOpenAI --version 0.3.0-preview.1

The NuGet Team does not provide support for this client. Please contact its maintainers for support.

#r "nuget: Retrievo.AzureOpenAI, 0.3.0-preview.1"

#r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.

#:package Retrievo.AzureOpenAI@0.3.0-preview.1

#:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package.

Install as a Cake Addin:

#addin nuget:?package=Retrievo.AzureOpenAI&version=0.3.0-preview.1&prerelease

Install as a Cake Tool:

#tool nuget:?package=Retrievo.AzureOpenAI&version=0.3.0-preview.1&prerelease

The NuGet Team does not provide support for this client. Please contact its maintainers for support.

Retrievo

Hybrid search for .NET — BM25 + vectors + RRF fusion, zero infrastructure

Retrievo is an open-source, in-process, in-memory search library for .NET that combines BM25 lexical matching with vector similarity search. Results are merged via Reciprocal Rank Fusion (RRF) into a single ranked list — no external servers, no databases, no infrastructure. Designed for corpora up to ~10k documents: local agent memory, small RAG pipelines, developer tools, and offline/edge scenarios.

Quick Install

dotnet add package Retrievo --prerelease

For Azure OpenAI embeddings:

dotnet add package Retrievo.AzureOpenAI --prerelease

Key Features

Core Search

Hybrid Retrieval: Combine BM25 and cosine similarity using RRF fusion.
Standalone Modes: Use lexical-only or vector-only search when needed.
Explain Mode: Detailed score breakdown for every search result.
Fielded Search: Title and body fields with independent boost weights.
Metadata Filters: Exact-match, range, and contains filtering post-fusion.
Field Definitions: Declare field types (String, StringArray) at index time for automatic filter semantics.
Finite Vector Validation: Rejects NaN/Infinity embeddings and query vectors with clear exceptions.
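
The finite-vector check mentioned above can be pictured as a guard that runs before a vector is indexed or searched. This is only a minimal sketch: the class name VectorGuard, the method EnsureFinite, and the exception type are hypothetical and are not taken from the Retrievo API.

```csharp
using System;

public static class VectorGuard
{
    // Hypothetical helper: throws if any component is NaN or +/-Infinity.
    // The exception type and message are assumptions, not Retrievo's own.
    public static void EnsureFinite(ReadOnlySpan<float> vector, string paramName)
    {
        for (int i = 0; i < vector.Length; i++)
        {
            if (!float.IsFinite(vector[i]))
            {
                throw new ArgumentException(
                    $"Vector component {i} is not finite ({vector[i]}).", paramName);
            }
        }
    }
}
```

Rejecting such values early matters because a single NaN makes every cosine score it touches NaN, which would leave the vector ranking undefined.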

Index Management

Fluent Builder: Clean API for batch construction and folder ingestion.
Mutable Index: Incremental upserts and deletes with thread-safe commits.
Zero Infrastructure: Runs entirely in-process with no external dependencies.
Auto-Embedding: Transparently embed documents at index time.

Developer Experience

SIMD Accelerated: Hardware intrinsics for fast brute-force vector math.
Query Diagnostics: Detailed timing breakdown for every pipeline stage.
Pluggable Providers: Easy integration with any embedding model or API.
CLI Tool: Powerful terminal interface for indexing and querying.
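
To make the SIMD point concrete, here is one way a brute-force cosine similarity can be vectorized in .NET with System.Numerics.Vector<float>. It is a sketch under assumptions, not Retrievo's actual implementation: the class name CosineSketch is hypothetical, and the library may use different intrinsics or pre-normalized vectors.

```csharp
using System;
using System.Numerics;

public static class CosineSketch
{
    // Cosine similarity with a SIMD main loop and a scalar tail.
    public static float Cosine(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        float dot = 0f, normA = 0f, normB = 0f;
        int i = 0;

        if (Vector.IsHardwareAccelerated)
        {
            int width = Vector<float>.Count;
            var dotAcc = Vector<float>.Zero;
            var aaAcc = Vector<float>.Zero;
            var bbAcc = Vector<float>.Zero;

            // Process 'width' floats per iteration.
            for (; i <= a.Length - width; i += width)
            {
                var va = new Vector<float>(a.Slice(i, width));
                var vb = new Vector<float>(b.Slice(i, width));
                dotAcc += va * vb;
                aaAcc += va * va;
                bbAcc += vb * vb;
            }

            // Horizontal sums of the lane accumulators.
            dot = Vector.Sum(dotAcc);
            normA = Vector.Sum(aaAcc);
            normB = Vector.Sum(bbAcc);
        }

        // Scalar tail (or the whole vector when SIMD is unavailable).
        for (; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        float denom = MathF.Sqrt(normA) * MathF.Sqrt(normB);
        return denom == 0f ? 0f : dot / denom;
    }
}
```

A brute-force search would call this once per stored vector and keep the TopK highest scores, which is why query cost grows linearly with corpus size.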

Quick Start

Build an index and search in a few lines:

using Retrievo;
using Retrievo.Models;

// The index is disposable; a using declaration releases it at the end of the scope.
using var index = new HybridSearchIndexBuilder()
    .AddDocument(new Document { Id = "1", Body = "Neural networks learn complex patterns." })
    .AddDocument(new Document { Id = "2", Body = "Kubernetes orchestrates container deployments." })
    .Build();

var response = index.Search(new HybridQuery { Text = "neural network training", TopK = 5 });

foreach (var r in response.Results)
    Console.WriteLine($"  {r.Id}: {r.Score:F4}");

Field Definitions

Declare field types at index time so filters automatically use the right matching strategy:

using var index = new HybridSearchIndexBuilder()
    .DefineField("tags", FieldType.StringArray)         // pipe-delimited by default
    .DefineField("categories", FieldType.StringArray, delimiter: ',')
    .AddDocument(new Document
    {
        Id = "1",
        Body = "Deep learning fundamentals",
        Metadata = new Dictionary<string, string>
        {
            ["tags"] = "ml|deep-learning|neural-nets",
            ["categories"] = "ai,education"
        }
    })
    .Build();

// StringArray fields auto-split and do contains-match; undeclared fields use exact-match
var response = index.Search(new HybridQuery
{
    Text = "deep learning",
    MetadataFilters = new Dictionary<string, string> { ["tags"] = "ml" }
});
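
The difference between the two matching strategies can be sketched in isolation. The helper below is hypothetical (FilterSketch.Matches is not a Retrievo API), and it assumes case-sensitive comparison and trimmed entries, neither of which the README specifies.

```csharp
using System;

public static class FilterSketch
{
    // Hypothetical illustration of filter semantics:
    //  - StringArray field: split the stored value on the delimiter and
    //    match if any element equals the filter value (contains-match).
    //  - Any other field: the whole stored value must equal the filter value.
    public static bool Matches(
        string storedValue, string filterValue, bool isStringArray, char delimiter = '|')
    {
        if (!isStringArray)
            return string.Equals(storedValue, filterValue, StringComparison.Ordinal);

        var items = storedValue.Split(
            delimiter,
            StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries);

        foreach (var item in items)
        {
            if (string.Equals(item, filterValue, StringComparison.Ordinal))
                return true;
        }
        return false;
    }
}
```

With this reading, the filter ["tags"] = "ml" matches the stored value "ml|deep-learning|neural-nets" only because "tags" was declared as a StringArray; an undeclared field would need the filter to equal the whole stored string.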

Azure OpenAI Embeddings

Plug in an embedding provider and Retrievo handles the rest — documents are embedded at build time, queries at search time.

using Retrievo.AzureOpenAI;

var provider = new AzureOpenAIEmbeddingProvider(
    new Uri("https://your-resource.openai.azure.com/"),
    "your-api-key",
    "text-embedding-3-small");

// Documents are auto-embedded during build
using var index = await new HybridSearchIndexBuilder()
    .AddFolder("./docs")  // loads *.md and *.txt recursively
    .WithEmbeddingProvider(provider)
    .BuildAsync();

// Query text is automatically converted to a vector
var response = await index.SearchAsync(new HybridQuery { Text = "how to deploy", TopK = 5 });

Architecture

HybridQuery
    |
    v
+-------------------+
|  HybridSearchIndex |  (orchestrator)
+-------------------+
    |           |
    v           v
+--------+  +----------+
| Lucene |  | Brute-   |
| BM25   |  | Force    |
| Search |  | Cosine   |
+--------+  +----------+
    |           |
    v           v
  ranked      ranked
  list        list
    \         /
     v       v
  +----------+
  | RRF      |
  | Fusion   |
  +----------+
       |
       v
  SearchResponse

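Reciprocal Rank Fusion merges multiple ranked lists without score normalization. For each document, the fused score sums one term per list in which the document appears: score(doc) = Σ weight / (k + rank), where rank is the document's position in that list and weight is that list's weight. Documents that rank high on both the lexical and vector lists get the biggest boost, which surfaces results that are semantically relevant and contain the right keywords.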
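
The fusion formula can be sketched as a small standalone function. This is an illustration, not Retrievo's source: the class RrfSketch is hypothetical, ranks are assumed to be 1-based, and ties are broken by document id purely to keep the output deterministic. The default k of 20 and the example weights mirror the default parameters listed under Benchmarks.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RrfSketch
{
    // Fuses ranked lists of document ids. Each list carries its own weight.
    // Rank is assumed 1-based, so a list's top hit contributes weight / (k + 1).
    public static List<KeyValuePair<string, double>> Fuse(
        IEnumerable<(IReadOnlyList<string> Ranking, double Weight)> lists,
        int k = 20)
    {
        var scores = new Dictionary<string, double>();

        foreach (var (ranking, weight) in lists)
        {
            for (int i = 0; i < ranking.Count; i++)
            {
                // Missing ids start at 0 (TryGetValue leaves 'current' at default).
                scores.TryGetValue(ranking[i], out double current);
                scores[ranking[i]] = current + weight / (k + i + 1);
            }
        }

        return scores
            .OrderByDescending(p => p.Value)
            .ThenBy(p => p.Key, StringComparer.Ordinal)
            .ToList();
    }
}
```

For example, fusing a lexical list [a, b, c] at weight 0.5 with a vector list [b, d, a] at weight 1.0 ranks b first: it sits near the top of both lists, while d appears in only one.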

Benchmarks

Retrieval Quality (NDCG@10)

Validated against BEIR with 245-configuration parameter sweeps per dataset:

Dataset	BM25	Vector-only	Hybrid (default)	Hybrid (tuned)	Anserini BM25
NFCorpus	0.325	0.384	0.392	0.392	0.325
SciFact	0.665	0.731	0.756	0.757	0.679

The default parameters (LexicalWeight=0.5, VectorWeight=1.0, RrfK=20, TitleBoost=0.5) were tuned via cross-dataset harmonic mean optimization.
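
For readers unfamiliar with the metric, NDCG@10 for a single query can be computed as sketched below. The class NdcgSketch is hypothetical and is not part of Retrievo or its benchmark harness. It assumes linear gain (relevance divided by log2(rank + 1)), which is the trec_eval convention; other tools use an exponential gain and will give different numbers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NdcgSketch
{
    // rankedRelevance[i]: graded relevance of the result returned at rank i + 1.
    // judgedRelevance: relevance grades of all judged documents for the query,
    // used to build the ideal ranking.
    public static double NdcgAtK(
        IReadOnlyList<int> rankedRelevance, IEnumerable<int> judgedRelevance, int k = 10)
    {
        double dcg = Dcg(rankedRelevance.Take(k));
        double idealDcg = Dcg(judgedRelevance.OrderByDescending(r => r).Take(k));
        return idealDcg == 0 ? 0 : dcg / idealDcg;
    }

    // Discounted cumulative gain with linear gain and a log2 rank discount.
    private static double Dcg(IEnumerable<int> gains)
    {
        double sum = 0;
        int rank = 1;
        foreach (int gain in gains)
        {
            sum += gain / Math.Log2(rank + 1);
            rank++;
        }
        return sum;
    }
}
```

The figures in the table would then be the mean of this per-query value over all test queries in each dataset.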

Query Latency

3,000 documents × 768-dimensional embeddings (text-embedding-3-small):

Operation	Latency
Vector-only query	< 5 ms
Lexical-only query	< 5 ms
Hybrid query (BM25 + vector + RRF)	< 10 ms
Index build (3k docs)	< 2 s

Roadmap

Phase	Status	Description
Phase 1	Done	MVP hybrid retrieval, CLI, Azure OpenAI provider
Phase 2	Done	Mutable index, fielded search, filters (exact, range, contains), field definitions, diagnostics
Phase 3	Planned	Snapshot export and import
Phase 4	Planned	ANN support for larger corpora

Build & Test

Requires .NET 8 SDK or later.

dotnet build
dotnet test

238 tests covering retrieval, vector math, fusion, mutable index, filters, field definitions, cancellation, and CLI integration — 0 warnings.

Known Limitations

Lexical (BM25) search is English-only: The lexical retrieval pipeline uses EnglishStemAnalyzer (StandardTokenizer → EnglishPossessiveFilter → LowerCaseFilter → English StopWords → PorterStemmer). Non-English text will not be properly tokenized or stemmed for BM25 matching.
Vector search is language-agnostic: Semantic search works with any language supported by your embedding model (e.g., multilingual embeddings). Hybrid search inherits the English-only limitation for its lexical component.
Brute-force vector search is O(n) per query: Designed for corpora up to ~10k documents. For larger corpora, consider ANN-based solutions (planned for Phase 4).
In-memory only: No persistence or crash recovery. The index must be rebuilt from source documents on each application start.
No concurrent writers on HybridSearchIndex: The immutable index is built once via the builder. Use MutableHybridSearchIndex for incremental upserts and deletes.
Single-process: No distributed or shared index support. The index lives in a single process's memory.
Workaround for non-English corpora: Use vector-only search by omitting lexical configuration, or configure a custom analyzer for your language in a fork.

License

MIT

Product	Compatible and additional computed target framework versions.
.NET	net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed.


net8.0
- Azure.AI.OpenAI (>= 2.1.0)
- Retrievo (>= 0.3.0-preview.1)

NuGet packages

This package is not used by any NuGet packages.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version	Downloads	Last Updated
0.3.0-preview.4	31	3/6/2026
0.3.0-preview.3	34	3/5/2026
0.3.0-preview.2	27	3/4/2026
0.3.0-preview.1	31	3/4/2026