Iciclecreek.Lucene.Net.Vector 2.0.2


Iciclecreek.Lucene.Net.Vector

Adds vector similarity search to Lucene.Net 4.8.

Embeddings are stored as BinaryDocValuesField and queried through standard Lucene Query classes that compose with BooleanQuery for filtered and hybrid search.

The library multi-targets netstandard2.0, net8.0, and net10.0:

  • .NET 10+: uses HNSW approximate nearest neighbor search (fast, via KnnVectorQuery)
  • All runtimes: brute-force cosine similarity with SIMD acceleration (exact, via CosineVectorQuery)
  • VectorQuery: smart wrapper that picks the best implementation automatically
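To make the "exact, brute-force" path concrete, here is a minimal sketch of top-K search by cosine similarity over in-memory vectors. It is plain C# and does not use or reproduce the library's code; the names Cosine and TopK are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cosine similarity between two equal-length vectors (scalar version, no SIMD).
float Cosine(float[] a, float[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same dimension.");

    float dot = 0f, normA = 0f, normB = 0f;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }

    // Treat a zero vector as having no similarity to anything.
    if (normA == 0f || normB == 0f) return 0f;
    return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
}

// Exact top-K: score every stored vector, then keep the k best.
// The position in the list stands in for a document id.
List<(int DocId, float Score)> TopK(IReadOnlyList<float[]> vectors, float[] query, int k)
{
    return vectors
        .Select((v, index) => (DocId: index, Score: Cosine(v, query)))
        .OrderByDescending(hit => hit.Score)
        .Take(k)
        .ToList();
}

var corpus = new List<float[]>
{
    new[] { 1f, 0f },
    new[] { 0f, 1f },
    new[] { 1f, 1f },
};

foreach (var (id, score) in TopK(corpus, new[] { 1f, 0f }, k: 2))
    Console.WriteLine($"{id}: {score:F3}");
```

Every vector is scored on each query, so the cost grows linearly with the number of documents. An approximate index such as HNSW avoids that full scan, and in exchange may miss some of the true nearest neighbors.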

Installation

dotnet add package Iciclecreek.Lucene.Net.Vector

Usage

Indexing vectors

Store embeddings alongside your documents using BinaryDocValuesField:

using Iciclecreek.Lucene.Net.Vector;
using Lucene.Net.Documents;
using Lucene.Net.Index;

var doc = new Document
{
    new StringField("id", "1", Field.Store.YES),
    new TextField("text", "The cat sat on the mat", Field.Store.YES),
    new StringField("category", "animals", Field.Store.YES),
    new BinaryDocValuesField("embedding", VectorSerializer.ToBytesRef(vector)),
};
writer.AddDocument(doc);
writer.Commit();

VectorQuery is the recommended entry point. It selects HNSW or brute-force search automatically, depending on the runtime:

using var reader = DirectoryReader.Open(directory);
var searcher = new IndexSearcher(reader);
var query = new VectorQuery("embedding", queryVector, k: 10, reader);
var topDocs = searcher.Search(query, 10);

foreach (var scoreDoc in topDocs.ScoreDocs)
{
    var doc = searcher.Doc(scoreDoc.Doc);
    Console.WriteLine($"{doc.Get("id")}: {scoreDoc.Score}");
}

Query classes

Class | Runtime | Algorithm | Use case
--- | --- | --- | ---
VectorQuery | All | Auto-selects CosineVectorQuery or KnnVectorQuery based on the runtime | Top-K nearest neighbor search
CosineVectorQuery | All | Brute-force cosine similarity (SIMD) | Top-K nearest neighbor search
KnnVectorQuery | .NET 10+ | HNSW approximate nearest neighbor | Top-K nearest neighbor search
VectorScoreQuery | All | Cosine re-scoring via CustomScoreQuery | Re-rank results from any Lucene query

The first three find the K most similar vectors and return them as results. VectorScoreQuery works the other way around: an existing query decides which documents match, and VectorScoreQuery replaces each match's score with its cosine similarity to the query vector. Use it when a filter or full-text query already selects the documents and you only need to rank them by vector similarity, without a separate top-K pass.

All four extend Lucene.Net.Search.Query and compose naturally with BooleanQuery.

Compose any vector query with BooleanQuery to combine similarity with field filters:

var boolQuery = new BooleanQuery
{
    { new TermQuery(new Term("category", "animals")), Occur.MUST },
    { new VectorQuery("embedding", queryVector, k: 10, reader), Occur.MUST },
};
var topDocs = searcher.Search(boolQuery, 10);

Vector re-scoring

VectorScoreQuery re-ranks results from any Lucene query by cosine similarity. The sub-query controls which documents match; the vector score controls ranking:

// Re-rank full-text search results by vector similarity
var query = new VectorScoreQuery(
    new TermQuery(new Term("category", "animals")),  // filter: which docs match
    "embedding",                                      // vector field name
    queryVector);                                     // query vector

var topDocs = searcher.Search(query, 10);
// Results are filtered to "animals" category, ranked by cosine similarity

This is useful when you want Lucene's standard query engine to handle filtering (full-text, term, range, boolean) and just need vector similarity for ranking — no top-K limit, no HNSW index.
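The "filter first, then rank by similarity" idea can be sketched without Lucene at all. The example below is an illustration in plain C#, not the library's implementation; FilterThenRank and the tuple shape used for documents are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Scalar cosine similarity; returns 0 when either vector has zero length.
float Cosine(float[] a, float[] b)
{
    float dot = 0f, normA = 0f, normB = 0f;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    if (normA == 0f || normB == 0f) return 0f;
    return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
}

// Step 1: the filter decides which documents match.
// Step 2: every match is scored by cosine similarity, with no top-K cutoff.
List<(string Id, float Score)> FilterThenRank(
    IEnumerable<(string Id, string Category, float[] Embedding)> items,
    string category,
    float[] query)
{
    return items
        .Where(d => d.Category == category)
        .Select(d => (d.Id, Score: Cosine(d.Embedding, query)))
        .OrderByDescending(hit => hit.Score)
        .ToList();
}

var sample = new List<(string Id, string Category, float[] Embedding)>
{
    ("a", "animals", new[] { 1f, 0f }),
    ("b", "plants",  new[] { 1f, 0f }),
    ("c", "animals", new[] { 0f, 1f }),
    ("d", "animals", new[] { 1f, 1f }),
};

foreach (var (id, score) in FilterThenRank(sample, "animals", new[] { 1f, 0f }))
    Console.WriteLine($"{id}: {score:F3}");
```

Document "b" is as close to the query as "a", but it never appears in the output because the filter excludes it before any similarity is computed.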

After updates

When you add, update, or delete documents, open a new reader -- the HNSW index rebuilds automatically:

writer.DeleteDocuments(new Term("id", "old-doc"));
writer.Commit();

using var newReader = DirectoryReader.Open(directory);
var searcher = new IndexSearcher(newReader);
var query = new VectorQuery("embedding", queryVector, k: 10, newReader);

HNSW options (.NET 10+)

Pass VectorIndexOptions to tune the HNSW graph when using KnnVectorQuery or VectorQuery:

var options = new VectorIndexOptions
{
    M = 16,                                        // Max edges per node (default: 16)
    ConstructionPruning = 200,                     // Build quality (default: 200)
    EfSearch = 50,                                 // Search quality (default: 50)
    Distance = VectorDistanceFunction.Cosine,      // Distance function (default: Cosine)
};
var query = new VectorQuery("embedding", queryVector, k: 10, reader, options);

On runtimes without HNSW, options are accepted but ignored -- CosineVectorQuery is used instead.

RAG (Retrieval-Augmented Generation)

A typical RAG pipeline: embed your documents at index time, then at query time embed the user's question, retrieve relevant context via vector search, and pass it to an LLM.

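using System;
using System.Linq;
using Iciclecreek.Lucene.Net.Vector;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;
using Lucene.Net.Util;
using Microsoft.Extensions.AI;

// --- Setup: any IEmbeddingGenerator (OpenAI, Ollama, local ONNX, etc.) ---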
IEmbeddingGenerator<string, Embedding<float>> embedder = ...;
IChatClient chatClient = ...;

// --- Index time: embed and store documents ---
using var directory = FSDirectory.Open("my-index");
using var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48);
var config = new IndexWriterConfig(LuceneVersion.LUCENE_48, analyzer);
using var writer = new IndexWriter(directory, config);

foreach (var chunk in documentChunks)
{
    var embedding = await embedder.GenerateAsync([chunk.Text]);
    writer.AddDocument(new Document
    {
        new StringField("id", chunk.Id, Field.Store.YES),
        new StoredField("text", chunk.Text),
        new BinaryDocValuesField("embedding",
            VectorSerializer.ToBytesRef(embedding[0].Vector.ToArray())),
    });
}
writer.Commit();

// --- Query time: embed question, retrieve context, generate answer ---
var question = "How does photosynthesis work?";
var questionEmbedding = await embedder.GenerateAsync([question]);
var queryVector = questionEmbedding[0].Vector.ToArray();

using var reader = DirectoryReader.Open(directory);
var searcher = new IndexSearcher(reader);
var query = new VectorQuery("embedding", queryVector, k: 5, reader);
var topDocs = searcher.Search(query, 5);

// Gather context from top matches
var context = string.Join("\n\n", topDocs.ScoreDocs
    .Select(sd => searcher.Doc(sd.Doc).Get("text")));

// Pass to LLM
var response = await chatClient.GetResponseAsync($"""
    Answer the question based on the following context:

    {context}

    Question: {question}
    """);

Console.WriteLine(response);

This composes with Lucene's full query model — you can add metadata filters, full-text search, or boost certain fields alongside the vector search:

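// RAG with metadata filter: only search recent documents.
// Assumes each document was indexed with an Int64Field named "timestamp"
// and that recentCutoff is a long in the same unit.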
var filteredRag = new BooleanQuery
{
    { NumericRangeQuery.NewInt64Range("timestamp", recentCutoff, null, true, false), Occur.MUST },
    { new VectorQuery("embedding", queryVector, k: 10, reader), Occur.MUST },
};

Utilities

  • VectorSerializer — converts between float[] and Lucene's BytesRef/byte[] for storage
  • VectorMath — SIMD-accelerated cosine similarity and vector norm utilities
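For readers curious what SIMD-accelerated cosine similarity can look like in .NET, here is a minimal sketch built on System.Numerics.Vector<float>. It is not the source of VectorMath; CosineSimd and Normalize are hypothetical names used only for this illustration.

```csharp
using System;
using System.Numerics;

// Cosine similarity that processes Vector<float>.Count elements per iteration.
float CosineSimd(float[] a, float[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same dimension.");

    float dot = 0f, normA = 0f, normB = 0f;
    int width = Vector<float>.Count; // lanes per hardware vector (platform dependent)
    int i = 0;

    // Vectorized part: whole blocks of `width` floats.
    for (; i <= a.Length - width; i += width)
    {
        var va = new Vector<float>(a, i);
        var vb = new Vector<float>(b, i);
        dot += Vector.Dot(va, vb);
        normA += Vector.Dot(va, va);
        normB += Vector.Dot(vb, vb);
    }

    // Scalar tail: the elements that do not fill a whole block.
    for (; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }

    if (normA == 0f || normB == 0f) return 0f;
    return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
}

// Returns a copy of the vector scaled to unit length (zero vectors are returned unchanged).
float[] Normalize(float[] v)
{
    float sumSquares = 0f;
    foreach (var x in v) sumSquares += x * x;
    float norm = MathF.Sqrt(sumSquares);
    return norm == 0f ? (float[])v.Clone() : Array.ConvertAll(v, x => x / norm);
}

var sampleVector = new[] { 1f, 2f, 3f, 4f, 5f, 6f, 7f, 8f, 9f };
Console.WriteLine(CosineSimd(sampleVector, Normalize(sampleVector)));
```

Scaling a vector does not change its direction, so the similarity between a vector and its normalized copy is 1 up to floating-point rounding. For unit-length inputs the denominator is 1, so cosine similarity reduces to a plain dot product.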

Notes

  • Vectors are stored in Lucene as BinaryDocValuesField (4 bytes per float, little-endian). The HNSW graph is an in-memory acceleration structure built automatically from DocValues.
  • HNSW caching — the graph is cached per (IndexReader, fieldName). Since an IndexReader is an immutable snapshot, the cached graph is valid for the reader's entire lifetime. Opening a new reader after commits triggers a fresh build.
  • Deleted documents are automatically excluded — Lucene's LiveDocs filtering ensures deleted docs are not loaded into the HNSW graph or evaluated by brute-force search.
  • Distance functions: Cosine (general purpose, handles unnormalized vectors) and CosineForUnits (faster, requires pre-normalized unit vectors). These apply to HNSW only; CosineVectorQuery always uses cosine similarity.
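The storage layout described in the notes above (4 bytes per float, little-endian) can be illustrated with a short round-trip sketch. This is not the source of VectorSerializer; ToBytes and FromBytes are hypothetical helpers, and any header or padding the library may add is not modeled here.

```csharp
using System;
using System.Buffers.Binary;

// Encode each float as its 4-byte IEEE 754 bit pattern, least significant byte first.
byte[] ToBytes(float[] vector)
{
    var buffer = new byte[vector.Length * sizeof(float)];
    for (int i = 0; i < vector.Length; i++)
    {
        int bits = BitConverter.SingleToInt32Bits(vector[i]);
        BinaryPrimitives.WriteInt32LittleEndian(buffer.AsSpan(i * sizeof(float)), bits);
    }
    return buffer;
}

// Decode a buffer produced by ToBytes back into floats.
float[] FromBytes(byte[] buffer)
{
    if (buffer.Length % sizeof(float) != 0)
        throw new ArgumentException("Buffer length must be a multiple of 4.");

    var vector = new float[buffer.Length / sizeof(float)];
    for (int i = 0; i < vector.Length; i++)
    {
        int bits = BinaryPrimitives.ReadInt32LittleEndian(buffer.AsSpan(i * sizeof(float)));
        vector[i] = BitConverter.Int32BitsToSingle(bits);
    }
    return vector;
}

var original = new[] { 1.5f, -2f, 0.25f };
var restored = FromBytes(ToBytes(original));
Console.WriteLine(string.Join(", ", restored));
```

Writing the byte order explicitly keeps the encoding identical on big-endian and little-endian machines. At this density a 1536-dimensional embedding occupies 6,144 bytes per document.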

License

MIT


NuGet packages (1)

The following NuGet package depends on Iciclecreek.Lucene.Net.Vector:

Package Downloads
Iciclecreek.Lucene.Net.Linq

Port of Lucene.Net.Linq to .NET core and Lucene 4.x. Execute LINQ queries on Lucene.Net complete with object to Document mapping

Version | Downloads | Last Updated
--- | --- | ---
2.0.2 | 269 | 4/22/2026
2.0.1 | 110 | 4/22/2026
2.0.0 | 95 | 4/21/2026
1.0.2 | 99 | 4/21/2026
1.0.0 | 94 | 4/20/2026