HnswLite.RamStorage 1.0.2

dotnet add package HnswLite.RamStorage --version 1.0.2
                    
NuGet\Install-Package HnswLite.RamStorage -Version 1.0.2
                    
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.
<PackageReference Include="HnswLite.RamStorage" Version="1.0.2" />
                    
For projects that support PackageReference, copy this XML node into the project file to reference the package.
<PackageVersion Include="HnswLite.RamStorage" Version="1.0.2" />
                    
Directory.Packages.props
<PackageReference Include="HnswLite.RamStorage" />
                    
Project file
For projects that support Central Package Management (CPM), copy this XML node into the solution Directory.Packages.props file to version the package.
paket add HnswLite.RamStorage --version 1.0.2
                    
#r "nuget: HnswLite.RamStorage, 1.0.2"
                    
#r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.
#:package HnswLite.RamStorage@1.0.2
                    
#:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package.
#addin nuget:?package=HnswLite.RamStorage&version=1.0.2
                    
Install as a Cake Addin
#tool nuget:?package=HnswLite.RamStorage&version=1.0.2
                    
Install as a Cake Tool

<img src="https://github.com/jchristn/HnswLite/blob/main/assets/logo.png" width="256" height="256">

HnswLite

A pure C# implementation of Hierarchical Navigable Small World (HNSW) graphs for approximate nearest neighbor search. This library provides a thread-safe, embeddable solution for vector similarity search in .NET applications.

Note: This library is in its early stages of development. We welcome your patience, constructive feedback, and contributions! Please be kind and considerate when reporting issues or suggesting improvements. I am not an expert on this topic and relied heavily on available AI tools to build this library. Pull requests are greatly appreciated!

NuGet Version NuGet

Overview

HnswLite implements the Hierarchical Navigable Small World algorithm, which provides fast approximate nearest neighbor search with excellent recall rates. The library is designed to be embeddable, extensible, and easy to use in any .NET application.

Key Features

  • Pure C# implementation - No native dependencies
  • Thread-safe - Safe for concurrent operations
  • Async/await support - Modern async APIs with cancellation tokens
  • Pluggable storage - Extensible storage backend interface
  • Multiple distance metrics - Euclidean, Cosine, and Dot Product
  • Flexible deployment - Support for RAM or Sqlite, or build-your-own backend
  • Batch operations - Efficient bulk insert and remove
  • Comprehensive validation - Input validation and error handling

New in v1.0.x

Initial version featuring:

  • Core HNSW algorithm implementation
  • In-memory storage backend
  • Sqlite storage backend
  • Async APIs with cancellation support
  • Three distance functions (Euclidean, Cosine, Dot Product)
  • Batch add/remove operations
  • State export/import functionality
  • Thread-safety

For version history, see CHANGELOG.md.

Use Cases

HnswLite is ideal for:

  • Semantic Search - Find similar documents, sentences, or paragraphs based on embeddings
  • Recommendation Systems - Discover similar items, users, or content
  • Image Similarity - Search for visually similar images using feature vectors
  • Anomaly Detection - Identify outliers by finding distant neighbors
  • Clustering - Group similar items together based on vector proximity
  • RAG Applications - Retrieval-Augmented Generation for LLM applications
  • Duplicate Detection - Find near-duplicate content in large datasets

Performance and Scalability Recommendations

  • Vector dimensions: 50-1000 (optimal: 128-768)
  • Dataset size: Up to 1-10M vectors (depending on dimensions and available RAM)
  • Memory usage: Approximately (vector_count * dimension * 4 bytes) + (vector_count * M * 32 bytes)

Note: the above are estimations. This library has not been tested (yet) at any level of scale.

Performance Characteristics

  • Index build time: O(N log N) expected
  • Search time: O(log N) expected
  • Memory complexity: O(N * M) where M is the connectivity parameter

Optimization Tips

  1. Parameter Tuning:
  • M: Number of connections per vector (default: 16). Think of this as how many "friends" each vector has in the network. More connections mean better search quality but use more memory. For most cases, 16-32 works well.
  • EfConstruction: Size of the candidate list when building the index (default: 200). This controls how thoroughly the algorithm searches for connections when adding new vectors. Higher values create better quality indices but take longer to build.
  • Ef (search parameter): Size of the candidate list during search (default: 50-200). This controls how many paths the algorithm explores when searching. Higher values find better results but take more time. Set this based on your speed/quality needs.
  1. For Large Datasets:
  • Leverage the Sqlite implementation or build your own disk-based storage backend
  • Use batch operations for initial index building
  • Monitor memory usage closely
  1. For High-Dimensional Data:
  • Consider dimensionality reduction before indexing
  • Use cosine distance for normalized embeddings

Bugs, Feedback, or Enhancement Requests

We value your input! If you encounter any issues or have suggestions:

  • Bug Reports: Please file an issue with reproduction steps
  • Feature Requests: Start a discussion or create an issue
  • Questions: Use the discussions forum for general questions
  • Contributions: Pull requests are welcome! Please read our contributing guidelines first

Simple Example

using Hnsw;
using HnswIndex.RamStorage;
using HnswIndex.SqliteStorage;

// Create an index for 128-dimensional vectors in RAM
var index = new HnswIndex(128, new RamHnswStorage(), new RamHnswLayerStorage());

// Or using Sqlite
var sqliteStorage = new SqliteHnswStorage("my-index.db");
var sqliteLayerStorage = new SqliteHnswLayerStorage(sqliteStorage.Connection);
var sqliteIndex = new HnswIndex(2, sqliteStorage, sqliteLayerStorage);

// Configure parameters (optional)
index.M = 16;
index.EfConstruction = 200;
index.DistanceFunction = new CosineDistance();

// Add vectors to the index
var vectorId = Guid.NewGuid();
var vector = new List<float>(128); // Your 128-dimensional embedding
// ... populate vector with data ...

await index.AddAsync(vectorId, vector);

// Add multiple vectors
var batch = new List<(Guid id, List<float> vector)>();
for (int i = 0; i < 1000; i++)
{
    var id = Guid.NewGuid();
    var v = GenerateRandomVector(128); // Your vector generation logic
    batch.Add((id, v));
}
await index.AddBatchAsync(batch);

// Search for nearest neighbors
var queryVector = new List<float>(128); // Your query embedding
// ... populate query vector ...

var neighbors = await index.GetTopKAsync(queryVector, k: 10);

foreach (var result in neighbors)
{
    Console.WriteLine($"ID: {result.GUID}, Distance: {result.Distance:F4}");
}

// Save the index
var state = await index.ExportStateAsync();
// ... serialize state to disk ...

// Load the index
var newIndex = new HnswLite(dimension: 128);
await newIndex.ImportStateAsync(state);

Custom Storage Example

Refer to HnswIndex.RamStorage and HnswIndex.SqliteStorage for actual implementations. To implement your own backend, you need to implement:

  • IHnswLayerStorage
  • IHnswNode
  • IHnswStorage

License

This library is available under the MIT license.

Acknowledgments

This implementation is based on the paper: Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs by Yu. A. Malkov and D. A. Yashunin.

Product Compatible and additional computed target framework versions.
.NET net8.0 is compatible.  net8.0-android was computed.  net8.0-browser was computed.  net8.0-ios was computed.  net8.0-maccatalyst was computed.  net8.0-macos was computed.  net8.0-tvos was computed.  net8.0-windows was computed.  net9.0 was computed.  net9.0-android was computed.  net9.0-browser was computed.  net9.0-ios was computed.  net9.0-maccatalyst was computed.  net9.0-macos was computed.  net9.0-tvos was computed.  net9.0-windows was computed.  net10.0 was computed.  net10.0-android was computed.  net10.0-browser was computed.  net10.0-ios was computed.  net10.0-maccatalyst was computed.  net10.0-macos was computed.  net10.0-tvos was computed.  net10.0-windows was computed. 
Compatible target framework(s)
Included target framework(s) (in package)
Learn more about Target Frameworks and .NET Standard.

NuGet packages

This package is not used by any NuGet packages.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version Downloads Last Updated
1.0.2 363 7/21/2025
1.0.1 269 7/21/2025
1.0.0 270 7/21/2025

Initial release