ByTech.BloomFilter 1.0.0

There is a newer version of this package available.
See the version list below for details.
dotnet add package ByTech.BloomFilter --version 1.0.0
                    
NuGet\Install-Package ByTech.BloomFilter -Version 1.0.0
                    
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.
<PackageReference Include="ByTech.BloomFilter" Version="1.0.0" />
                    
For projects that support PackageReference, copy this XML node into the project file to reference the package.
<PackageVersion Include="ByTech.BloomFilter" Version="1.0.0" />
                    
Directory.Packages.props
<PackageReference Include="ByTech.BloomFilter" />
                    
Project file
For projects that support Central Package Management (CPM), copy this XML node into the solution Directory.Packages.props file to version the package.
paket add ByTech.BloomFilter --version 1.0.0
                    
#r "nuget: ByTech.BloomFilter, 1.0.0"
                    
#r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.
#:package ByTech.BloomFilter@1.0.0
                    
#:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package.
#addin nuget:?package=ByTech.BloomFilter&version=1.0.0
                    
Install as a Cake Addin
#tool nuget:?package=ByTech.BloomFilter&version=1.0.0
                    
Install as a Cake Tool

ByTech.BloomFilter

NuGet NuGet Downloads .NET Tests License

A high-performance, benchmark-driven Bloom filter library for .NET 10. Zero-allocation hot path, span-first API, versioned binary serialization, and strong statistical validation.


What is a Bloom filter?

A Bloom filter is a space-efficient probabilistic data structure for membership testing. It can tell you "definitely not in the set" or "possibly in the set" — never the reverse.

  • No false negatives — if an element was added, MayContain always returns true
  • Possible false positivesMayContain may return true for elements never added
  • Space-efficient — uses far less memory than a hash set for large cardinalities
  • O(k) operations — both Add and MayContain are constant-time

Common use cases: cache filtering, duplicate detection, network routing, database query optimization, and pre-screening large datasets.


Quick start

using ByTech.BloomFilter;

// Create a filter for 1 million items with 1% false positive rate
var filter = BloomFilterBuilder
    .ForExpectedInsertions(1_000_000)
    .WithFalsePositiveRate(0.01)
    .Build();

// Add items
filter.Add("hello"u8);
filter.Add("world");

// Query membership
filter.MayContain("hello"u8);  // true  — was added
filter.MayContain("world");    // true  — was added
filter.MayContain("other");    // false — probably not added (or rare false positive)

Features

  • Zero-allocation hot pathAdd and MayContain allocate nothing on the heap
  • Span-first API — primary path is ReadOnlySpan<byte>; string and byte[] overloads delegate to it
  • Double-hashing position derivation — generates k bit positions from two hash values, avoiding k independent hash computations
  • Packed ulong[] bit storage — 64-bit word operations for high throughput
  • Configurable parameters — specify expected insertions and target false positive rate; the library computes optimal m and k
  • Versioned binary serialization — persist and restore filters with round-trip correctness and corruption detection
  • Diagnostics — saturation metrics, popcount, estimated current false positive rate

API

Core operations

public sealed class BloomFilter
{
    // Configuration
    public long ExpectedInsertions { get; }
    public double TargetFalsePositiveRate { get; }
    public long BitCount { get; }
    public int HashFunctionCount { get; }

    // Membership
    public void Add(ReadOnlySpan<byte> value);
    public bool MayContain(ReadOnlySpan<byte> value);

    // String overloads (UTF-8 encoded, delegates to span path)
    public void Add(string value);
    public bool MayContain(string value);

    // Lifecycle
    public void Clear();
    public BloomFilterSnapshot Snapshot();
}

Builder

var filter = BloomFilterBuilder
    .ForExpectedInsertions(10_000_000)
    .WithFalsePositiveRate(0.001)
    .Build();

Serialization

// Save
using var stream = File.Create("filter.bin");
BloomFilterSerializer.WriteTo(filter, stream);

// Load
using var input = File.OpenRead("filter.bin");
var restored = BloomFilterSerializer.ReadFrom(input);

// Membership is preserved
restored.MayContain("hello"u8);  // same result as original

How it works

Parameter calculation

Given expected insertions n and target false positive rate p:

m = -(n × ln(p)) / (ln(2)²)     — optimal bit count
k = (m / n) × ln(2)              — optimal hash function count

Double hashing

Instead of computing k independent hash functions, the library derives two 64-bit hashes (h1, h2) and generates positions:

position(i) = (h1 + i × h2) mod m    for i = 0, 1, ..., k-1

This preserves theoretical false positive guarantees while reducing hash computation cost to O(1).

Bit storage

Bits are packed into a ulong[] array. Set and test operations use word-level indexing and bit masking — no per-operation allocations.


Project structure

ByTech.BloomFilter.sln
├── src/
│   └── ByTech.BloomFilter/              # Core library
│       ├── Configuration/                # Parameter planning, validation
│       ├── Hashing/                      # Hash strategy, double-hashing
│       ├── Storage/                      # Packed ulong[] bit storage
│       ├── Serialization/                # Versioned binary format
│       └── Diagnostics/                  # Saturation metrics, snapshot
├── tests/
│   ├── ByTech.BloomFilter.Tests/         # Unit, integration, property-based, statistical
│   └── ByTech.BloomFilter.FuzzTests/     # Fuzz testing
├── benchmarks/
│   └── ByTech.BloomFilter.Benchmarks/    # BenchmarkDotNet suite
└── .docs/                                # Project documentation

Design principles

Benchmark-driven — architectural decisions (hash strategy, capacity alignment, mapping method) are backed by measured results, not intuition.

Correctness-first — zero false negatives verified by property-based tests. Observed false positive rate validated against theoretical envelope using statistical tests with large sample sizes.

Tiny hot pathAdd and MayContain are small, branch-light, allocation-free, and easy to inspect.

Extension-ready — internal design supports future counting, scalable, and concurrent variants without rewriting the core.


Benchmarks

dotnet run --project benchmarks/ByTech.BloomFilter.Benchmarks -c Release

Benchmark suite includes: add throughput, query throughput (present/absent keys), mixed workloads, and serialization. All hot-path benchmarks confirm zero heap allocations via [MemoryDiagnoser].

See benchmark methodology for environment requirements and interpretation rules.


Thread safety

The v1 BloomFilter class is not thread-safe. If you need concurrent access, provide external synchronization. This is a deliberate design choice — see the concurrency roadmap for future plans.


Limitations

  • Not thread-safe — v1 requires external synchronization for concurrent access
  • No deletion — standard Bloom filter; use a counting variant for deletion support
  • No generic T — v1 supports ReadOnlySpan<byte>, string, and byte[] only
  • Little-endian serialization — binary format assumes little-endian host (standard for all .NET 10 platforms)
  • Maximum capacity — 2^37 bits (~16 GB); exceeding this throws during construction

Version

Current: v1.0.0


License

Copyright 2026 ByTech. Author: Branimir Bajt. Licensed under the Apache License, Version 2.0.

Product Compatible and additional computed target framework versions.
.NET net10.0 is compatible.  net10.0-android was computed.  net10.0-browser was computed.  net10.0-ios was computed.  net10.0-maccatalyst was computed.  net10.0-macos was computed.  net10.0-tvos was computed.  net10.0-windows was computed. 
Compatible target framework(s)
Included target framework(s) (in package)
Learn more about Target Frameworks and .NET Standard.

NuGet packages (1)

Showing the top 1 NuGet packages that depend on ByTech.BloomFilter:

Package Downloads
ByTech.LsmTree

A disk-based, persistent Log-Structured Merge-tree for .NET 10. Provides a write-optimized sorted key-value store with WAL-based durability, leveled compaction, bloom-filter accelerated point lookups, and concurrent reads/writes. Keys and values are fully generic with custom codec support.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version Downloads Last Updated
1.2.0 136 4/7/2026
1.0.0 112 4/7/2026

v1.0.0 — Initial release. Core BloomFilter with Add/MayContain, parameter calculator, binary serialization, diagnostics.