Chonkie.Embeddings 0.1.0-preview.87

This is a prerelease version of Chonkie.Embeddings.

dotnet add package Chonkie.Embeddings --version 0.1.0-preview.87

NuGet\Install-Package Chonkie.Embeddings -Version 0.1.0-preview.87

This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.

<PackageReference Include="Chonkie.Embeddings" Version="0.1.0-preview.87" />

For projects that support PackageReference, copy this XML node into the project file to reference the package.

Directory.Packages.props:

<PackageVersion Include="Chonkie.Embeddings" Version="0.1.0-preview.87" />

Project file:

<PackageReference Include="Chonkie.Embeddings" />

For projects that support Central Package Management (CPM), copy the PackageVersion node into the solution's Directory.Packages.props file to version the package, and copy the PackageReference node into the project file to reference it.

paket add Chonkie.Embeddings --version 0.1.0-preview.87

The NuGet Team does not provide support for this client. Please contact its maintainers for support.

#r "nuget: Chonkie.Embeddings, 0.1.0-preview.87"

#r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.

#:package Chonkie.Embeddings@0.1.0-preview.87

#:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package.

Install as a Cake Addin:

#addin nuget:?package=Chonkie.Embeddings&version=0.1.0-preview.87&prerelease

Install as a Cake Tool:

#tool nuget:?package=Chonkie.Embeddings&version=0.1.0-preview.87&prerelease

The NuGet Team does not provide support for this client. Please contact its maintainers for support.

Chonkie.Net - The Lightweight RAG Ingestion Library

Chonkie.Net is an experimental .NET/C# port of Python Chonkie, providing fast, efficient, and robust text chunking for Retrieval-Augmented Generation (RAG) systems. This is an independent port and is not officially affiliated with the original Chonkie project.

Key Features

Fast & Efficient - 10-100x faster than Python implementations
11 Specialized Chunkers - Choose the right chunker for your data type
7 Embedding Providers - OpenAI, Azure, Gemini, Cohere, VoyageAI, Jina, and ONNX local models
9 Vector Database Integrations - Pinecone, Qdrant, Chroma, Weaviate, MongoDB, Pgvector, Elasticsearch, Milvus, Turbopuffer
5 LLM Providers - OpenAI, Azure, Groq, Cerebras, Gemini
ONNX Support - Local embeddings with SentenceTransformers
Complete RAG Pipeline - End-to-end document processing for RAG
No Dependency Bloat - Minimal, modular architecture
Type-Safe - Full support for C# 14 nullable reference types
900+ Tests - Comprehensive unit and integration test suite

Quick Start

Installation

dotnet add package Chonkie.Net --prerelease

Basic Chunking (30 seconds)

using Chonkie.Chunkers;
using Chonkie.Tokenizers;

// Create a chunker
var chunker = new RecursiveChunker(
    tokenizer: new WordTokenizer(),
    chunkSize: 512
);

// Chunk your text
var text = "Your document here...";
var chunks = chunker.Chunk(text);

// Use the chunks
foreach (var chunk in chunks)
{
    Console.WriteLine($"Text: {chunk.Text}");
    Console.WriteLine($"Tokens: {chunk.TokenCount}");
}
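To make the chunkSize idea concrete, here is a minimal, library-free sketch of size-bounded chunking. It assumes a "token" is a whitespace-separated word; ChunkByWords is a hypothetical helper written for illustration and is not part of Chonkie.Net, whose RecursiveChunker may split text quite differently.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a Chonkie.Net API): groups whitespace-separated
// words into chunks of at most maxWords words each.
List<string> ChunkByWords(string input, int maxWords)
{
    if (maxWords <= 0)
        throw new ArgumentOutOfRangeException(nameof(maxWords));

    // Split on common whitespace characters and drop empty entries.
    var words = input.Split(
        new[] { ' ', '\t', '\r', '\n' },
        StringSplitOptions.RemoveEmptyEntries);

    var result = new List<string>();
    for (var start = 0; start < words.Length; start += maxWords)
    {
        // The last chunk may hold fewer than maxWords words.
        var count = Math.Min(maxWords, words.Length - start);
        result.Add(string.Join(' ', words, start, count));
    }
    return result;
}

// Prints "one two", "three four", "five" on separate lines.
foreach (var piece in ChunkByWords("one two three four five", 2))
{
    Console.WriteLine(piece);
}
```

A fixed word window like this ignores sentence and paragraph boundaries, which is the gap that structure-aware chunkers are meant to close.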

With Embeddings & Vector Database

using Chonkie.Embeddings;
using Chonkie.Handshakes;

// Create embeddings
var embeddings = new OpenAIEmbeddings(
    apiKey: Environment.GetEnvironmentVariable("OPENAI_API_KEY")!
);

// Create vector database connection
var vectorDb = new PineconeHandshake(
    apiKey: "your-pinecone-key",
    indexName: "my-index",
    embeddingModel: embeddings
);

// Store the chunks from the Basic Chunking example (vectorDb embeds them internally)
await vectorDb.WriteAsync(chunks);
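Once chunks are stored with their embeddings, a vector database typically answers queries by comparing embedding vectors, and cosine similarity is one common measure. The sketch below is a library-free illustration with made-up three-dimensional vectors. CosineSimilarity here is a hypothetical local helper, not a Chonkie.Net API, and the metric a given index actually uses depends on how that index is configured.

```csharp
using System;

// Hypothetical helper (not a Chonkie.Net API): cosine similarity of two
// equal-length vectors. Returns 0 when either vector has zero length.
float CosineSimilarity(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same length.");

    float dot = 0f, normA = 0f, normB = 0f;
    for (var i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }

    if (normA == 0f || normB == 0f)
        return 0f;

    return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
}

// Toy vectors standing in for real embeddings.
float[] queryVector = { 1f, 0f, 1f };
float[] similarChunk = { 1f, 0f, 1f };
float[] unrelatedChunk = { 0f, 1f, 0f };

// Identical directions score close to 1; orthogonal vectors score 0.
Console.WriteLine(CosineSimilarity(queryVector, similarChunk));
Console.WriteLine(CosineSimilarity(queryVector, unrelatedChunk));
```

The System.Numerics.Tensors package listed among this package's dependencies provides TensorPrimitives.CosineSimilarity, which is likely a better choice than a hand-written loop in real code.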

Documentation

Quick Start Guide - Get started in 5 minutes
RAG System Tutorial - Build a complete RAG system
Chunker Selection Guide - Choose the right chunker
Vector Database Integration - Connect to any vector DB
Python Migration Guide - Coming from Python Chonkie?

Chunkers (11 Types)

Chunker	Best For	Speed
TokenChunker	Simple, fast splitting	⚡⚡⚡
RecursiveChunker	Natural documents (RECOMMENDED)	⚡⚡
SentenceChunker	Sentence boundaries	⚡⚡
SemanticChunker	Meaning-aware grouping	⚡
CodeChunker	Source code	⚡⚡
TableChunker	Structured data	⚡⚡
MarkdownChunker	Markdown documents	⚡⚡
LateChunker	Two-stage processing	⚡
NeuralChunker	ONNX embeddings	⚡
SlumberChunker	Complex documents	⚡
FastChunker	High-speed splitting	⚡⚡⚡

Embeddings (7 Providers)

OpenAI
Azure OpenAI
Google Gemini
Cohere
VoyageAI
Jina
Local ONNX (SentenceTransformers)

LLM Providers (5 Types)

OpenAI
Azure OpenAI
Groq (fast inference)
Cerebras (ultra-fast)
Google Gemini

Vector Databases (9 Integrations)

Pinecone - Fully managed serverless
Qdrant - Open-source vector search
Chroma - Lightweight local embedding DB
Weaviate - Open-source, flexible
MongoDB - MongoDB Atlas Vector Search
PostgreSQL - pgvector extension
Elasticsearch - Search-optimized
Milvus - High-performance distributed
Turbopuffer - Real-time, edge-optimized

Common Use Cases

1. Document Ingestion for RAG

// Chunk documents, embed, and store in vector DB
var chunks = chunker.Chunk(document);
await vectorDb.WriteAsync(chunks);

2. Code Analysis

var codeChunker = new CodeChunker(
    tokenizer: new WordTokenizer(),
    chunkSize: 1024
);
var chunks = codeChunker.Chunk(sourceCode);

3. Semantic Search

var semanticChunker = new SemanticChunker(
    tokenizer: new WordTokenizer(),
    embeddingModel: embeddings,
    threshold: 0.5f
);
var chunks = semanticChunker.Chunk(text);
// Chunks grouped by semantic meaning
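One plausible reading of the threshold parameter is a similarity cutoff: consecutive sentences stay in the same chunk while the embeddings of neighbours are similar enough. The sketch below illustrates that idea with toy two-dimensional vectors. It is an assumption for illustration only, not a description of how SemanticChunker is implemented, and GroupBySimilarity and Cosine are hypothetical helpers.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: cosine similarity of two equal-length vectors.
float Cosine(float[] a, float[] b)
{
    float dot = 0f, normA = 0f, normB = 0f;
    for (var i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return (normA == 0f || normB == 0f)
        ? 0f
        : dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
}

// Hypothetical helper: starts a new group whenever the similarity between a
// sentence and its predecessor drops below the cutoff.
List<List<string>> GroupBySimilarity(
    IReadOnlyList<string> items,
    IReadOnlyList<float[]> itemVectors,
    float cutoff)
{
    if (items.Count != itemVectors.Count)
        throw new ArgumentException("Each sentence needs exactly one vector.");

    var result = new List<List<string>>();
    for (var i = 0; i < items.Count; i++)
    {
        var startsNewGroup =
            i == 0 || Cosine(itemVectors[i - 1], itemVectors[i]) < cutoff;
        if (startsNewGroup)
            result.Add(new List<string>());
        result[result.Count - 1].Add(items[i]);
    }
    return result;
}

// Toy data: the first two vectors point the same way, the third does not.
var demoSentences = new[] { "Cats purr.", "Cats nap often.", "Stocks fell today." };
var demoVectors = new[]
{
    new[] { 1f, 0f },
    new[] { 0.9f, 0.1f },
    new[] { 0f, 1f },
};

// Prints "Cats purr. Cats nap often." and then "Stocks fell today."
foreach (var group in GroupBySimilarity(demoSentences, demoVectors, 0.5f))
{
    Console.WriteLine(string.Join(" ", group));
}
```

Under this reading, raising the cutoff produces more and smaller groups, while lowering it merges more sentences together.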

4. RAG Pipeline

// Build the pipeline, then await the run to get its result
var result = await new Pipeline()
    .ProcessWith("text")
    .ChunkWith("recursive", new { chunk_size = 1024 })
    .RunAsync(texts: documentText);

Why Chonkie.Net?

✅ Type Safety - Full C# 14 support
✅ Almost Production Ready - 900+ tests, zero warnings
✅ Extensively Documented - Tutorials and guides
✅ Complete Features - Feature parity with Python Chonkie, all major RAG components included

Minimum Requirements

.NET 10.0 or higher
C# 14 features enabled
Windows, Linux, or macOS

Contributing

Contributions are welcome! Please visit the GitHub repository: https://github.com/gianni-rg/Chonkie.Net

License

Licensed under Apache License 2.0. See LICENSE for details.

Learn More

Official Repo: https://github.com/gianni-rg/Chonkie.Net
Python Chonkie: https://github.com/chonkie-inc/chonkie
Documentation: Check the /docs folder in the repository

Product	Compatible and additional computed target framework versions.
.NET	net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed.

net10.0
- Azure.AI.OpenAI (>= 2.8.0-beta.1)
- Chonkie.Core (>= 0.1.0-preview.87)
- Microsoft.Extensions.AI (>= 10.2.0)
- Microsoft.Extensions.AI.Abstractions (>= 10.2.0)
- Microsoft.Extensions.AI.OpenAI (>= 10.2.0-preview.1.26063.2)
- Microsoft.Extensions.Logging.Abstractions (>= 10.0.2)
- Microsoft.ML.OnnxRuntime (>= 1.24.1)
- Microsoft.ML.Tokenizers (>= 2.0.0)
- OllamaSharp (>= 5.4.16)
- OpenAI (>= 2.8.0)
- System.Numerics.Tensors (>= 10.0.2)

NuGet packages (5)

Showing the top 5 NuGet packages that depend on Chonkie.Embeddings:

Package	Description	Downloads
Chonkie.Net	Meta-package that depends on all Chonkie.Net libraries.	179
Chonkie.Handshakes	The lightweight ingestion library for fast, efficient and robust RAG pipelines. Chonkie.Net provides production-ready chunkers, embeddings, vector database integrations, and complete RAG system support.	83
Chonkie.Refineries	The lightweight ingestion library for fast, efficient and robust RAG pipelines. Chonkie.Net provides production-ready chunkers, embeddings, vector database integrations, and complete RAG system support.	82
Chonkie.Pipeline	The lightweight ingestion library for fast, efficient and robust RAG pipelines. Chonkie.Net provides production-ready chunkers, embeddings, vector database integrations, and complete RAG system support.	81
Chonkie.Chunkers	The lightweight ingestion library for fast, efficient and robust RAG pipelines. Chonkie.Net provides production-ready chunkers, embeddings, vector database integrations, and complete RAG system support.	80

GitHub repositories

This package is not used by any popular GitHub repositories.

Version	Downloads	Last Updated
0.1.0-preview.87	86	2/16/2026