TinyTokenizer 0.1.0
There is a newer version of this package available.
See the version list below for details.
.NET CLI:
dotnet add package TinyTokenizer --version 0.1.0
Package Manager:
NuGet\Install-Package TinyTokenizer -Version 0.1.0
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.
PackageReference:
<PackageReference Include="TinyTokenizer" Version="0.1.0" />
For projects that support PackageReference, copy this XML node into the project file to reference the package.
Central Package Management:
<PackageVersion Include="TinyTokenizer" Version="0.1.0" />
<PackageReference Include="TinyTokenizer" />
For projects that support Central Package Management (CPM), copy the PackageVersion node into the solution Directory.Packages.props file to version the package, and the PackageReference node into the project file.
Paket CLI:
paket add TinyTokenizer --version 0.1.0
The NuGet Team does not provide support for this client. Please contact its maintainers for support.
Script & Interactive:
#r "nuget: TinyTokenizer, 0.1.0"
The #r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.
File-based Apps:
#:package TinyTokenizer@0.1.0
The #:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package.
Cake:
#addin nuget:?package=TinyTokenizer&version=0.1.0
#tool nuget:?package=TinyTokenizer&version=0.1.0
The NuGet Team does not provide support for this client. Please contact its maintainers for support.
TinyTokenizer
A high-performance, zero-allocation tokenizer library for .NET that parses text into abstract tokens using ReadOnlySpan<char> for maximum efficiency.
Features
- Zero-allocation parsing: Uses ReadOnlySpan<char> internally for fast, allocation-free text traversal
- Recursive declaration blocks: Automatically parses nested {}, [], and () blocks with child tokens
- Configurable symbols: Define which characters are recognized as symbol tokens
- Immutable tokens: All token types are immutable record classes
- Error recovery: Gracefully handles malformed input with ErrorToken and continues parsing
Installation
Install the TinyTokenizer package from NuGet, or add a reference to the TinyTokenizer project, or include the source files in your solution.
Quick Start
using TinyTokenizer;
// Create a tokenizer with source text
var tokenizer = new Tokenizer("func(a, b)".AsMemory());
var tokens = tokenizer.Tokenize();
// tokens contains:
// - TextToken("func")
// - BlockToken("(a, b)") with children:
//     - TextToken("a")
//     - SymbolToken(",")
//     - WhitespaceToken(" ")
//     - TextToken("b")
Token Types
| Type | Description | Example |
|---|---|---|
| TextToken | Plain text content | hello, func, 123 |
| WhitespaceToken | Spaces, tabs, newlines | ' ', \t, \n |
| SymbolToken | Configurable symbol characters | /, :, ,, ; |
| BlockToken | Declaration blocks with delimiters | {...}, [...], (...) |
| ErrorToken | Parsing errors (unmatched delimiters) | } without opening { |
BlockToken Properties
var tokenizer = new Tokenizer("{inner content}".AsMemory());
var tokens = tokenizer.Tokenize();
var block = (BlockToken)tokens[0];
var full = block.FullContent; // "{inner content}" (includes delimiters)
var inner = block.InnerContent; // "inner content" (excludes delimiters)
var children = block.Children; // ImmutableArray<Token> of parsed inner tokens
var open = block.OpeningDelimiter; // '{'
var close = block.ClosingDelimiter; // '}'
var type = block.Type; // TokenType.BraceBlock
Configuration
Customize the tokenizer with TokenizerOptions:
// Default symbols: / : , ; = + - * < > ! & | . @ # ? % ^ ~ \
var options = TokenizerOptions.Default;
// Add custom symbols
options = TokenizerOptions.Default.WithAdditionalSymbols('$', '_');
// Remove symbols (they become part of text tokens)
options = TokenizerOptions.Default.WithoutSymbols('/');
// Replace entire symbol set
options = TokenizerOptions.Default.WithSymbols(':', ',', ';');
// Use with tokenizer
var tokenizer = new Tokenizer(source.AsMemory(), options);
Nested Blocks
Declaration blocks are parsed recursively:
var tokenizer = new Tokenizer("{outer [inner (deepest)]}".AsMemory());
var tokens = tokenizer.Tokenize();
var braceBlock = (BlockToken)tokens[0]; // {outer [inner (deepest)]}
var bracketBlock = (BlockToken)braceBlock.Children[2]; // [inner (deepest)]
var parenBlock = (BlockToken)bracketBlock.Children[2]; // (deepest)
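To make the recursion concrete, here is a minimal, self-contained sketch of how nested delimiters can be parsed into a tree over a ReadOnlySpan<char>. It is not TinyTokenizer's implementation: Node, TextNode, BlockNode, and BlockSketch are hypothetical stand-in names, whitespace and symbols are not split out, and error reporting is omitted (an unclosed block simply ends at the end of input, and a stray closing delimiter is kept as text).

```csharp
using System;
using System.Collections.Generic;

// Usage: parse the same input as the example above and report the nesting depth.
var nodes = BlockSketch.Parse("{outer [inner (deepest)]}");
Console.WriteLine(BlockSketch.MaxDepth(nodes)); // prints 3

// Hypothetical stand-in types for illustration; not TinyTokenizer's own classes.
public abstract record Node;
public sealed record TextNode(string Text) : Node;
public sealed record BlockNode(char Open, char Close, IReadOnlyList<Node> Children) : Node;

public static class BlockSketch
{
    // Returns the matching closing delimiter, or '\0' if c does not open a block.
    private static char ClosingFor(char c) => c switch
    {
        '{' => '}',
        '[' => ']',
        '(' => ')',
        _ => '\0'
    };

    public static List<Node> Parse(ReadOnlySpan<char> source)
    {
        int pos = 0;
        return ParseUntil(source, ref pos, '\0');
    }

    // Reads nodes until the expected closer is found, or until end of input.
    private static List<Node> ParseUntil(ReadOnlySpan<char> source, ref int pos, char closer)
    {
        var nodes = new List<Node>();
        int textStart = pos;

        while (pos < source.Length)
        {
            char c = source[pos];
            char nestedCloser = ClosingFor(c);

            if (nestedCloser != '\0')
            {
                // An opening delimiter: emit pending text, then recurse for the children.
                Flush(source, textStart, pos, nodes);
                pos++; // consume the opening delimiter
                var children = ParseUntil(source, ref pos, nestedCloser);
                nodes.Add(new BlockNode(c, nestedCloser, children));
                textStart = pos;
            }
            else if (closer != '\0' && c == closer)
            {
                // The closer for the current block: emit pending text and return to the parent.
                Flush(source, textStart, pos, nodes);
                pos++; // consume the closing delimiter
                return nodes;
            }
            else
            {
                pos++;
            }
        }

        Flush(source, textStart, pos, nodes);
        return nodes;
    }

    // Adds source[start..end] as a text node if the range is non-empty.
    private static void Flush(ReadOnlySpan<char> source, int start, int end, List<Node> nodes)
    {
        if (end > start)
            nodes.Add(new TextNode(source[start..end].ToString()));
    }

    // Deepest level of block nesting in the tree (0 if there are no blocks).
    public static int MaxDepth(IReadOnlyList<Node> nodes)
    {
        int depth = 0;
        foreach (var node in nodes)
            if (node is BlockNode block)
                depth = Math.Max(depth, 1 + MaxDepth(block.Children));
        return depth;
    }
}
```

Each opening delimiter starts one recursive call, so the call stack mirrors the block nesting. This keeps the code short, at the cost of stack depth growing with the nesting depth of the input.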
Error Handling
The tokenizer produces ErrorToken for malformed input and continues parsing:
var tokenizer = new Tokenizer("}hello{".AsMemory());
var tokens = tokenizer.Tokenize();
// tokens contains:
// - ErrorToken("}", "Unexpected closing delimiter '}'", position: 0)
// - TextToken("hello")
// - ErrorToken("{", "Unclosed block starting with '{'", position: 6)
// Check for errors
if (tokens.HasErrors())
{
    foreach (var error in tokens.GetErrors())
    {
        Console.WriteLine($"Error at {error.Position}: {error.ErrorMessage}");
    }
}
Utility Extensions
Extensions on ImmutableArray<Token> for common operations:
// Check if any errors exist (including nested)
bool hasErrors = tokens.HasErrors();
// Get all errors (including nested)
IEnumerable<ErrorToken> errors = tokens.GetErrors();
// Get all tokens of a specific type (including nested)
IEnumerable<TextToken> textTokens = tokens.OfTokenType<TextToken>();
IEnumerable<BlockToken> blocks = tokens.OfTokenType<BlockToken>();
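The "including nested" behavior of these extensions can be pictured as a depth-first walk followed by a type filter. The sketch below illustrates that idea with stand-in types. Item, Leaf, Group, and Descendants are hypothetical names, not part of TinyTokenizer, and the library's actual implementation may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Usage: collect every Leaf, however deeply it is nested.
var tree = new Item[]
{
    new Leaf("a"),
    new Group(new Item[] { new Leaf("b"), new Group(new Item[] { new Leaf("c") }) })
};
var leaves = tree.Descendants().OfType<Leaf>().Select(leaf => leaf.Text);
Console.WriteLine(string.Join(",", leaves)); // prints "a,b,c"

// Hypothetical stand-in types for illustration; not TinyTokenizer's own classes.
public abstract record Item;
public sealed record Leaf(string Text) : Item;
public sealed record Group(IReadOnlyList<Item> Children) : Item;

public static class TreeSketch
{
    // Depth-first, pre-order walk: yields each item, then everything nested inside it.
    public static IEnumerable<Item> Descendants(this IEnumerable<Item> items)
    {
        foreach (var item in items)
        {
            yield return item;
            if (item is Group group)
            {
                foreach (var child in group.Children.Descendants())
                    yield return child;
            }
        }
    }
}
```

Because the walk is lazy, a check such as "does any error exist" can stop at the first match instead of visiting the whole tree.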
API Reference
Tokenizer (ref struct)
// Constructor
public Tokenizer(ReadOnlyMemory<char> source, TokenizerOptions? options = null)
// Tokenize the source
public ImmutableArray<Token> Tokenize()
Token (abstract record)
public abstract record Token(ReadOnlyMemory<char> Content, TokenType Type)
{
    public ReadOnlySpan<char> ContentSpan { get; }
}
TokenType (enum)
public enum TokenType
{
    BraceBlock,       // { }
    BracketBlock,     // [ ]
    ParenthesisBlock, // ( )
    Symbol,           // configurable characters
    Text,             // plain text
    Whitespace,       // spaces, tabs, newlines
    Error             // parsing errors
}
Requirements
- .NET 8.0 or later
License
MIT
| Product | Versions Compatible and additional computed target framework versions. |
|---|---|
| .NET | net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
- net8.0: No dependencies.
NuGet packages
This package is not used by any NuGet packages.
GitHub repositories
This package is not used by any popular GitHub repositories.
| Version | Downloads | Last Updated | |
|---|---|---|---|
| 0.10.0 | 149 | 1/7/2026 | |
| 0.9.0 | 123 | 1/5/2026 | |
| 0.8.0 | 131 | 1/4/2026 | |
| 0.7.0 | 127 | 1/3/2026 | |
| 0.6.8 | 122 | 1/2/2026 | |
| 0.6.7 | 123 | 1/2/2026 | |
| 0.6.6 | 121 | 1/2/2026 | |
| 0.6.5 | 124 | 1/1/2026 | |
| 0.6.4 | 124 | 1/1/2026 | |
| 0.6.3 | 120 | 1/1/2026 | |
| 0.6.2 | 126 | 1/1/2026 | |
| 0.6.1 | 119 | 12/31/2025 | |
| 0.6.0 | 125 | 12/31/2025 | |
| 0.5.1 | 122 | 12/31/2025 | |
| 0.5.0 | 126 | 12/30/2025 | |
| 0.4.1 | 116 | 12/29/2025 | |
| 0.4.0 | 113 | 12/29/2025 | |
| 0.3.0 | 123 | 12/27/2025 | |
| 0.2.0 | 189 | 12/26/2025 | |
| 0.1.0 | 200 | 12/25/2025 | |