HartsyInference.Diffusion 1.0.0-alpha.11

This is a prerelease version of HartsyInference.Diffusion.
There is a newer prerelease version of this package available.
See the version list below for details.
.NET CLI
dotnet add package HartsyInference.Diffusion --version 1.0.0-alpha.11

Package Manager Console
NuGet\Install-Package HartsyInference.Diffusion -Version 1.0.0-alpha.11
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.

PackageReference
<PackageReference Include="HartsyInference.Diffusion" Version="1.0.0-alpha.11" />
For projects that support PackageReference, copy this XML node into the project file to reference the package.

Central Package Management (CPM)
For projects that support Central Package Management (CPM), copy this XML node into the solution Directory.Packages.props file to version the package:
<PackageVersion Include="HartsyInference.Diffusion" Version="1.0.0-alpha.11" />
Then reference the package, without a version, in the project file:
<PackageReference Include="HartsyInference.Diffusion" />

Paket
paket add HartsyInference.Diffusion --version 1.0.0-alpha.11

#r directive
#r "nuget: HartsyInference.Diffusion, 1.0.0-alpha.11"
The #r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or the source code of the script to reference the package.

#:package directive
#:package HartsyInference.Diffusion@1.0.0-alpha.11
The #:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package.

Cake
Install as a Cake Addin:
#addin nuget:?package=HartsyInference.Diffusion&version=1.0.0-alpha.11&prerelease
Install as a Cake Tool:
#tool nuget:?package=HartsyInference.Diffusion&version=1.0.0-alpha.11&prerelease

HartsyInference

A pure C#/.NET AI inference engine for image generation, speech, vision, video, and interactive world models. No Python, no native runtime DLLs.

HartsyInference loads .safetensors and .gguf checkpoints directly and runs them on NVIDIA CUDA, cross-vendor Vulkan, or CPU SIMD, entirely in managed .NET. GPU kernels are PTX/SPIR-V shipped with the package and JIT-compiled at runtime; there are no C++ wrappers, no bundled native inference library, and no external Python process to manage. Just NuGet packages.

It is the non-LLM companion to dotLLM: together they form a complete AI stack in pure .NET.


⚠️ Alpha software

This is 1.0.0-alpha, an early, fast-moving preview. Use it to experiment, not in production.

  • APIs will change without notice between alpha releases. Pin an exact version.
  • Model coverage is broad but maturity varies. Many architectures are implemented and load/run end-to-end but are still being validated numerically against their reference implementations. Treat output quality per-model as "verify before you rely on it."
  • No support guarantees, no semver stability until 1.0.0.
  • The OpenAI-compatible server and CLI are not published as packages in this alpha; they live in the source repository. Publishing them is on the roadmap.
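
As a sketch of the pinning advice above: in a PackageReference, a bare version such as 1.0.0-alpha.11 is treated by NuGet as a minimum version, while square brackets request exactly that version. The fragment below assumes a standard SDK-style project file; the version shown is simply the one this page documents.

```xml
<ItemGroup>
  <!-- Square brackets ask NuGet for exactly this version, not "this version or newer". -->
  <PackageReference Include="HartsyInference.Diffusion" Version="[1.0.0-alpha.11]" />
</ItemGroup>
```

With an exact pin, a newer alpha is only picked up when the version string is edited, which matches the "APIs will change without notice" warning above.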

Found a bug or a mismatch against a reference? Please open an issue.


Install

One package pulls in the whole stack (all backends + every modality):

dotnet add package HartsyInference --prerelease

Or reference only the pieces you need (see Packages):

dotnet add package HartsyInference.Audio --prerelease
dotnet add package HartsyInference.Cpu   --prerelease

Requires .NET 8 or .NET 10.


Quick start: speech-to-text

The Whisper pipeline downloads a checkpoint from HuggingFace on first use and runs on whichever backend you pass:

using HartsyInference.Audio.Pipelines;
using HartsyInference.Core.Backends;
using HartsyInference.Cpu;          // or HartsyInference.Cuda / HartsyInference.Vulkan

// Load the Whisper checkpoint (downloaded from HuggingFace on first use).
using WhisperPipeline pipeline = await WhisperPipeline.LoadAsync("openai/whisper-base");
// Pick the backend that will run the model.
using IBackend backend = new CpuBackend();

// Transcribe a WAV file as English, without timestamps.
WhisperOptions options = new() { Language = "en", WithTimestamps = false };
string text = pipeline.TranscribeWav(backend, "audio.wav", options);

Console.WriteLine(text);

Swap new CpuBackend() for new CudaBackend() or new VulkanBackend(); the pipeline is backend-agnostic. Audio pipelines that auto-download (WhisperPipeline, KokoroPipeline) share this LoadAsync convention, while image and video pipelines (StableDiffusion15Pipeline, WanVideoPipeline) are constructed from pre-loaded components. See the samples in the repo for image, video, and TTS walkthroughs.
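
Because the pipeline is backend-agnostic, one way to choose a backend at startup is to try candidates in order of preference. The helper below is a minimal, library-independent sketch: FirstAvailable is a hypothetical name, not part of HartsyInference, and it assumes a backend constructor throws when its device or driver is missing, which this page does not confirm.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of HartsyInference.
public static class FirstAvailable
{
    // Invokes each factory in order and returns the first result that is
    // created without throwing. If every factory throws, the collected
    // exceptions are rethrown together as an AggregateException.
    public static T Create<T>(params Func<T>[] factories) where T : class
    {
        var failures = new List<Exception>();
        foreach (Func<T> factory in factories)
        {
            try
            {
                return factory();
            }
            catch (Exception ex)
            {
                failures.Add(ex);
            }
        }
        throw new AggregateException("No candidate could be created.", failures);
    }
}

// Possible use with the backends named above (assumes each constructor
// throws when its device is unavailable):
//
//   using IBackend backend = FirstAvailable.Create<IBackend>(
//       () => new CudaBackend(),
//       () => new VulkanBackend(),
//       () => new CpuBackend());
```

Listing the CPU backend last gives a fallback that needs no GPU, at the cost of slower inference.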


What it can do

Modality | Highlights
Image generation | SD 1.5, SDXL, SD3, Flux.1 / Flux.2, AuraFlow, Chroma, HiDream, Qwen-Image, Lumina 2, OmniGen2, HunyuanImage, Ideogram, Kandinsky 5, and more, with LoRA, img2img, and tiling
Video generation | LTX-Video, Wan 2.x, Lance, Kandinsky 5 video
Interactive / world models | Matrix-Game 2 & 3, Oasis: action-conditioned, frame-by-frame generation
Speech-to-text | Whisper (tiny → large-v3), Moonshine, with streaming and timestamps
Text-to-speech & voice | Kokoro, F5-TTS, StyleTTS2, Bark, CosyVoice, Spark-TTS, VibeVoice, CSM
Music | ACE-Step, MusicGen, YuE
Vision | CLIP & SigLIP embeddings, YOLO detection, SAM segmentation, face detection
3D generation | Hunyuan3D-2 (flow-match DiT + ShapeVAE) & TripoSR (feed-forward triplane/NeRF) image to mesh, via marching cubes to glTF/OBJ/PLY

Checkpoints load directly from .safetensors / .gguf, including quantized weights (GGUF, MXFP4/8, NVFP4, block-scaled).
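
For orientation, a .safetensors file starts with an 8-byte little-endian header length, followed by that many bytes of JSON describing each tensor's dtype, shape, and data offsets. The sketch below reads only that header using standard .NET APIs. It illustrates the file format and is not HartsyInference's loader; SafetensorsHeader is a hypothetical name.

```csharp
using System;
using System.Buffers.Binary;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Hypothetical helper illustrating the safetensors header layout.
public static class SafetensorsHeader
{
    // Returns tensor name -> (dtype, shape) without touching the weight data.
    public static Dictionary<string, (string DType, long[] Shape)> Read(Stream stream)
    {
        // First 8 bytes: header size as an unsigned little-endian 64-bit integer.
        Span<byte> lengthBytes = stackalloc byte[8];
        stream.ReadExactly(lengthBytes); // Stream.ReadExactly requires .NET 7+
        ulong headerLength = BinaryPrimitives.ReadUInt64LittleEndian(lengthBytes);
        if (headerLength > 100_000_000)
            throw new InvalidDataException("Header length is implausibly large.");

        // Next headerLength bytes: a JSON object keyed by tensor name.
        byte[] json = new byte[(int)headerLength];
        stream.ReadExactly(json);

        var tensors = new Dictionary<string, (string DType, long[] Shape)>();
        using JsonDocument document = JsonDocument.Parse(json);
        foreach (JsonProperty entry in document.RootElement.EnumerateObject())
        {
            // "__metadata__" holds free-form string metadata, not a tensor.
            if (entry.Name == "__metadata__") continue;

            string dtype = entry.Value.GetProperty("dtype").GetString() ?? "";
            JsonElement shapeElement = entry.Value.GetProperty("shape");
            long[] shape = new long[shapeElement.GetArrayLength()];
            int i = 0;
            foreach (JsonElement dim in shapeElement.EnumerateArray())
                shape[i++] = dim.GetInt64();

            tensors[entry.Name] = (dtype, shape);
        }
        return tensors;
    }
}
```

Listing tensor names and shapes this way can be a quick check that a checkpoint matches the architecture you expect before loading it.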

Coverage is wide because the engine shares a common core (tensors, schedulers, VAEs, text encoders, DSP) across architectures. Per-model numerical validation is ongoing; see the alpha note above.


Coming soon

HartsyInference is moving fast, and the roadmap is broad. On deck:

Area | Planned
Image | ControlNet, IP-Adapter, LCM/Turbo distillation across more architectures, regional prompting
Vision | Grounding DINO, YOLO-World, OWLv2, Florence-2, RT-DETR, depth & pose estimation, OCR, tracking
Video | HunyuanVideo, CogVideoX, longer-context temporal generation
3D | Gaussian-splat output, texture synthesis, multi-view to mesh
World models | Broader action spaces, longer memory horizons, multiplayer state
Serving | OpenAI-compatible REST server and CLI, published as NuGet packages
Tooling | Wider quantized inference (MXFP4 / MXFP8 / NVFP4), model hot-swap, per-modality CLI subcommands

Track progress and releases on the GitHub repo.


Design pillars

Pillar | What it means
Pure C# | GPU access via PTX (CUDA Driver API) and SPIR-V (Vulkan), with no native shared inference libraries
Eager execution | Ops run immediately; no computation graph to compile
Zero-allocation hot paths | Tensor storage in NativeMemory.AlignedAlloc; weights memory-mapped; Span<T> throughout
Modular packages | Pull in only the modality and backend you need
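
To make the "zero-allocation hot paths" pillar concrete, here is a minimal sketch of the general technique: storage allocated once with NativeMemory.AlignedAlloc and exposed as a Span<float>, so inner loops never touch the managed heap. AlignedBuffer is a hypothetical type, not HartsyInference's tensor class. The 64-byte default alignment is an assumption chosen to suit AVX-512 loads, and the project must enable AllowUnsafeBlocks.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical type illustrating aligned native storage behind a Span<T>.
public sealed unsafe class AlignedBuffer : IDisposable
{
    private float* _pointer;

    public int Length { get; }

    // alignment must be a power of two; 64 bytes is an assumed default.
    public AlignedBuffer(int length, nuint alignment = 64)
    {
        if (length <= 0) throw new ArgumentOutOfRangeException(nameof(length));
        Length = length;
        _pointer = (float*)NativeMemory.AlignedAlloc(
            (nuint)length * (nuint)sizeof(float), alignment);
    }

    // Address of the first element, useful for checking alignment.
    public nuint Address => (nuint)_pointer;

    // A view over the native memory; creating it allocates nothing on the GC heap.
    public Span<float> Span => _pointer == null
        ? throw new ObjectDisposedException(nameof(AlignedBuffer))
        : new Span<float>(_pointer, Length);

    public void Dispose()
    {
        if (_pointer != null)
        {
            // Memory from AlignedAlloc must be released with AlignedFree.
            NativeMemory.AlignedFree(_pointer);
            _pointer = null;
        }
    }
}
```

The buffer lives outside the garbage collector, so it is never moved or scanned, but it must be disposed explicitly.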

Packages

Package | Description
HartsyInference | Meta-package that references everything below
HartsyInference.Core | Tensor types, IBackend, schedulers, pipeline base types
HartsyInference.ModelHandler | Safetensors/GGUF loaders, quant dequant, HuggingFace download, model registry
HartsyInference.Tokenizers | CLIP, T5, Whisper, and LLM-style tokenizers
HartsyInference.Cpu | CPU backend with AVX2 / AVX-512 / NEON SIMD kernels
HartsyInference.Cuda | CUDA backend with PTX kernels + cuBLAS
HartsyInference.Vulkan | Cross-vendor Vulkan backend (NVIDIA / AMD / Intel) via SPIR-V
HartsyInference.Diffusion | Image + music diffusion pipelines, VAEs, text encoders, LoRA
HartsyInference.Audio | Whisper/Moonshine STT, TTS, voice conversion, music
HartsyInference.Vision | CLIP/SigLIP embeddings, YOLO, SAM, face detection
HartsyInference.Video | LTX-Video, Wan, Lance, Kandinsky 5 video
HartsyInference.Interactive | Action-conditioned world models (Matrix-Game, Oasis)
HartsyInference.ThreeD | 3D asset generation: mesh/splat foundation (marching cubes, glTF/OBJ/PLY) + Hunyuan3D-2 image to mesh

Requirements

  • .NET 8 or .NET 10 SDK

CUDA backend (NVIDIA, fastest)

  • CUDA 12.x runtime
  • NVIDIA GPU, compute capability 8.0+ (RTX 30xx/40xx, A100, H100)

Vulkan backend (NVIDIA / AMD / Intel, cross-vendor)

  • Vulkan 1.3+ runtime (ships with the GPU vendor driver)
  • GPU with FP16 compute (shaderFloat16), most discrete GPUs from 2019+

CPU backend

  • Any x86-64 (AVX2+) or ARM64 (NEON) machine, no GPU required


License

MIT © 2026 kalebbroo

Compatible and additional computed target framework versions (.NET)

  • Compatible: net8.0, net10.0
  • Computed (net8.0): net8.0-android, net8.0-browser, net8.0-ios, net8.0-maccatalyst, net8.0-macos, net8.0-tvos, net8.0-windows
  • Computed (net9.0): net9.0, net9.0-android, net9.0-browser, net9.0-ios, net9.0-maccatalyst, net9.0-macos, net9.0-tvos, net9.0-windows
  • Computed (net10.0): net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, net10.0-windows

NuGet packages (4)

Showing the top 4 NuGet packages that depend on HartsyInference.Diffusion:

Package Downloads
HartsyInference.Video

Video generation for HartsyInference — LTX-Video, Wan, temporal attention, and video VAE. (Phase 3 — stub)

HartsyInference.Vision

Vision inference for HartsyInference — CLIP embeddings, YOLO detection (planned), SAM segmentation (planned), and face detection (planned).

HartsyInference

Meta-package for HartsyInference — a pure C#/.NET AI inference engine for non-LLM modalities (diffusion image generation, speech-to-text, text-to-speech, vision, video, interactive world models). Adds all backends (CPU, CUDA, Vulkan) and modality packages in one reference.

HartsyInference.ThreeD

3D asset generation for HartsyInference — image/text → mesh and Gaussian-splat. Representation-agnostic foundation (marching cubes, glTF/OBJ/PLY export, 3D sampling) plus model pipelines (Hunyuan3D-2 image→mesh).

GitHub repositories

This package is not used by any popular GitHub repositories.

Version Downloads Last Updated
1.0.0-alpha.15 27 6/19/2026
1.0.0-alpha.14 31 6/19/2026
1.0.0-alpha.13 30 6/19/2026
1.0.0-alpha.12 30 6/19/2026
1.0.0-alpha.11 42 6/19/2026
1.0.0-alpha.10 36 6/19/2026
1.0.0-alpha.9 49 6/18/2026
1.0.0-alpha.8 58 6/17/2026
1.0.0-alpha.3 66 6/14/2026