ElBruno.QwenTTS 1.4.7


Qwen3-TTS ONNX Pipeline + C# .NET


Run Qwen3-TTS text-to-speech locally from C# using ONNX Runtime — no Python needed at inference time. Models are downloaded automatically on first run.

Pre-exported ONNX models are hosted on HuggingFace: elbruno/Qwen3-TTS-12Hz-0.6B-CustomVoice-ONNX (0.6B preset voices) | elbruno/Qwen3-TTS-12Hz-1.7B-CustomVoice-ONNX (1.7B preset voices + instruct) | elbruno/Qwen3-TTS-12Hz-0.6B-Base-ONNX (voice cloning)

Features

  • Local TTS Inference — Run Qwen3-TTS entirely on your machine using ONNX Runtime
  • Multi-Model Support — Choose between 0.6B (lightweight) and 1.7B (advanced instruct control) variants
  • Automatic Model Download — Models download from HuggingFace on first run (~5.5 GB for 0.6B, ~10 GB for 1.7B)
  • Instruct Control — Natural-language style control with 1.7B model (e.g., "speak with excitement", "whisper softly")
  • Multi-Speaker — 9 built-in voices: ryan, serena, vivian, aiden, eric, dylan, uncle_fu, ono_anna, sohee
  • Voice Cloning — Clone any voice from a 3-second audio sample (docs)
  • Web UI — Blazor app with TTS generation and voice cloning pages (docs)
  • GPU Acceleration — Optional CUDA or DirectML support via SessionOptions injection (docs)
  • Multi-Language — English, Spanish, Chinese, Japanese, Korean
  • Shared Model Cache — Models stored once in %LOCALAPPDATA%/ElBruno/QwenTTS, shared across all apps
  • 24 kHz WAV Output — High-quality mono audio
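
The speaker and language lists above can be checked before any synthesis call is made. The following is a minimal sketch of such a check, not part of ElBruno.QwenTTS: the helper name IsSupported is hypothetical, the lowercase language identifiers other than english and spanish are assumed from the feature list, and case-insensitive matching is an assumption of this sketch rather than documented library behavior.

```csharp
using System;
using System.Linq;

// Names copied from the feature list; lowercase language ids are an assumption
// for chinese, japanese, and korean (only english and spanish appear in the commands).
string[] speakers = { "ryan", "serena", "vivian", "aiden", "eric", "dylan", "uncle_fu", "ono_anna", "sohee" };
string[] languages = { "english", "spanish", "chinese", "japanese", "korean" };

// Hypothetical helper: returns true when both values are in the lists above,
// otherwise false with a human-readable reason in 'error'.
bool IsSupported(string speaker, string language, out string error)
{
    error = "";
    if (!speakers.Contains(speaker, StringComparer.OrdinalIgnoreCase))
        error = $"Unknown speaker '{speaker}'. Expected one of: {string.Join(", ", speakers)}";
    else if (!languages.Contains(language, StringComparer.OrdinalIgnoreCase))
        error = $"Unknown language '{language}'. Expected one of: {string.Join(", ", languages)}";
    return error.Length == 0;
}

// Example: report a problem instead of starting a multi-gigabyte model load.
if (!IsSupported("bob", "english", out var message))
    Console.WriteLine(message);
```

Failing fast on a typo in the speaker name avoids waiting for model loading only to hit an error afterwards.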

Quick Start

Install via NuGet

dotnet add package ElBruno.QwenTTS

Generate speech in C#

using ElBruno.QwenTTS.Pipeline;

// 0.6B model (default) — models download automatically (~5.5 GB)
using var pipeline = await TtsPipeline.CreateAsync("models");
await pipeline.SynthesizeAsync("Hello world!", "ryan", "hello.wav", "english");

// 1.7B model — supports instruct control (~10 GB)
using var pipeline17 = await TtsPipeline.CreateAsync("models", variant: QwenModelVariant.Qwen17B);
await pipeline17.SynthesizeAsync("Hello world!", "ryan", "hello.wav", "english",
    instruct: "speak with warmth and excitement");

CLI

# Default (0.6B model)
dotnet run --project src/ElBruno.QwenTTS -- --model-dir models --text "Hello, this is a test." --speaker ryan --language english --output hello.wav

# 1.7B model with instruct control
dotnet run --project src/ElBruno.QwenTTS -- --model-dir models --variant 1.7b --text "Hello, this is a test." --speaker ryan --instruct "speak with excitement" --output hello.wav

Models are downloaded automatically if not present in the --model-dir directory.
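
Because the first run may download several gigabytes, it can be useful to see whether the model directory is already populated. The sketch below sums the file sizes under a directory using only the .NET base class library. DirectorySizeBytes is an illustrative helper, not part of the library or CLI, and the sketch does not know which files the downloader expects, so it reports size only.

```csharp
using System;
using System.IO;
using System.Linq;

// Sums the size of every file under 'path'; returns 0 when the directory does not exist.
long DirectorySizeBytes(string path) =>
    Directory.Exists(path)
        ? new DirectoryInfo(path).EnumerateFiles("*", SearchOption.AllDirectories).Sum(f => f.Length)
        : 0L;

string modelDir = "models"; // same value passed to --model-dir in the commands above
double gb = DirectorySizeBytes(modelDir) / 1e9;

Console.WriteLine(gb > 0
    ? $"{modelDir}: {gb:F2} GB already on disk"
    : $"{modelDir}: empty or missing, so the first run will download the models");
```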

Voice Cloning

Clone any voice from a 3-second audio sample using the ElBruno.QwenTTS.VoiceCloning package:

dotnet add package ElBruno.QwenTTS.VoiceCloning

using ElBruno.QwenTTS.VoiceCloning.Pipeline;

var cloner = await VoiceClonePipeline.CreateAsync();
await cloner.SynthesizeAsync("Hello world!", "reference_speaker.wav", "output.wav", "english");

See docs/voice-cloning.md for full documentation.
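
Before cloning, it may help to confirm that the reference clip is long enough. The sketch below reads the RIFF/WAVE header with plain .NET APIs and reports the sample rate, channel count, and duration. ReadWavSeconds is a hypothetical helper and not part of ElBruno.QwenTTS.VoiceCloning; treating 3 seconds as a minimum is an assumption based on the "3-second audio sample" wording above, and the sketch assumes an uncompressed PCM file with a correct data chunk size.

```csharp
using System;
using System.IO;
using System.Text;

// Walks the RIFF chunks and returns the duration in seconds (data bytes / byte rate).
double ReadWavSeconds(Stream stream, out int sampleRate, out int channels)
{
    sampleRate = 0;
    channels = 0;
    int byteRate = 0;
    using var reader = new BinaryReader(stream, Encoding.ASCII, leaveOpen: true);

    if (Encoding.ASCII.GetString(reader.ReadBytes(4)) != "RIFF") throw new InvalidDataException("Not a RIFF file");
    reader.ReadInt32(); // overall RIFF size, unused here
    if (Encoding.ASCII.GetString(reader.ReadBytes(4)) != "WAVE") throw new InvalidDataException("Not a WAVE file");

    while (stream.Position + 8 <= stream.Length)
    {
        string id = Encoding.ASCII.GetString(reader.ReadBytes(4));
        uint size = reader.ReadUInt32();
        if (id == "fmt ")
        {
            reader.ReadInt16();                // audio format tag
            channels = reader.ReadInt16();
            sampleRate = reader.ReadInt32();
            byteRate = reader.ReadInt32();
            stream.Seek(size - 12, SeekOrigin.Current); // skip the rest of the fmt chunk
        }
        else if (id == "data")
        {
            if (byteRate == 0) throw new InvalidDataException("data chunk found before fmt chunk");
            return (double)size / byteRate;
        }
        else
        {
            stream.Seek(size + (size & 1), SeekOrigin.Current); // chunks are word-aligned
        }
    }
    throw new InvalidDataException("No data chunk found");
}

string referencePath = "reference_speaker.wav"; // file name taken from the snippet above
if (File.Exists(referencePath))
{
    using var file = File.OpenRead(referencePath);
    double seconds = ReadWavSeconds(file, out int rate, out int channelCount);
    Console.WriteLine($"{rate} Hz, {channelCount} channel(s), {seconds:F1} s");
    if (seconds < 3.0) Console.WriteLine("Reference clip is shorter than 3 seconds.");
}
```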

GPU Acceleration

Pass a sessionOptionsFactory to use CUDA or DirectML instead of CPU:

using ElBruno.QwenTTS.Pipeline;

// Option 1: CUDA (NVIDIA). Requires the Microsoft.ML.OnnxRuntime.Gpu NuGet package.
var cudaTts = await TtsPipeline.CreateAsync(
    sessionOptionsFactory: OrtSessionHelper.CreateCudaOptions);

// Option 2: DirectML (any GPU on Windows). Requires the Microsoft.ML.OnnxRuntime.DirectML NuGet package.
// Uses the GPU for the language model and the CPU for the vocoder (hybrid mode).
var directMlTts = await TtsPipeline.CreateAsync(
    sessionOptionsFactory: OrtSessionHelper.CreateDirectMlOptions,
    vocoderSessionOptionsFactory: OrtSessionHelper.CreateCpuOptions);

See docs/gpu-acceleration.md for full setup instructions.

More Examples

dotnet run --project src/ElBruno.QwenTTS -- --model-dir models --text "Welcome to the future of speech synthesis." --speaker serena --output welcome.wav
dotnet run --project src/ElBruno.QwenTTS -- --model-dir models --text "Speaking with excitement and energy!" --speaker aiden --variant 1.7b --instruct "speak with excitement" --output excited.wav
dotnet run --project src/ElBruno.QwenTTS -- --model-dir models --text "A calm and gentle narration." --speaker ryan --variant 1.7b --instruct "speak slowly and calmly" --output calm.wav

Spanish Examples

dotnet run --project src/ElBruno.QwenTTS -- --model-dir models --text "Hola, esta es una prueba de texto a voz." --speaker ryan --language spanish --output hola.wav
dotnet run --project src/ElBruno.QwenTTS -- --model-dir models --text "Bienvenidos al futuro de la sintesis de voz." --speaker serena --language spanish --output bienvenidos.wav

File Reader (batch audio from text/SRT files)

dotnet run --project src/ElBruno.QwenTTS.FileReader -- --model-dir models --input samples/hello_demo.txt --speaker ryan --language english --output-dir output/hello
dotnet run --project src/ElBruno.QwenTTS.FileReader -- --model-dir models --input samples/demo_subtitles.srt --speaker serena --output-dir output/subtitles
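
To illustrate what batch generation from an SRT file involves, the sketch below splits subtitle text into numbered cues whose text could then be synthesized one at a time. It is not the File Reader's actual implementation: ParseSrt and the cue_001.wav naming pattern are hypothetical, and timing information is simply discarded.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Splits SRT content into (number, text) cues. Each cue is expected to have a
// numeric line, a "start --> end" timing line, and one or more text lines.
List<(int Number, string Text)> ParseSrt(string content)
{
    var cues = new List<(int Number, string Text)>();
    // Cues are separated by blank lines; normalise Windows line endings first.
    var blocks = content.Replace("\r\n", "\n").Split("\n\n", StringSplitOptions.RemoveEmptyEntries);
    foreach (var block in blocks)
    {
        var lines = block.Split('\n', StringSplitOptions.RemoveEmptyEntries);
        if (lines.Length < 3 || !int.TryParse(lines[0].Trim(), out int number) || !lines[1].Contains("-->"))
            continue; // skip malformed cues rather than failing the whole batch
        cues.Add((number, string.Join(" ", lines.Skip(2).Select(l => l.Trim()))));
    }
    return cues;
}

string sample =
    "1\n00:00:00,000 --> 00:00:02,000\nHello there.\n\n" +
    "2\n00:00:02,500 --> 00:00:05,000\nThis cue spans\ntwo lines.\n";

// Each cue text could be passed to a synthesis call with its own output file.
foreach (var (n, text) in ParseSrt(sample))
    Console.WriteLine($"cue_{n:D3}.wav <- {text}");
```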

Web App (browser UI)

dotnet run --project src/ElBruno.QwenTTS.Web

Open http://localhost:5153 — two pages:

  • 🔊 TTS — type text or upload files, pick a voice, and generate speech
  • 🎭 Voice Clone — record your voice or upload a WAV, then synthesize with your cloned voice

Documentation

Document               Description
Prerequisites          System requirements (.NET 8+/10, disk space)
Getting Started        Setup, auto-download, and first run
Core Library           ElBruno.QwenTTS API reference and usage examples
CLI Reference          All command options, speakers, and examples
File Reader            Batch audio generation from text and SRT files
Web App                Blazor web UI for speech generation
Architecture           Pipeline design, model components, project structure
Exporting Models       Re-exporting ONNX models from PyTorch weights
Voice Cloning          Clone any voice from a 3-second reference audio
GPU Acceleration       CUDA, DirectML, and CPU configuration
Troubleshooting        Common issues and fixes
Detailed Architecture  Full tensor shapes, KV-cache, codebook structure
Changelog              Versioned summary of notable changes

Python Tools

The python/ directory contains tools for exporting ONNX models from PyTorch weights and downloading models from HuggingFace. These are only needed if you want to re-export or customize models — they are not required for running the C# pipeline.


Building from Source

git clone https://github.com/elbruno/ElBruno.QwenTTS.git
cd ElBruno.QwenTTS
dotnet build
dotnet test

Requirements

  • .NET 8.0 or .NET 10.0 SDK
  • ONNX Runtime compatible platform (Windows, Linux, macOS)
  • ~5.5 GB disk space for the 0.6B model files (~10 GB for the 1.7B model)

Contributing

Contributions are welcome! Here's how to get started:

  1. Fork the repository
  2. Create a branch for your feature or fix: git checkout -b feature/my-feature
  3. Make your changes and ensure the solution builds: dotnet build
  4. Run tests: dotnet test
  5. Submit a pull request with a clear description of the changes

Please open an issue first for major changes or new features to discuss the approach.



👋 About the Author

Hi! I'm ElBruno 🧡, a passionate developer and content creator exploring AI, .NET, and modern development practices.

Made with ❤️ by ElBruno

If you like this project, consider following my work across platforms:

  • 📻 Podcast: No Tienen Nombre — Spanish-language episodes on AI, development, and tech culture
  • 💻 Blog: ElBruno.com — Deep dives on embeddings, RAG, .NET, and local AI
  • 📺 YouTube: youtube.com/elbruno — Demos, tutorials, and live coding
  • 🔗 LinkedIn: @elbruno — Professional updates and insights
  • 𝕏 Twitter: @elbruno — Quick tips, releases, and tech news

License

This project is licensed under the MIT License — see the LICENSE file for details.

Compatible and additional computed target framework versions (.NET)

  • Compatible: net8.0, net10.0
  • Computed: net9.0, plus the android, browser, ios, maccatalyst, macos, tvos, and windows variants of net8.0, net9.0, and net10.0

NuGet packages (2)

Showing the top 2 NuGet packages that depend on ElBruno.QwenTTS:

Package Downloads
ElBruno.QwenTTS.VoiceCloning

Voice cloning extension for ElBruno.QwenTTS. Clone any voice from a 3-second audio sample using the Qwen3-TTS Base model with ECAPA-TDNN speaker encoder.

ElBruno.QwenTTS.Realtime

Bridge between ElBruno.QwenTTS and ElBruno.Realtime — provides ITextToSpeechClient adapter and DI extensions for QwenTTS integration with the real-time conversation pipeline.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version         Downloads   Last Updated
1.4.7           301         4/17/2026
1.4.6           125         4/16/2026
1.4.5           101         4/16/2026
1.4.4           97          4/16/2026
1.4.3           108         4/15/2026
1.4.2           100         4/14/2026
1.4.1           110         4/13/2026
1.4.0           104         4/12/2026
1.3.0           120         4/8/2026
1.2.3           109         4/6/2026
1.2.3-preview   96          4/6/2026
1.2.2-preview   97          4/5/2026
1.2.1-preview   99          4/5/2026
1.2.0           111         4/3/2026
1.1.1           125         4/2/2026
1.1.0           154         4/2/2026
1.0.2           41          5/20/2026
1.0.1           119         5/2/2026
0.6.1           155         2/28/2026
0.6.0           112         2/28/2026