XiaomiTTS.Core 1.0.0

.NET CLI:

dotnet add package XiaomiTTS.Core --version 1.0.0

Package Manager Console (Visual Studio; uses the NuGet module's version of Install-Package):

NuGet\Install-Package XiaomiTTS.Core -Version 1.0.0

PackageReference (copy this XML node into the project file):

<PackageReference Include="XiaomiTTS.Core" Version="1.0.0" />

Central Package Management (CPM): copy the PackageVersion node into the solution's Directory.Packages.props file to version the package, then reference it without a version in the project file.

Directory.Packages.props:

<PackageVersion Include="XiaomiTTS.Core" Version="1.0.0" />

Project file:

<PackageReference Include="XiaomiTTS.Core" />

Paket CLI:

paket add XiaomiTTS.Core --version 1.0.0

F# Interactive and Polyglot Notebooks (copy the #r directive into the interactive tool or the script source):

#r "nuget: XiaomiTTS.Core, 1.0.0"

C# file-based apps, starting in .NET 10 preview 4 (place the #:package directive in a .cs file before any lines of code):

#:package XiaomiTTS.Core@1.0.0

Cake Addin:

#addin nuget:?package=XiaomiTTS.Core&version=1.0.0

Cake Tool:

#tool nuget:?package=XiaomiTTS.Core&version=1.0.0

XiaomiTTS

A .NET 9 command-line tool for Xiaomi MiMo TTS (Text-to-Speech) synthesis. Converts text into WAV audio files by calling the MiMo platform REST API via HTTP/HTTPS.

Chinese documentation: README-ZhCn.md

Features

  • High-quality speech synthesis powered by the mimo-v2-tts model
  • Supports both non-streaming and streaming (SSE) modes
  • Real-time audio chunk reception in streaming mode, ideal for long text synthesis
  • Generates standard WAV audio files (24 kHz / 16-bit / Mono)
  • Native AOT compilation support for zero-dependency single-file executables (~5.6 MB)
  • Real-time progress display, updated every second
  • Structured logging system powered by Serilog with multi-sink output
  • Comprehensive exception handling, error logging, and graceful error reporting
  • Dependency Injection architecture following .NET Generic Host patterns
  • Environment-based configuration with hot-reload support

Tech Stack

Backend Language & Runtime

| Technology | Version | Role |
|---|---|---|
| C# | 12.0 | Primary language for TTS client, CLI parsing, and audio processing |
| .NET SDK | 9.0 | Build framework, tooling, and AOT compilation support |
| .NET Runtime | 9.0 | Application runtime environment with built-in GC and JIT |
| .NET Generic Host | 9.0.0 | Application lifecycle management, DI container, and configuration |

Frameworks & Libraries

| Technology | Version | Role |
|---|---|---|
| Microsoft.Extensions.Hosting | 9.0.0 | Generic Host builder, DI container, and app lifecycle orchestration |
| Microsoft.Extensions.Http | 9.0.0 | HttpClient factory and HTTP client management |
| Microsoft.Extensions.Configuration | 9.0.0 | JSON-based configuration with environment variable overrides |
| Microsoft.Extensions.Options | 9.0.0 | Strongly-typed options pattern for TTS configuration |
| Microsoft.Extensions.Logging | 9.0.0 | Logging abstraction layer for structured logging |
| System.Text.Json | Built-in | JSON serialization/deserialization with AOT-compatible source generators |
| System.Net.Http | Built-in | HttpClient for HTTP/HTTPS requests to the MiMo TTS API |
| System.IO | Built-in | Async file I/O operations, WAV file generation and validation |

Logging & Observability

| Technology | Version | Role |
|---|---|---|
| Serilog | 4.2.0 | Structured logging core engine with rich property support |
| Serilog.Extensions.Hosting | 9.0.0 | Seamless integration with .NET Generic Host logging pipeline |
| Serilog.Settings.Configuration | 9.0.0 | Drives Serilog configuration from appsettings.json files |
| Serilog.Sinks.Console | 6.0.0 | Console output sink with customizable formatting templates |
| Serilog.Sinks.Debug | 3.0.0 | Debug output sink for IDE debugger output window |
| Serilog.Sinks.File | 6.0.0 | Rolling file sink with daily rotation, size limits, and retention policies |
| Serilog.Sinks.Seq | 8.0.0 | Seq distributed log aggregation platform sink |
| Serilog.Enrichers.Thread | 4.0.0 | Enriches log events with thread ID and thread name |
| Serilog.Enrichers.Environment | 3.0.1 | Enriches log events with machine name and environment user name |
| Serilog.Expressions | 5.0.0 | Expression engine for structured log filtering and formatting |

Third-Party APIs & Services

| Technology | Version | Role |
|---|---|---|
| MiMo TTS API | mimo-v2-tts | Xiaomi speech synthesis REST API, core TTS engine |
| SSE (Server-Sent Events) | - | Streaming protocol for real-time audio chunk delivery |

Testing & Quality

| Technology | Version | Role |
|---|---|---|
| xunit | 2.9.2 | Unit testing framework with parameterized test support |
| Microsoft.NET.Test.Sdk | 17.12.0 | .NET test runner infrastructure and test discovery |
| xunit.runner.visualstudio | 3.0.0 | Visual Studio / VS Code test adapter for xunit |
| Moq | 4.20.72 | Mocking framework for dependency isolation in unit tests |
| coverlet.collector | 6.0.2 | Code coverage data collector (Cobertura format) |
| coverlet.msbuild | 6.0.2 | MSBuild-integrated code coverage instrumentation |

Build & Toolchain

| Technology | Version | Role |
|---|---|---|
| MSBuild / dotnet CLI | 9.0 | Project build, restore, publish, and dependency management |
| Native AOT | - | Compiles .NET apps to native machine code for zero-dependency deployment |
| PowerShell | 5.1+ | Test runner scripts and CI/CD automation |
| Visual Studio 2022 | 17.14+ | IDE support; Windows AOT requires C++ desktop workload |
| VS Code + C# Dev Kit | - | Cross-platform IDE support with IntelliSense and debugging |

Project Structure

XiaomiTTS/
├── XiaomiTTS.sln                          # Visual Studio solution file
├── README.md                              # English documentation
├── README-ZhCn.md                         # Chinese documentation
├── LICENSE                                # MIT License
│
├── XiaomiTTS.Core/                        # Core business logic class library
│   ├── XiaomiTTS.Core.csproj
│   ├── Abstractions/
│   │   ├── IFileSystem.cs                 # File system abstraction interface
│   │   └── FileSystem.cs                  # File system implementation
│   ├── Audio/
│   │   └── WavProcessor.cs                # WAV header validation & PCM-to-WAV conversion
│   ├── Configuration/
│   │   └── TtsOptions.cs                  # Strongly-typed TTS configuration model
│   ├── DependencyInjection/
│   │   └── ServiceCollectionExtensions.cs # DI service registration extensions
│   ├── Interfaces/
│   │   ├── ITtsClient.cs                  # TTS client interface (Dispose pattern)
│   │   ├── ISynthesisStrategy.cs          # Synthesis strategy interface (Strategy pattern)
│   │   └── IWavProcessor.cs              # WAV processor interface
│   ├── Models/
│   │   ├── ProgressInfo.cs                # Progress reporting data model
│   │   └── Tts/
│   │       ├── TtsRequest.cs              # TTS API request model
│   │       ├── TtsMessage.cs              # Chat message model
│   │       ├── TtsAudioOptions.cs         # Audio format configuration model
│   │       └── TtsJsonContext.cs           # AOT-compatible JSON source generator
│   └── Services/
│       ├── TtsClient.cs                   # TTS client facade (strategy delegation)
│       ├── NonStreamingSynthesisStrategy.cs  # Non-streaming synthesis implementation
│       └── StreamingSynthesisStrategy.cs     # Streaming (SSE) synthesis implementation
│
├── XiaomiTTS.Cli/                         # Command-line application
│   ├── XiaomiTTS.Cli.csproj
│   ├── Program.cs                         # Entry point (async Main, DI setup, orchestration)
│   ├── Cli/
│   │   ├── CommandLineParser.cs           # CLI argument parser with validation
│   │   └── ProgressDisplay.cs             # Real-time console progress display
│   ├── Logging/
│   │   ├── SerilogConfiguration.cs        # Serilog logger factory with multi-environment support
│   │   ├── TraceIdEnricher.cs             # Custom enricher for distributed trace ID
│   │   └── ConfigurationChangeWatcher.cs  # Hot-reload configuration change monitor
│   ├── Models/
│   │   └── CommandLineArgs.cs             # Parsed command-line arguments model
│   ├── appsettings.json                   # Default configuration (TTS + Serilog)
│   ├── appsettings.Development.json       # Development environment overrides
│   └── appsettings.Production.json        # Production environment configuration
│
├── XiaomiTTS.Tests/                       # Unit test project
│   ├── XiaomiTTS.Tests.csproj
│   ├── CommandLineParserTests.cs          # CLI argument parsing tests
│   ├── TtsClientTests.cs                  # TTS client initialization tests
│   ├── NonStreamingSynthesisStrategyTests.cs  # Non-streaming synthesis tests
│   ├── StreamingSynthesisStrategyTests.cs     # Streaming synthesis tests
│   ├── WavProcessorTests.cs              # WAV header validation & PCM conversion tests
│   ├── ModelsTests.cs                     # Data model tests
│   └── LoggingTests.cs                    # Logging configuration tests
│
├── sample_input.txt                       # Chinese sample input text
├── sample_input_en.txt                    # English sample input text
├── sample_long.txt                        # Long text sample
├── sample_song.txt                        # Lyrics sample
├── sample_style.txt                       # Style sample
├── run-tests.ps1                          # PowerShell test runner with coverage support
├── spec/
│   └── text2voice.md                      # Project requirement specification
├── skill/
│   └── xiaomi-local-tts/
│       └── SKILL.md                       # AI-assisted skill configuration
└── .trae/
    ├── rules/
    │   └── unittesting.md                 # Unit testing guidelines
    └── skills/
        └── xiaomi-tts/
            └── skill.md                   # AI skill definition

Environment Requirements

Required Dependencies

| Dependency | Minimum Version | Notes |
|---|---|---|
| .NET SDK | 9.0 | Required for building and running the project. Download: https://dotnet.microsoft.com/download/dotnet/9.0 |
| MiMo API Key | - | Set via the MIMO_API_KEY environment variable; required for TTS synthesis |

Optional Dependencies (Native AOT Publishing)

| Dependency | Platform | Notes |
|---|---|---|
| Visual Studio 2022 | Windows | Must include the "Desktop development with C++" workload |
| clang | Linux | Required for Native AOT compilation on Linux |
| zlib1g-dev | Linux | Required for Native AOT compression support on Linux |

Getting Started

1. Clone the Repository

git clone https://github.com/your-username/XiaomiTTS.git
cd XiaomiTTS

2. Set Environment Variables

Windows (PowerShell):

$env:MIMO_API_KEY = "your-api-key-here"

Windows (CMD):

set MIMO_API_KEY=your-api-key-here

Linux / macOS:

export MIMO_API_KEY="your-api-key-here"

To make the variable persistent, add it to your shell profile (e.g., ~/.bashrc, ~/.zshrc) or use User Secrets for development.

3. Build the Project

dotnet build

4. Run the Program

# Basic usage (non-streaming mode)
dotnet run --project XiaomiTTS.Cli -- input.txt

# Specify output filename
dotnet run --project XiaomiTTS.Cli -- input.txt output.wav

# Streaming mode (recommended for long text)
dotnet run --project XiaomiTTS.Cli -- input.txt --stream

# Streaming mode with output filename
dotnet run --project XiaomiTTS.Cli -- input.txt output.wav --stream

# Use sample text
dotnet run --project XiaomiTTS.Cli -- sample_input.txt

5. Run Tests

# Run all tests
dotnet test XiaomiTTS.Tests/XiaomiTTS.Tests.csproj

# Run tests with code coverage
.\run-tests.ps1 -Coverage

# Run tests with coverage and open the report
.\run-tests.ps1 -Coverage -OpenReport

# Run tests with filter
.\run-tests.ps1 -Filter "FullyQualifiedName~CommandLineParser"

6. Native AOT Publish (Optional)

# Windows x64
dotnet publish XiaomiTTS.Cli/XiaomiTTS.Cli.csproj -c Release -r win-x64 --self-contained true

# Linux x64
dotnet publish XiaomiTTS.Cli/XiaomiTTS.Cli.csproj -c Release -r linux-x64 --self-contained true

# macOS x64
dotnet publish XiaomiTTS.Cli/XiaomiTTS.Cli.csproj -c Release -r osx-x64 --self-contained true

Published executables are located in XiaomiTTS.Cli/bin/Release/net9.0/<RID>/publish/ (~5.6 MB), with no .NET Runtime dependency required on the target machine.

CLI Arguments

Positional Arguments

| Position | Required | Description | Default |
|---|---|---|---|
| 1 | Yes | Input text file path (UTF-8 encoded) | - |
| 2 | No | Output audio file path | Same name as input with .wav extension |

Named Arguments

| Argument | Description |
|---|---|
| --stream | Enable streaming mode for real-time audio chunk reception via SSE |
| --help, -h | Show help information |
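
The argument rules above can be sketched in a few lines. This is a language-neutral illustration in Python, not the package's C# CommandLineParser; the names CommandLineArgs fields and parse_args are hypothetical, and rejecting unknown flags is an assumption about validation behaviour.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class CommandLineArgs:
    input_path: Optional[Path]
    output_path: Optional[Path]
    stream: bool = False
    show_help: bool = False


def parse_args(argv: list[str]) -> CommandLineArgs:
    """Parse <input> [output] plus the --stream and --help/-h flags."""
    flags = {a for a in argv if a.startswith("-")}
    positional = [a for a in argv if not a.startswith("-")]

    # Assumption: anything other than the documented flags is an error.
    unknown = flags - {"--stream", "--help", "-h"}
    if unknown:
        raise ValueError(f"unknown argument(s): {sorted(unknown)}")

    if flags & {"--help", "-h"}:
        return CommandLineArgs(None, None, show_help=True)

    if not 1 <= len(positional) <= 2:
        raise ValueError("usage: <input.txt> [output.wav] [--stream]")

    input_path = Path(positional[0])
    # Default output: same name as the input, with a .wav extension.
    output_path = (
        Path(positional[1]) if len(positional) == 2
        else input_path.with_suffix(".wav")
    )
    return CommandLineArgs(input_path, output_path, stream="--stream" in flags)
```

For example, parse_args(["input.txt", "--stream"]) yields an output path of input.wav with streaming enabled.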

API Reference

Non-Streaming Mode

  • Endpoint: https://api.xiaomimimo.com/v1/chat/completions
  • Model: mimo-v2-tts
  • Auth: Authorization: Bearer <API_KEY>
  • Request Body:
{
  "model": "mimo-v2-tts",
  "messages": [
    {"role": "user", "content": "Convert the following text to speech"},
    {"role": "assistant", "content": "Text content to synthesize"}
  ],
  "voice": "mimo_default",
  "format": "wav"
}
  • Response: JSON containing choices[0].message.audio.data (Base64-encoded audio)
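
As a rough sketch of the non-streaming exchange, the Python below builds the request shown above and decodes the Base64 audio from a response. It is illustrative only and is not the package's C# client: build_request and extract_audio are hypothetical names, the Content-Type header is an assumption (the usual choice for a JSON POST), and no network call is made.

```python
import base64
import json

ENDPOINT = "https://api.xiaomimimo.com/v1/chat/completions"


def build_request(text: str, api_key: str) -> tuple[dict, bytes]:
    """Return (headers, body) for a non-streaming synthesis request."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",  # assumed, not stated above
    }
    body = {
        "model": "mimo-v2-tts",
        "messages": [
            {"role": "user", "content": "Convert the following text to speech"},
            {"role": "assistant", "content": text},
        ],
        "voice": "mimo_default",
        "format": "wav",
    }
    return headers, json.dumps(body, ensure_ascii=False).encode("utf-8")


def extract_audio(response_json: str) -> bytes:
    """Decode choices[0].message.audio.data from the response JSON."""
    payload = json.loads(response_json)
    data = payload["choices"][0]["message"]["audio"]["data"]
    return base64.b64decode(data)
```

The headers and body can be sent to ENDPOINT with any HTTP client; the bytes returned by extract_audio are the audio file content to write to disk.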

Streaming Mode

  • Endpoint: Same as above
  • Request Body:
{
  "model": "mimo-v2-tts",
  "messages": [
    {"role": "user", "content": "Convert the following text to speech"},
    {"role": "assistant", "content": "Text content to synthesize"}
  ],
  "audio": {
    "format": "pcm16",
    "voice": "mimo_default"
  },
  "stream": true
}
  • Response: SSE stream; each event contains choices[0].delta.audio.data (Base64-encoded PCM16 data)
  • Audio Parameters: 24 kHz, 16-bit, Mono PCM
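
A minimal sketch of consuming the stream follows, in Python for illustration rather than as the package's C# StreamingSynthesisStrategy. It assumes each event arrives as a single "data: <json>" line and that the stream ends with a "data: [DONE]" sentinel, as in other chat-completions style APIs; neither detail is confirmed above. The name iter_pcm_chunks is hypothetical.

```python
import base64
import json
from typing import Iterable, Iterator


def iter_pcm_chunks(lines: Iterable[str]) -> Iterator[bytes]:
    """Yield decoded PCM16 chunks from the text lines of an SSE response."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        event = json.loads(payload)
        choices = event.get("choices") or []
        if not choices:
            continue
        # Not every event necessarily carries audio; skip those that do not.
        audio = (choices[0].get("delta") or {}).get("audio") or {}
        data = audio.get("data")
        if data:
            yield base64.b64decode(data)
```

Joining the yielded chunks in order gives the raw 24 kHz, 16-bit, mono PCM stream, which still needs a WAV header before most players will open it.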

Exit Codes

| Code | Description |
|---|---|
| 0 | Success |
| 1 | Error (missing env var, file not found, API failure, validation failure, etc.) |

Development Guidelines

  • The project uses C# 12 syntax with nullable reference types and implicit usings enabled
  • JSON serialization uses System.Text.Json source generators (TtsJsonContext) for AOT compatibility
  • All asynchronous operations follow the async/await pattern with CancellationToken support for timeouts and cancellation
  • Architecture follows the Strategy Pattern for synthesis modes and Dependency Injection via .NET Generic Host
  • Configuration is managed through IOptions<T> pattern with environment-based overrides and hot-reload support
  • File system operations are abstracted behind IFileSystem for testability
  • All public types and methods include XML documentation comments
  • Unit tests cover CLI parsing, client initialization, synthesis strategy logic, WAV processing, and logging configuration

Common Issues & Troubleshooting

Q: "未找到环境变量 MIMO_API_KEY" / "MIMO_API_KEY not found"

Ensure the environment variable is set before running the program. Verify with:

# Linux / macOS
echo $MIMO_API_KEY

# Windows PowerShell
echo $env:MIMO_API_KEY

Q: Build error "SDK 'Microsoft.NET.Sdk' not found"

Install .NET 9 SDK: https://dotnet.microsoft.com/download/dotnet/9.0

Q: Native AOT publish fails (Windows)

Ensure Visual Studio 2022 is installed with the "Desktop development with C++" workload selected. The C++ toolchain is required for native code compilation.

Q: Audio generated in streaming mode cannot be played

Streaming mode outputs PCM16 format, which the program automatically wraps into a standard WAV file. If the file cannot be played:

  1. Check if the audio data from the API was received completely (look for timeout errors in logs)
  2. Verify the output file size is larger than the WAV header size (44 bytes)
  3. Try a shorter text input to rule out timeout issues
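
To make the header check concrete, here is a small Python sketch that wraps raw PCM16 in a WAV container and then inspects the result. It is not the package's WavProcessor; pcm16_to_wav and describe_wav are hypothetical names, and describe_wav assumes the canonical 44-byte header layout with the fmt chunk immediately after the RIFF header.

```python
import io
import struct
import wave

SAMPLE_RATE = 24_000  # Hz
SAMPLE_WIDTH = 2      # bytes per sample (16-bit)
CHANNELS = 1          # mono
HEADER_SIZE = 44      # canonical PCM WAV header


def pcm16_to_wav(pcm: bytes) -> bytes:
    """Wrap raw little-endian PCM16 samples in a WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(CHANNELS)
        w.setsampwidth(SAMPLE_WIDTH)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(pcm)
    return buf.getvalue()


def describe_wav(data: bytes) -> dict:
    """Validate the header and report format and duration."""
    if len(data) <= HEADER_SIZE:
        raise ValueError("file is not larger than the 44-byte WAV header")
    if data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("missing RIFF/WAVE signature")
    # Offset 22: channels, sample rate, byte rate, block align, bits/sample.
    channels, rate, _byte_rate, _align, bits = struct.unpack_from("<HIIHH", data, 22)
    bytes_per_second = rate * channels * bits // 8
    return {
        "channels": channels,
        "sample_rate": rate,
        "bits": bits,
        "seconds": (len(data) - HEADER_SIZE) / bytes_per_second,
    }
```

At 24 kHz, 16-bit, mono, one second of audio is 48,000 bytes of PCM, so a file only a few bytes larger than the header usually means the stream was cut off early.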

Q: Program exits with timeout (exit code 1)

The default timeout is 200 seconds. For very long texts, use streaming mode (--stream) to avoid timeouts. Streaming mode receives audio incrementally and is more resilient for large inputs.

Q: How to enable Seq log aggregation?

  1. Start a Seq instance (default: http://localhost:5341)
  2. Set "Seq": { "Enabled": true } in appsettings.json, or set the equivalent environment variables (PowerShell syntax shown):
    $env:Seq__Enabled = "true"
    $env:Seq__ServerUrl = "http://localhost:5341"
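
The double underscore in these variable names is the .NET configuration convention for a nested key, so Seq__Enabled addresses the same setting as "Seq": { "Enabled": ... } in JSON. The Python sketch below only illustrates that mapping; env_to_config is a hypothetical helper, not part of the package or of .NET, and it ignores variables without a double underscore for brevity.

```python
def env_to_config(environ: dict[str, str]) -> dict:
    """Expand NAME__Child style variables into a nested dictionary."""
    config: dict = {}
    for name, value in environ.items():
        if "__" not in name:
            continue  # flat variables are out of scope for this sketch
        *parents, leaf = name.split("__")
        node = config
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return config
```

Calling it with the two variables above produces a single "Seq" section containing "Enabled" and "ServerUrl", matching the JSON form.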
    

Q: How to change log levels at runtime?

Modify appsettings.json while the application is running. The ConfigurationChangeWatcher detects the change and hot-reloads the Serilog minimum level without restarting the process.

License

This project is licensed under the MIT License.

| Product | Compatible and additional computed target framework versions |
|---|---|
| .NET | net9.0 is compatible. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |

| Version | Downloads | Last Updated |
|---|---|---|
| 1.0.0 | 40 | 5/29/2026 |