GeminiTtsCli 0.7.2
Gemini TTS CLI
A command-line interface tool for text-to-speech conversion using Google's Gemini TTS API.
Features
- Convert text to speech using Google Gemini TTS API
- Multiple voice options (male and female voices)
- Support for custom instructions
- Output to WAV format or stdout
- Merge multiple WAV files into one with glob pattern support
- Batch processing from text/markdown files
- Concurrency support for batch TTS
- Cross-platform support (Windows, Linux, macOS)
Installation
Via .NET Global Tool (Recommended)
dotnet tool install -g GeminiTtsCli
Manual Installation
Download the appropriate binary for your platform from the releases page.
Prerequisites
- Google AI Studio API Key: You need to obtain an API key from Google AI Studio
- Environment Variable: Set the GEMINI_API_KEY environment variable with your API key
Setting up the API Key
Windows (PowerShell)
$env:GEMINI_API_KEY = "your-api-key-here"
Windows (Command Prompt)
set GEMINI_API_KEY=your-api-key-here
Linux/macOS
export GEMINI_API_KEY="your-api-key-here"
To make it permanent, add the export line to your shell profile (.bashrc, .zshrc, etc.).
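Because the tool reads the key from the environment, a script that wraps it can fail early with a clear message when the variable is missing. The helper below is a minimal sketch; `require_api_key` is a hypothetical name and is not part of GeminiTtsCli.

```shell
# Hypothetical helper (not part of GeminiTtsCli): return non-zero when the
# GEMINI_API_KEY environment variable is unset or empty.
require_api_key() {
    if [ -z "${GEMINI_API_KEY:-}" ]; then
        echo "GEMINI_API_KEY is not set; export it before running gemini-tts" >&2
        return 1
    fi
}
```

A script could then guard each call, for example: require_api_key && gemini-tts -t "Hello world"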
Usage
Single Text to Speech
gemini-tts -t "Text to convert" [-i "Your instructions"] [-s <voice-name>] [-o output.wav]
Batch Processing from File
gemini-tts -f "input.txt" [-i "Your instructions"] [-s <voice-name>] [-m] [-c 20] [-o output.wav]
Merge WAV Files
gemini-tts merge <glob-pattern> [-o <output-file>]
List Available Voices
gemini-tts list-voices
Parameters
- -t, --text (optional): Text to convert to speech (required if -f is not given)
- -f, --file (optional): File path for batch processing (.txt or .md files; required if -t is not given)
- -i, --instructions (optional): Instructions for the TTS conversion (default: "Read aloud in a warm, professional and friendly tone")
- -s, --speaker1 (optional): Voice name for the speaker (default: random selection from available voices)
- -o, --outputfile (optional): Output WAV filename (default: output.wav). Use "-" for stdout output
- -c, --concurrency (optional): Concurrent API requests for batch processing (default: 1)
- -m, --merge (optional): Merge all outputs into a single file for batch processing
Merge Command Parameters
- <glob-pattern> (required): Pattern to match WAV files (e.g., *.wav, trial03-*.wav, **/*.wav)
- -o, --outputfile (optional): Output WAV filename
Available Voices
Female Voices
- achernar, aoede, autonoe, callirrhoe, despina, erinome, gacrux, kore
- laomedeia, leda, sulafat, zephyr, pulcherrima, vindemiatrix
Male Voices
- achird, algenib, algieba, alnilam, charon, enceladus, fenrir, iapetus
- orus, puck, rasalgethi, sadachbia, sadaltager, schedar, umbriel, zubenelgenubi
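When a script accepts a voice name from user input, it can be checked against the names listed above before calling the tool. This is only a sketch: the lists are copied from this README, `is_known_voice` is a hypothetical helper, and matching lowercase names exactly is an assumption. The output of gemini-tts list-voices remains the authoritative source.

```shell
# Voice names copied from the lists above; they may drift from the tool's own list.
FEMALE_VOICES="achernar aoede autonoe callirrhoe despina erinome gacrux kore laomedeia leda sulafat zephyr pulcherrima vindemiatrix"
MALE_VOICES="achird algenib algieba alnilam charon enceladus fenrir iapetus orus puck rasalgethi sadachbia sadaltager schedar umbriel zubenelgenubi"

# Hypothetical helper: succeed only if $1 exactly matches one documented voice name.
is_known_voice() {
    case " $FEMALE_VOICES $MALE_VOICES " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example: is_known_voice "$voice" && gemini-tts -s "$voice" -t "Hello world"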
Examples
With custom instructions and specific voice:
gemini-tts -i "Read aloud in a warm, professional and friendly tone" -s achird -t "大家好,我是 Will 保哥。" -o my-name-is-will.wav
With minimal required parameters (uses defaults):
gemini-tts -t "Hello, this is a test of the Gemini TTS system"
With specific voice but default instructions:
gemini-tts -s zephyr -t "Hello, this is a test of the Gemini TTS system" -o greeting.wav
Batch processing from file, with merge and concurrency:
gemini-tts -f "test.txt" -s zephyr -m -c 5 -o batch-merged.wav
Batch processing from file, without merge (outputs numbered files):
gemini-tts -f "test.txt" -s zephyr -c 3
# Output: test-01.wav, test-02.wav, ...
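If you want to review the numbered files before combining them, the batch step and the merge step can be run separately instead of using -m. The function below is a sketch under a few assumptions: the numbered files land in the current directory and follow the test-01.wav naming shown above, the input file is also in the current directory, and `GEMINI_TTS` is a hypothetical override variable introduced here only so the command can be swapped out.

```shell
# Sketch: run batch TTS without -m, then merge the numbered outputs in a second step.
# GEMINI_TTS is a hypothetical override for the command name; it defaults to gemini-tts.
batch_then_merge() {
    _input="$1"              # e.g. test.txt (assumed to be in the current directory)
    _base="${_input%.*}"     # strip the extension: test.txt -> test

    # Step 1: produce test-01.wav, test-02.wav, ...
    "${GEMINI_TTS:-gemini-tts}" -f "$_input" -c 3 || return 1

    # Step 2: merge the numbered files. The pattern is quoted so the tool, not the
    # shell, expands it. The output name (test.wav) deliberately does not match
    # the pattern, so a later re-run will not pick up the merged file as an input.
    "${GEMINI_TTS:-gemini-tts}" merge "${_base}-*.wav" -o "${_base}.wav"
}
```

Calling batch_then_merge test.txt would run the two commands in order and stop if the batch step fails.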
List all available voices:
gemini-tts list-voices
Merge all WAV files in current directory:
gemini-tts merge '*.wav'
# Creates: merged.wav
Merge specific pattern with custom output:
gemini-tts merge 'trial03-*.wav' -o trial03-merged.wav
# Creates: trial03-merged.wav
Merge all WAV files recursively:
gemini-tts merge '**/*.wav' -o all-merged.wav
# Creates: all-merged.wav
Output to stdout (pipe to file or other processes):
gemini-tts -t "Hello world" -o - > output.wav
# or pipe to another command
gemini-tts -t "Hello world" -o - | aplay
Notes
- All input files for batch must have .txt or .md extension
- All input files for merge must have .wav extension
- Files with different audio formats will be converted to match the first file's format
- The pattern must include *.wav to ensure only WAV files are processed
- Recursive patterns (**/*.wav) will search subdirectories
- Either --text or --file must be provided, but not both
- API key must be set in the GEMINI_API_KEY environment variable
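Some of these rules can be checked in a wrapper script before any API request is made. The function below is a hypothetical pre-flight check, not part of GeminiTtsCli. It covers only the text/file exclusivity rule and the batch input extension, and it assumes lowercase extensions.

```shell
# Hypothetical pre-flight check mirroring the notes above.
# Usage: check_tts_args TEXT FILE   (pass "" for the one you are not using)
check_tts_args() {
    _text="$1"
    _file="$2"
    if [ -n "$_text" ] && [ -n "$_file" ]; then
        echo "provide --text or --file, not both" >&2
        return 1
    fi
    if [ -z "$_text" ] && [ -z "$_file" ]; then
        echo "provide either --text or --file" >&2
        return 1
    fi
    if [ -n "$_file" ]; then
        case "$_file" in
            *.txt|*.md) ;;   # accepted batch input extensions
            *) echo "batch input must be .txt or .md: $_file" >&2; return 1 ;;
        esac
    fi
    return 0
}
```

For example: check_tts_args "" "notes.md" && gemini-tts -f "notes.md" -m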
Development
Building from Source
- Clone the repository
- Restore dependencies:
dotnet restore
- Build:
dotnet build
- Run:
dotnet run -- --text "Hello world"
Publishing
The project supports two types of builds:
1. Global Tool (for NuGet distribution)
dotnet pack --configuration Release
This creates a .NET global tool package that can be installed via dotnet tool install -g GeminiTtsCli.
2. Self-Contained Executables (for standalone distribution)
For Windows x64:
dotnet publish --configuration Release --self-contained true --runtime win-x64 -p:PublishSelfContained=true
For Linux x64:
dotnet publish --configuration Release --self-contained true --runtime linux-x64 -p:PublishSelfContained=true
For macOS x64:
dotnet publish --configuration Release --self-contained true --runtime osx-x64 -p:PublishSelfContained=true
For macOS ARM64:
dotnet publish --configuration Release --self-contained true --runtime osx-arm64 -p:PublishSelfContained=true
The project includes automated GitHub Actions workflows for:
- Building cross-platform binaries
- Publishing to NuGet Gallery
- Creating GitHub releases
To trigger a release, create and push a git tag:
git tag v0.7.0
git push origin v0.7.0
License
MIT License - see LICENSE file for details.
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
Support
If you encounter any issues or have questions, please open an issue on GitHub.
Acknowledgments
Google Cloud credits are provided for this project. #AISprint
We thank Google Cloud for supporting this project with credits that help make development and testing possible.
Product | Versions Compatible and additional computed target framework versions. |
---|---|
.NET | net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
This package has no dependencies.