VoiceMCP 0.0.2-prerelease
See the version list below for details.
Example .vscode/mcp.json settings file:
{
  "inputs": [
    {
      "type": "promptString",
      "id": "AZURE_OPENAI_ENDPOINT",
      "description": "Azure OpenAI endpoint URL (e.g., https://your-resource.openai.azure.com/)"
    },
    {
      "type": "promptString",
      "id": "AZURE_OPENAI_API_KEY",
      "description": "Azure OpenAI API key for authentication",
      "password": true
    },
    {
      "type": "promptString",
      "id": "AZURE_OPENAI_TTS_DEPLOYMENT",
      "description": "Azure OpenAI TTS (Text-to-Speech) deployment name",
      "default": "tts"
    },
    {
      "type": "promptString",
      "id": "AZURE_OPENAI_WHISPER_DEPLOYMENT",
      "description": "Azure OpenAI Whisper (Speech-to-Text) deployment name",
      "default": "whisper"
    }
  ],
  "servers": {
    "VoiceMCP": {
      "type": "stdio",
      "command": "dnx",
      "args": ["VoiceMCP@0.0.2-prerelease", "--yes"],
      "env": {
        "AZURE_OPENAI_ENDPOINT": "${input:AZURE_OPENAI_ENDPOINT}",
        "AZURE_OPENAI_API_KEY": "${input:AZURE_OPENAI_API_KEY}",
        "AZURE_OPENAI_TTS_DEPLOYMENT": "${input:AZURE_OPENAI_TTS_DEPLOYMENT}",
        "AZURE_OPENAI_WHISPER_DEPLOYMENT": "${input:AZURE_OPENAI_WHISPER_DEPLOYMENT}"
      }
    }
  }
}
dotnet tool install --global VoiceMCP --version 0.0.2-prerelease
dotnet new tool-manifest
dotnet tool install --local VoiceMCP --version 0.0.2-prerelease
VoiceMCP - Voice-Enabled Model Context Protocol Server
A Model Context Protocol (MCP) server that enables AI agents to interact with users through voice, using Azure OpenAI's Text-to-Speech and Whisper speech recognition services. This allows AI assistants to ask questions, request approvals, and receive user input via natural voice conversations.
🎯 Features
- 🎤 Voice Input: Capture user responses via microphone using Azure OpenAI Whisper
- 🔊 Voice Output: Speak to users using Azure OpenAI Text-to-Speech
- ✅ Confirmation Loops: Built-in confirmation system to ensure accuracy
- 🔄 Retry Logic: Automatic retry with user feedback on failed recognitions
- 🎨 Multiple Tools: Ask questions, request approvals, and more
- 🔐 Flexible Configuration: Support for both environment variables and user secrets
📋 Table of Contents
- Architecture
- Prerequisites
- Installation
- Configuration
- Usage
- Available Tools
- Troubleshooting
- Contributing
- License
🏗️ Architecture
graph TD
A[MCP Client] -->|MCP Protocol| B[VoiceMCP Server]
B -->|Text-to-Speech| C[Azure OpenAI TTS]
B -->|Audio Recording| D[Microphone via NAudio]
D -->|WAV Audio| B
B -->|WAV to MP3| E[NAudio.Lame Encoder]
E -->|MP3 Audio| B
B -->|Speech-to-Text| F[Azure OpenAI Whisper]
C -->|Audio Playback| G[Speakers via NAudio]
style B fill:#4CAF50
style C fill:#2196F3
style F fill:#2196F3
Key Components
- VoiceMCP Server: MCP server exposing voice-based tools
- SemanticKernelVoiceService: Orchestrates TTS and STT using Semantic Kernel
- NAudio: Handles audio recording and playback
- Azure OpenAI: Provides TTS (Text-to-Speech) and Whisper (Speech-to-Text) services
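To make the ask, listen, confirm flow concrete, here is a minimal Python sketch of a confirmation loop with retry. It is not VoiceMCP's implementation (the server is a .NET tool); the function name ask_with_confirmation, the speak and listen callables, and the spoken phrases are all hypothetical stand-ins for the TTS and Whisper steps shown in the diagram.

```python
from typing import Callable, Optional


def ask_with_confirmation(
    question: str,
    speak: Callable[[str], None],          # hypothetical stand-in for TTS playback
    listen: Callable[[], Optional[str]],   # hypothetical stand-in for record + transcribe
    max_attempts: int = 3,
) -> Optional[str]:
    """Ask a question aloud and return the confirmed answer, or None."""
    for _ in range(max_attempts):
        speak(question)
        answer = listen()
        if not answer:
            # Recognition failed: give the user feedback, then retry.
            speak("Sorry, I did not catch that.")
            continue
        # Read the transcription back and wait for a yes/no reply.
        speak(f"I heard: {answer}. Is that correct?")
        reply = (listen() or "").strip().lower()
        if reply.startswith("yes"):
            return answer
    return None


# Usage with scripted replies instead of real audio devices.
spoken = []
replies = iter([None, "Python", "yes"])
result = ask_with_confirmation(
    "What is your preferred programming language?",
    spoken.append,
    lambda: next(replies),
)
print(result)
```

Passing the audio functions in as callables keeps the loop testable without a microphone or an Azure OpenAI deployment.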
📦 Prerequisites
- .NET 10.0 SDK or later
- Windows OS (required for NAudio and Windows Speech features)
- Azure OpenAI Account with:
  - Text-to-Speech deployment (e.g., tts-1)
  - Whisper deployment (e.g., whisper-1)
- Microphone for voice input
- Speakers for audio output
🚀 Installation
Option 1: Using .NET 10 dnx (Recommended for .NET 10+)
.NET 10 introduces the dotnet dnx command that downloads and runs tools without requiring global installation:
# Run VoiceMCP directly (downloads automatically on first run)
dotnet dnx VoiceMCP --version 0.0.2-prerelease
# Or run the latest stable version (when available)
dotnet dnx VoiceMCP
Advantages:
- No global installation required
- Automatically downloads the tool package
- Works seamlessly with MCP clients that use .NET 10 SDK
- Easier version management
Option 2: Install as Global .NET Tool
For .NET SDK versions before 10, or if you prefer global installation:
# Install the latest stable version (when available)
dotnet tool install --global VoiceMCP
# Or install a specific prerelease version
dotnet tool install --global VoiceMCP --version 0.0.2-prerelease
# Update to the latest version
dotnet tool update --global VoiceMCP
# The tool will be available as 'voicemcp' command
voicemcp
Note: The MCP server requires environment variables to be configured. When running directly without configuration, you'll see an error message indicating missing credentials. This is expected - see the Configuration section below.
Option 3: Build from Source
Clone the Repository:
git clone https://github.com/tamirdresher/mcp-voice-assist.git
cd mcp-voice-assist
Restore Dependencies:
dotnet restore VoiceMCP.sln
Build the Project:
dotnet build VoiceMCP.sln --configuration Release
⚙️ Configuration
VoiceMCP supports two configuration methods:
Option 1: Environment Variables (Recommended for MCP Clients)
Set these environment variables before running:
# Windows PowerShell
$env:AZURE_OPENAI_ENDPOINT = "https://your-resource.openai.azure.com/"
$env:AZURE_OPENAI_API_KEY = "your-api-key"
$env:AZURE_OPENAI_DEPLOYMENT = "gpt-4"
$env:AZURE_OPENAI_TTS_DEPLOYMENT = "tts-1"
$env:AZURE_OPENAI_WHISPER_DEPLOYMENT = "whisper-1"
Or use the provided setup script from the /scripts/setup/ directory:
.\scripts\setup\setup-azure-openai-env.ps1
Option 2: User Secrets (For Development)
cd VoiceMCP
dotnet user-secrets set "AzureOpenAI:Endpoint" "https://your-resource.openai.azure.com/"
dotnet user-secrets set "AzureOpenAI:ApiKey" "your-api-key"
dotnet user-secrets set "AzureOpenAI:DeploymentName" "gpt-4"
dotnet user-secrets set "AzureOpenAI:TtsDeploymentName" "tts-1"
dotnet user-secrets set "AzureOpenAI:WhisperDeploymentName" "whisper-1"
Or use the setup script from the /scripts/setup/ directory:
.\scripts\setup\setup-azure-secrets.ps1
MCP Client Configuration
Option 1: Using .NET 10 dnx (Recommended for .NET 10+)
Add VoiceMCP to your MCP client configuration using dotnet dnx:
{
"mcpServers": {
"voice-mcp": {
"command": "dotnet",
"args": ["dnx", "VoiceMCP", "--version", "0.0.2-prerelease"],
"env": {
"AZURE_OPENAI_ENDPOINT": "https://your-resource.openai.azure.com/",
"AZURE_OPENAI_API_KEY": "your-api-key",
"AZURE_OPENAI_TTS_DEPLOYMENT": "tts-1",
"AZURE_OPENAI_WHISPER_DEPLOYMENT": "whisper-1"
}
}
}
}
Benefits: No global installation needed, works directly with .NET 10 SDK.
Option 2: Using Globally Installed Tool
For systems with global installation:
{
"mcpServers": {
"voice-mcp": {
"command": "voicemcp",
"env": {
"AZURE_OPENAI_ENDPOINT": "https://your-resource.openai.azure.com/",
"AZURE_OPENAI_API_KEY": "your-api-key",
"AZURE_OPENAI_TTS_DEPLOYMENT": "tts-1",
"AZURE_OPENAI_WHISPER_DEPLOYMENT": "whisper-1"
}
}
}
}
Required Environment Variables:
- AZURE_OPENAI_ENDPOINT: Your Azure OpenAI service endpoint
- AZURE_OPENAI_API_KEY: Your Azure OpenAI API key (marked as secret)
- AZURE_OPENAI_TTS_DEPLOYMENT: Text-to-Speech deployment name (default: "tts")
- AZURE_OPENAI_WHISPER_DEPLOYMENT: Whisper speech recognition deployment name (default: "whisper")
- VOICE (optional): Voice selection for TTS (default: "alloy")
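As a quick pre-flight check, the variables above can be validated before launching the server. The following Python sketch is an illustration only, not part of VoiceMCP: the load_settings helper is hypothetical, and it assumes the endpoint and API key are mandatory while the other variables fall back to the defaults listed above.

```python
import os
from typing import Mapping

# Assumption: these two have no default and must always be set.
REQUIRED = ("AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY")

# Defaults as listed in this README.
DEFAULTS = {
    "AZURE_OPENAI_TTS_DEPLOYMENT": "tts",
    "AZURE_OPENAI_WHISPER_DEPLOYMENT": "whisper",
    "VOICE": "alloy",
}


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Return the resolved settings, or raise if a required variable is missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED}
    for name, default in DEFAULTS.items():
        # An unset or empty variable falls back to its default.
        settings[name] = env.get(name) or default
    return settings


# Usage with an explicit mapping instead of the real process environment.
example = load_settings({
    "AZURE_OPENAI_ENDPOINT": "https://your-resource.openai.azure.com/",
    "AZURE_OPENAI_API_KEY": "your-api-key",
})
print(example["AZURE_OPENAI_TTS_DEPLOYMENT"])  # tts
```

Calling load_settings() with no argument reads the current process environment instead.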
Option 3: Using Source Code (Development)
For development or testing with source code:
{
"mcpServers": {
"voice-mcp": {
"command": "dotnet",
"args": [
"run",
"--no-build",
"--project",
"C:/path/to/mcp-voice-assist/VoiceMCP/VoiceMCP.csproj"
],
"env": {
"AZURE_OPENAI_ENDPOINT": "https://your-resource.openai.azure.com/",
"AZURE_OPENAI_API_KEY": "your-api-key",
"AZURE_OPENAI_TTS_DEPLOYMENT": "tts-1",
"AZURE_OPENAI_WHISPER_DEPLOYMENT": "whisper-1"
}
}
}
}
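The three client configurations differ only in the command and its arguments, so they can be generated rather than edited by hand. The Python sketch below is a convenience illustration under that assumption; build_server_entry and its mode names are hypothetical and not part of VoiceMCP or any MCP client.

```python
import json


def build_server_entry(mode: str, env: dict, version: str = "", project_path: str = "") -> dict:
    """Build an mcpServers configuration for one of the three options above."""
    if mode == "dnx":
        args = ["dnx", "VoiceMCP"]
        if version:
            args += ["--version", version]
        entry = {"command": "dotnet", "args": args}
    elif mode == "global":
        entry = {"command": "voicemcp"}
    elif mode == "source":
        entry = {
            "command": "dotnet",
            "args": ["run", "--no-build", "--project", project_path],
        }
    else:
        raise ValueError(f"Unknown mode: {mode}")
    entry["env"] = dict(env)
    return {"mcpServers": {"voice-mcp": entry}}


env = {
    "AZURE_OPENAI_ENDPOINT": "https://your-resource.openai.azure.com/",
    "AZURE_OPENAI_API_KEY": "your-api-key",
    "AZURE_OPENAI_TTS_DEPLOYMENT": "tts-1",
    "AZURE_OPENAI_WHISPER_DEPLOYMENT": "whisper-1",
}
config = build_server_entry("dnx", env, version="0.0.2-prerelease")
print(json.dumps(config, indent=2))
```

Where the resulting JSON belongs depends on the MCP client, so check the client's own documentation for the file location.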
📚 Available Tools
VoiceMCP exposes the following tools to MCP clients:
voice_ask_user
Ask the user a question via voice and receive a voice response.
Parameters:
- question (string): The question to ask the user
- confirm (boolean, optional): Whether to ask for confirmation (default: true)
voice_ask_for_approval
Request user approval for an action via voice.
Parameters:
- action_description (string): Description of the action requiring approval
- details (string, optional): Additional details about the action
Example Usage
{
"tool": "voice_ask_user",
"arguments": {
"question": "What is your preferred programming language?",
"confirm": true
}
}
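The example above shows only the tool name and its arguments. On the wire, an MCP client wraps such a call in a JSON-RPC 2.0 request with the method tools/call and, for the stdio transport, sends it as a single line. The Python sketch below builds that message for voice_ask_for_approval; the tool_call_request helper is hypothetical and the argument values are illustrative, and a real client must complete the MCP initialize handshake before sending it.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC request ids must be unique per session


def tool_call_request(name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as one line of JSON."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    # json.dumps emits no raw newlines, so the result fits on a single line.
    return json.dumps(message)


line = tool_call_request(
    "voice_ask_for_approval",
    {
        "action_description": "Delete the build output folder",  # illustrative value
        "details": "This removes all compiled binaries.",        # illustrative value
    },
)
print(line)
```

Most users never write this by hand, since the MCP client library produces these messages, but the raw form helps when debugging stdio communication.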
🛠️ Scripts
The repository includes helpful PowerShell scripts organized in the /scripts/ directory:
Setup Scripts (/scripts/setup/)
- setup-azure-openai-env.ps1 - Configure Azure OpenAI environment variables
- setup-azure-secrets.ps1 - Set up user secrets for development
- setup-github-repo.ps1 - Initialize GitHub repository settings
Test Scripts (/scripts/test/)
- test-mcp.ps1 - Test MCP server functionality
- test-mcp-interactive.ps1 - Interactive MCP testing
- test-stdio-separation.ps1 - Test stdio communication
- test-voice-tool.ps1 - Test voice tool functionality
See scripts/README.md for detailed documentation.
🔍 Troubleshooting
Installation Issues
Issue: Tool 'VoiceMCP' failed to install
- Solution: Ensure you have the .NET 10.0 SDK installed: dotnet --version
- Verify that the NuGet package is available: https://www.nuget.org/packages/VoiceMCP
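The SDK check can also be scripted. The Python sketch below parses the text printed by dotnet --version and compares the major version against 10; the helper names sdk_major and meets_requirement are hypothetical, and the version strings are sample inputs rather than captured output.

```python
import re


def sdk_major(version_output: str) -> int:
    """Extract the major version from output such as '10.0.100'."""
    match = re.match(r"\s*(\d+)\.", version_output)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version_output!r}")
    return int(match.group(1))


def meets_requirement(version_output: str, minimum: int = 10) -> bool:
    """Return True when the SDK major version is at least the minimum."""
    return sdk_major(version_output) >= minimum


print(meets_requirement("10.0.100"))  # True
print(meets_requirement("9.0.304"))   # False
```

To check a real machine, pass in the output of subprocess.run(["dotnet", "--version"], capture_output=True, text=True).stdout.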
Issue: TTS deployment name is not configured
- Solution: This error appears when environment variables are missing. Configure the required environment variables as shown in the Configuration section.
Runtime Issues
Issue: No valid credentials found
- Solution: Set the required environment variables before running voicemcp
- For MCP clients: Add environment variables to your MCP configuration
- For development: Use user secrets or the setup scripts
Issue: Microphone not detected
- Solution:
- Ensure your microphone is connected and enabled in Windows Sound settings
- Grant microphone permissions to the application
- Check Windows Privacy settings for microphone access
Issue: Audio playback not working
- Solution:
- Verify speakers/headphones are connected
- Check Windows Sound settings
- Ensure audio output device is set as default
Getting Help
For additional support:
- Check the scripts/README.md for script documentation
- Review existing GitHub Issues
- Create a new issue with:
- Error messages
- Steps to reproduce
- Your environment (.NET version, Windows version, etc.)
🤝 Contributing
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
Development Setup
- Clone the repository
- Open VoiceMCP.sln in Visual Studio or VS Code
- Configure user secrets (see Configuration section)
- Build and run
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
- Model Context Protocol - MCP specification
- Semantic Kernel - AI orchestration framework
- NAudio - Audio library for .NET
- Azure OpenAI Service - AI services
📧 Support
For issues, questions, or contributions:
- Open an issue on GitHub
- Check existing documentation in the /docs folder
- Review the troubleshooting section above
Made with ❤️ for voice-enabled AI interactions
| Product | Versions Compatible and additional computed target framework versions. |
|---|---|
| .NET | net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
This package has no dependencies.
| Version | Downloads | Last Updated |
|---|---|---|
| 0.0.2-prerelease2 | 294 | 11/30/2025 |
| 0.0.2-prerelease | 197 | 11/30/2025 |