VoiceMCP 0.0.2-prerelease

This is a prerelease version of VoiceMCP.
There is a newer prerelease version of this package available.
See the version list below for details.
{
  "inputs": [
    {
      "type": "promptString",
      "id": "AZURE_OPENAI_ENDPOINT",
      "description": "Azure OpenAI endpoint URL (e.g., https://your-resource.openai.azure.com/)"
    },
    {
      "type": "promptString",
      "id": "AZURE_OPENAI_API_KEY",
      "description": "Azure OpenAI API key for authentication",
      "password": true
    },
    {
      "type": "promptString",
      "id": "AZURE_OPENAI_TTS_DEPLOYMENT",
      "description": "Azure OpenAI TTS (Text-to-Speech) deployment name",
      "default": "tts"
    },
    {
      "type": "promptString",
      "id": "AZURE_OPENAI_WHISPER_DEPLOYMENT",
      "description": "Azure OpenAI Whisper (Speech-to-Text) deployment name",
      "default": "whisper"
    }
  ],
  "servers": {
    "VoiceMCP": {
      "type": "stdio",
      "command": "dnx",
      "args": ["VoiceMCP@0.0.2-prerelease", "--yes"],
      "env": {
        "AZURE_OPENAI_ENDPOINT": "${input:AZURE_OPENAI_ENDPOINT}",
        "AZURE_OPENAI_API_KEY": "${input:AZURE_OPENAI_API_KEY}",
        "AZURE_OPENAI_TTS_DEPLOYMENT": "${input:AZURE_OPENAI_TTS_DEPLOYMENT}",
        "AZURE_OPENAI_WHISPER_DEPLOYMENT": "${input:AZURE_OPENAI_WHISPER_DEPLOYMENT}"
      }
    }
  }
}
                    
This package contains an MCP Server. The server can be used in VS Code by copying the generated JSON to your VS Code workspace's .vscode/mcp.json settings file.
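
If the workspace already has a .vscode/mcp.json with other servers, pasting the JSON over it would drop them. The following is a minimal sketch of merging instead. The helper names `merge_mcp_config` and `update_workspace` are hypothetical, and the sketch assumes the existing file is plain JSON without comments.

```python
import json
from pathlib import Path


def merge_mcp_config(existing: dict, addition: dict) -> dict:
    """Return a copy of `existing` with inputs and servers from `addition` merged in."""
    merged = dict(existing)  # shallow copy keeps unrelated top-level keys
    merged["inputs"] = list(existing.get("inputs", []))
    merged["servers"] = dict(existing.get("servers", {}))

    # Add only inputs whose id is not already present, to avoid duplicate prompts.
    known_ids = {item.get("id") for item in merged["inputs"]}
    for item in addition.get("inputs", []):
        if item.get("id") not in known_ids:
            merged["inputs"].append(item)
            known_ids.add(item.get("id"))

    # A server with the same name is overwritten; other servers are kept.
    merged["servers"].update(addition.get("servers", {}))
    return merged


def update_workspace(workspace: Path, addition: dict) -> Path:
    """Merge `addition` into <workspace>/.vscode/mcp.json, creating the file if needed."""
    target = workspace / ".vscode" / "mcp.json"
    existing = json.loads(target.read_text(encoding="utf-8")) if target.exists() else {}
    target.parent.mkdir(parents=True, exist_ok=True)
    merged = merge_mcp_config(existing, addition)
    target.write_text(json.dumps(merged, indent=2), encoding="utf-8")
    return target
```

Load the generated JSON with `json.loads` and pass the result as `addition`. Servers already in the file are preserved.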

This package contains a .NET tool you can call from the shell/command line.

.NET CLI (global):
dotnet tool install --global VoiceMCP --version 0.0.2-prerelease

.NET CLI (local; create the tool manifest first if you are setting up this repo):
dotnet new tool-manifest
dotnet tool install --local VoiceMCP --version 0.0.2-prerelease

Cake:
#tool dotnet:?package=VoiceMCP&version=0.0.2-prerelease&prerelease

NUKE:
nuke :add-package VoiceMCP --version 0.0.2-prerelease

VoiceMCP - Voice-Enabled Model Context Protocol Server

A Model Context Protocol (MCP) server that enables AI agents to interact with users through voice, using Azure OpenAI's Text-to-Speech and Whisper speech recognition services. This allows AI assistants to ask questions, request approvals, and receive user input via natural voice conversations.

🎯 Features

  • 🎤 Voice Input: Capture user responses via microphone using Azure OpenAI Whisper
  • 🔊 Voice Output: Speak to users using Azure OpenAI Text-to-Speech
  • ✅ Confirmation Loops: Built-in confirmation system to ensure accuracy
  • 🔄 Retry Logic: Automatic retry with user feedback on failed recognitions
  • 🎨 Multiple Tools: Ask questions, request approvals, and more
  • 🔐 Flexible Configuration: Support for both environment variables and user secrets
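
To make the confirmation and retry features concrete, here is a minimal sketch of how such a loop could work. It is not VoiceMCP's actual implementation. The function name, the `speak` and `listen` callables, the spoken prompts, the set of affirmative words, and the limit of three attempts are all assumptions made for illustration.

```python
from typing import Callable, Optional

# Assumed vocabulary for a spoken "yes"; the real server may accept other phrases.
AFFIRMATIVE = {"yes", "yeah", "yep", "correct", "right"}


def ask_with_confirmation(
    question: str,
    speak: Callable[[str], None],
    listen: Callable[[], Optional[str]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Ask a question, read the answer back, and retry until the user confirms it."""
    for _ in range(max_attempts):
        speak(question)
        answer = listen()
        if not answer:
            # Recognition failed: tell the user and start the next attempt.
            speak("Sorry, I didn't catch that. Let's try again.")
            continue
        speak(f"I heard: {answer}. Is that correct?")
        reply = (listen() or "").strip().lower().rstrip(".!")
        if reply in AFFIRMATIVE:
            return answer
        speak("Okay, let's try again.")
    return None  # no confirmed answer within max_attempts
```

Passing `speak` and `listen` as callables keeps the loop independent of any audio library, so it can be exercised with scripted replies.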

🏗️ Architecture

graph TD
    A[MCP Client] -->|MCP Protocol| B[VoiceMCP Server]
    B -->|Text-to-Speech| C[Azure OpenAI TTS]
    B -->|Audio Recording| D[Microphone via NAudio]
    D -->|WAV Audio| B
    B -->|WAV to MP3| E[NAudio.Lame Encoder]
    E -->|MP3 Audio| B
    B -->|Speech-to-Text| F[Azure OpenAI Whisper]
    C -->|Audio Playback| G[Speakers via NAudio]
    
    style B fill:#4CAF50
    style C fill:#2196F3
    style F fill:#2196F3

Key Components

  • VoiceMCP Server: MCP server exposing voice-based tools
  • SemanticKernelVoiceService: Orchestrates TTS and STT using Semantic Kernel
  • NAudio: Handles audio recording and playback
  • Azure OpenAI: Provides TTS (Text-to-Speech) and Whisper (Speech-to-Text) services
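
The data flow in the diagram can be sketched as one round trip. This is an illustration only: the real server is a .NET application, and the `VoicePipeline` class and its field names are hypothetical stand-ins for the components listed above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    synthesize: Callable[[str], bytes]    # text -> audio (Azure OpenAI TTS in the diagram)
    play: Callable[[bytes], None]         # audio -> speakers (NAudio in the diagram)
    record: Callable[[], bytes]           # microphone -> WAV bytes (NAudio)
    encode_mp3: Callable[[bytes], bytes]  # WAV -> MP3 (NAudio.Lame)
    transcribe: Callable[[bytes], str]    # MP3 -> text (Azure OpenAI Whisper)

    def ask(self, question: str) -> str:
        """Speak the question, record the reply, and return its transcription."""
        self.play(self.synthesize(question))
        wav = self.record()
        mp3 = self.encode_mp3(wav)
        return self.transcribe(mp3)
```

Each stage is injected, so the order of operations can be checked with stubs before any audio hardware or Azure resource is involved.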

📦 Prerequisites

  • .NET 10.0 SDK or later
  • Windows OS (required for NAudio and Windows Speech features)
  • Azure OpenAI Account with:
    • Text-to-Speech deployment (e.g., tts-1)
    • Whisper deployment (e.g., whisper-1)
  • Microphone for voice input
  • Speakers for audio output

🚀 Installation

Option 1: Run with dotnet dnx

.NET 10 introduces the dotnet dnx command, which downloads and runs tools without requiring global installation:

# Run VoiceMCP directly (downloads automatically on first run)
dotnet dnx VoiceMCP --version 0.0.1-prerelease

# Or run the latest stable version (when available)
dotnet dnx VoiceMCP

Advantages:

  • No global installation required
  • Automatically downloads the tool package
  • Works seamlessly with MCP clients that use .NET 10 SDK
  • Easier version management

Option 2: Install as Global .NET Tool

For .NET SDK versions before 10, or if you prefer global installation:

# Install the latest stable version
dotnet tool install --global VoiceMCP

# Or install a specific prerelease version
dotnet tool install --global VoiceMCP --version 0.0.1-prerelease

# Update to the latest version
dotnet tool update --global VoiceMCP

# The tool will be available as 'voicemcp' command
voicemcp

Note: The MCP server requires environment variables to be configured. When running directly without configuration, you'll see an error message indicating missing credentials. This is expected - see the Configuration section below.
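
The choice between running with dotnet dnx and installing globally depends on the SDK version. A minimal sketch of that decision, assuming the version string has the form printed by `dotnet --version` (for example `10.0.100`); the function names are hypothetical:

```python
def sdk_major(version: str) -> int:
    """Extract the major version from an SDK version string such as '10.0.100'."""
    return int(version.strip().split(".", 1)[0])


def install_or_run_command(sdk_version: str, package: str = "VoiceMCP") -> list[str]:
    """Pick the command from this section that matches the installed SDK."""
    if sdk_major(sdk_version) >= 10:
        # .NET 10 or later: run directly, no global installation needed.
        return ["dotnet", "dnx", package]
    # Older SDKs: fall back to a global tool installation.
    return ["dotnet", "tool", "install", "--global", package]
```

The version string can be obtained by running `dotnet --version` and capturing its output.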

Option 3: Build from Source

  1. Clone the Repository:

    git clone https://github.com/tamirdresher/mcp-voice-assist.git
    cd mcp-voice-assist
    
  2. Restore Dependencies:

    dotnet restore VoiceMCP.sln
    
  3. Build the Project:

    dotnet build VoiceMCP.sln --configuration Release
    

⚙️ Configuration

VoiceMCP supports two configuration methods:

Option 1: Environment Variables

Set these environment variables before running:

# Windows PowerShell
$env:AZURE_OPENAI_ENDPOINT = "https://your-resource.openai.azure.com/"
$env:AZURE_OPENAI_API_KEY = "your-api-key"
$env:AZURE_OPENAI_DEPLOYMENT = "gpt-4"
$env:AZURE_OPENAI_TTS_DEPLOYMENT = "tts-1"
$env:AZURE_OPENAI_WHISPER_DEPLOYMENT = "whisper-1"

Or use the provided setup script from the /scripts/setup/ directory:

.\scripts\setup\setup-azure-openai-env.ps1

Option 2: User Secrets (For Development)

cd VoiceMCP
dotnet user-secrets set "AzureOpenAI:Endpoint" "https://your-resource.openai.azure.com/"
dotnet user-secrets set "AzureOpenAI:ApiKey" "your-api-key"
dotnet user-secrets set "AzureOpenAI:DeploymentName" "gpt-4"
dotnet user-secrets set "AzureOpenAI:TtsDeploymentName" "tts-1"
dotnet user-secrets set "AzureOpenAI:WhisperDeploymentName" "whisper-1"

Or use the setup script from the /scripts/setup/ directory:

.\scripts\setup\setup-azure-secrets.ps1

MCP Client Configuration

Option 1: Using dotnet dnx

Add VoiceMCP to your MCP client configuration using dotnet dnx:

{
  "mcpServers": {
    "voice-mcp": {
      "command": "dotnet",
      "args": ["dnx", "VoiceMCP", "--version", "0.0.1-prerelease"],
      "env": {
        "AZURE_OPENAI_ENDPOINT": "https://your-resource.openai.azure.com/",
        "AZURE_OPENAI_API_KEY": "your-api-key",
        "AZURE_OPENAI_TTS_DEPLOYMENT": "tts-1",
        "AZURE_OPENAI_WHISPER_DEPLOYMENT": "whisper-1"
      }
    }
  }
}

Benefits: No global installation needed, works directly with .NET 10 SDK.

Option 2: Using Globally Installed Tool

For systems with global installation:

{
  "mcpServers": {
    "voice-mcp": {
      "command": "voicemcp",
      "env": {
        "AZURE_OPENAI_ENDPOINT": "https://your-resource.openai.azure.com/",
        "AZURE_OPENAI_API_KEY": "your-api-key",
        "AZURE_OPENAI_TTS_DEPLOYMENT": "tts-1",
        "AZURE_OPENAI_WHISPER_DEPLOYMENT": "whisper-1"
      }
    }
  }
}

Required Environment Variables:

  • AZURE_OPENAI_ENDPOINT: Your Azure OpenAI service endpoint
  • AZURE_OPENAI_API_KEY: Your Azure OpenAI API key (marked as secret)
  • AZURE_OPENAI_TTS_DEPLOYMENT: Text-to-Speech deployment name (default: "tts")
  • AZURE_OPENAI_WHISPER_DEPLOYMENT: Whisper speech recognition deployment name (default: "whisper")
  • VOICE (optional): Voice selection for TTS (default: "alloy")
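
A pre-flight check can catch missing variables before the server is launched. The sketch below applies the defaults listed above. It assumes that the two variables without a documented default (endpoint and API key) are mandatory. The function name `resolve_settings` is hypothetical and this is not the server's own validation code.

```python
import os
from typing import Mapping, Optional

# Assumed mandatory: no default is documented for these.
REQUIRED = ("AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY")

# Defaults as documented in the list above.
DEFAULTS = {
    "AZURE_OPENAI_TTS_DEPLOYMENT": "tts",
    "AZURE_OPENAI_WHISPER_DEPLOYMENT": "whisper",
    "VOICE": "alloy",
}


def resolve_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the effective settings, or raise ValueError naming what is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED}
    for name, default in DEFAULTS.items():
        settings[name] = env.get(name) or default  # empty values fall back to the default
    return settings
```

Calling `resolve_settings()` with no argument reads the current process environment.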

Option 3: Using Source Code (Development)

For development or testing with source code:

{
  "mcpServers": {
    "voice-mcp": {
      "command": "dotnet",
      "args": [
        "run",
        "--no-build",
        "--project",
        "C:/path/to/mcp-voice-assist/VoiceMCP/VoiceMCP.csproj"
      ],
      "env": {
        "AZURE_OPENAI_ENDPOINT": "https://your-resource.openai.azure.com/",
        "AZURE_OPENAI_API_KEY": "your-api-key",
        "AZURE_OPENAI_TTS_DEPLOYMENT": "tts-1",
        "AZURE_OPENAI_WHISPER_DEPLOYMENT": "whisper-1"
      }
    }
  }
}
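
The three client configurations differ only in `command` and `args`, so they can be generated from one helper. A minimal sketch; the `server_entry` function and its `mode` names are hypothetical, while the commands and arguments are copied from the JSON examples in this section.

```python
def server_entry(mode: str, env: dict, *, version: str = "", project: str = "") -> dict:
    """Build an mcpServers configuration for one of the launch options above."""
    if mode == "dnx":
        args = ["dnx", "VoiceMCP"] + (["--version", version] if version else [])
        entry = {"command": "dotnet", "args": args}
    elif mode == "global":
        entry = {"command": "voicemcp"}
    elif mode == "source":
        if not project:
            raise ValueError("source mode needs the path to VoiceMCP.csproj")
        entry = {"command": "dotnet", "args": ["run", "--no-build", "--project", project]}
    else:
        raise ValueError(f"unknown mode: {mode}")
    entry["env"] = dict(env)  # copy so the caller's dict is not shared
    return {"mcpServers": {"voice-mcp": entry}}
```

Serialize the result with `json.dumps(..., indent=2)` to get text that can be pasted into the client's configuration file.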

📚 Available Tools

VoiceMCP exposes the following tools to MCP clients:

voice_ask_user

Ask the user a question via voice and receive a voice response.

Parameters:

  • question (string): The question to ask the user
  • confirm (boolean, optional): Whether to ask for confirmation (default: true)

voice_ask_for_approval

Request user approval for an action via voice.

Parameters:

  • action_description (string): Description of the action requiring approval
  • details (string, optional): Additional details about the action
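
A client can check arguments against the parameter lists above before calling a tool. The sketch below assumes that parameters not marked optional are required. The `TOOL_PARAMS` table and `check_arguments` helper are hypothetical, not part of VoiceMCP.

```python
# Parameter name -> (expected type, required?), taken from the lists above.
TOOL_PARAMS = {
    "voice_ask_user": {
        "question": (str, True),
        "confirm": (bool, False),
    },
    "voice_ask_for_approval": {
        "action_description": (str, True),
        "details": (str, False),
    },
}


def check_arguments(tool: str, arguments: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    spec = TOOL_PARAMS.get(tool)
    if spec is None:
        return [f"unknown tool: {tool}"]
    problems = []
    for name, (kind, required) in spec.items():
        if name not in arguments:
            if required:
                problems.append(f"missing required parameter: {name}")
        elif not isinstance(arguments[name], kind):
            problems.append(f"{name} should be {kind.__name__}")
    for name in arguments:
        if name not in spec:
            problems.append(f"unexpected parameter: {name}")
    return problems
```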

Example Usage

{
  "tool": "voice_ask_user",
  "arguments": {
    "question": "What is your preferred programming language?",
    "confirm": true
  }
}
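
On the wire, MCP clients send tool invocations as JSON-RPC 2.0 requests with the method `tools/call`. The sketch below shows how the example above might be wrapped in such a request. The `tools_call` helper and its id counter are hypothetical; a real client library normally handles this step, as well as the `initialize` handshake that must come first.

```python
import json
from itertools import count

_ids = count(1)  # simple source of unique request ids


def tools_call(name: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 tools/call request as a single line of JSON."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    # json.dumps escapes newlines inside strings, so the result stays on one line,
    # which the stdio transport requires.
    return json.dumps(message)


request = tools_call(
    "voice_ask_user",
    {"question": "What is your preferred programming language?", "confirm": True},
)
```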

🛠️ Scripts

The repository includes helpful PowerShell scripts organized in the /scripts/ directory:

Setup Scripts (/scripts/setup/)

  • setup-azure-openai-env.ps1 - Configure Azure OpenAI environment variables
  • setup-azure-secrets.ps1 - Set up user secrets for development
  • setup-github-repo.ps1 - Initialize GitHub repository settings

Test Scripts (/scripts/test/)

  • test-mcp.ps1 - Test MCP server functionality
  • test-mcp-interactive.ps1 - Interactive MCP testing
  • test-stdio-separation.ps1 - Test stdio communication
  • test-voice-tool.ps1 - Test voice tool functionality

See scripts/README.md for detailed documentation.

🔍 Troubleshooting

Installation Issues

Issue: Tool 'VoiceMCP' failed to install

Issue: TTS deployment name is not configured

  • Solution: This error appears when environment variables are missing. Configure the required environment variables as shown in the Configuration section.

Runtime Issues

Issue: No valid credentials found

  • Solution: Set the required environment variables before running voicemcp
  • For MCP clients: Add environment variables to your MCP configuration
  • For development: Use user secrets or the setup scripts

Issue: Microphone not detected

  • Solution:
    • Ensure your microphone is connected and enabled in Windows Sound settings
    • Grant microphone permissions to the application
    • Check Windows Privacy settings for microphone access

Issue: Audio playback not working

  • Solution:
    • Verify speakers/headphones are connected
    • Check Windows Sound settings
    • Ensure audio output device is set as default

Getting Help

For additional support:

  1. Check the scripts/README.md for script documentation
  2. Review existing GitHub Issues
  3. Create a new issue with:
    • Error messages
    • Steps to reproduce
    • Your environment (.NET version, Windows version, etc.)

🤝 Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

Development Setup

  1. Clone the repository
  2. Open VoiceMCP.sln in Visual Studio or VS Code
  3. Configure user secrets (see Configuration section)
  4. Build and run

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

📧 Support

For issues, questions, or contributions:

  • Open an issue on GitHub
  • Check existing documentation in the /docs folder
  • Review troubleshooting section above

Made with ❤️ for voice-enabled AI interactions

Compatible and additional computed target framework versions

.NET: net10.0 is compatible. net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, and net10.0-windows were computed.

This package has no dependencies.

Version            Downloads  Last Updated
0.0.2-prerelease2  294        11/30/2025
0.0.2-prerelease   197        11/30/2025