Ai.Tlbx.VoiceAssistant 5.0.3
AI Voice Assistant Toolkit
A comprehensive .NET 9 toolkit for building real-time AI voice assistants with support for multiple platforms and AI providers. This library provides a clean, modular architecture for integrating voice-based AI interactions into your applications.
Features
- 🎙️ Real-time voice interaction with AI assistants
- 🌐 Multi-platform support: Windows, Linux, Web (Blazor)
- 🤖 Provider-agnostic design (currently supports OpenAI Realtime API)
- 🎵 High-quality audio processing with upsampling and EQ
- 💬 Conversation history persistence across sessions
- 🎛️ Customizable voice settings (speed, voice selection)
- 🛠️ Extensible tool system for custom AI capabilities
- 📊 Built-in logging architecture with user-controlled configuration
- 🎧 Bluetooth-friendly audio initialization
Table of Contents
- Installation
- Quick Start
- Platform-Specific Guides
- Architecture Overview
- API Reference
- Advanced Topics
- Troubleshooting
- Migration from v3.x
Installation
NuGet Packages
Install the packages you need for your platform:
# Core package (required)
dotnet add package Ai.Tlbx.VoiceAssistant
# Provider packages (choose one or more)
dotnet add package Ai.Tlbx.VoiceAssistant.Provider.OpenAi
# Platform packages (choose based on your target)
dotnet add package Ai.Tlbx.VoiceAssistant.Hardware.Windows # For Windows
dotnet add package Ai.Tlbx.VoiceAssistant.Hardware.Linux # For Linux
dotnet add package Ai.Tlbx.VoiceAssistant.Hardware.Web # For Blazor
# Optional UI components for Blazor
dotnet add package Ai.Tlbx.VoiceAssistant.WebUi
Requirements
- .NET 9.0 or later
- OpenAI API key (for OpenAI provider)
- Platform-specific requirements:
- Windows: Windows 10 or later
- Linux: ALSA libraries (sudo apt-get install libasound2-dev)
- Web: Modern browser with microphone permissions
Quick Start
Here's a minimal example to get you started:
using Ai.Tlbx.VoiceAssistant;
using Ai.Tlbx.VoiceAssistant.Provider.OpenAi;
using Ai.Tlbx.VoiceAssistant.Provider.OpenAi.Models;
using Ai.Tlbx.VoiceAssistant.Provider.OpenAi.Extensions;
using Ai.Tlbx.VoiceAssistant.Hardware.Windows; // or .Hardware.Linux or .Hardware.Web
using Microsoft.Extensions.DependencyInjection;
// Configure services
var services = new ServiceCollection();
services.AddVoiceAssistant()
.WithOpenAi(apiKey: "your-api-key-here")
.WithHardware<WindowsAudioDevice>(); // or WithHardware<LinuxAudioDevice>() or WithHardware<WebAudioAccess>()
var serviceProvider = services.BuildServiceProvider();
var voiceAssistant = serviceProvider.GetRequiredService<VoiceAssistant>();
// Configure and start
var settings = new OpenAiVoiceSettings
{
Voice = AssistantVoice.Alloy,
Instructions = "You are a helpful assistant.",
TalkingSpeed = 1.0,
Model = OpenAiRealtimeModel.Gpt4oRealtimePreview20250603
};
await voiceAssistant.StartAsync(settings);
// The assistant is now listening...
// Stop when done
await voiceAssistant.StopAsync();
Platform-Specific Guides
Windows Implementation
Complete Windows Console Application
using System;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
using Ai.Tlbx.VoiceAssistant;
using Ai.Tlbx.VoiceAssistant.Hardware.Windows; // WindowsAudioDevice
using Ai.Tlbx.VoiceAssistant.Models; // ChatMessage, AudioDeviceInfo
using Ai.Tlbx.VoiceAssistant.Provider.OpenAi;
using Ai.Tlbx.VoiceAssistant.Provider.OpenAi.Models;
class Program
{
static async Task Main(string[] args)
{
// Set up dependency injection
var services = new ServiceCollection();
// Configure logging
services.AddLogging(builder =>
{
builder.SetMinimumLevel(LogLevel.Information)
.AddConsole();
});
// Configure voice assistant
services.AddVoiceAssistant()
.WithOpenAi(apiKey: Environment.GetEnvironmentVariable("OPENAI_API_KEY"))
.WithHardware<WindowsAudioDevice>();
var serviceProvider = services.BuildServiceProvider();
var voiceAssistant = serviceProvider.GetRequiredService<VoiceAssistant>();
var logger = serviceProvider.GetRequiredService<ILogger<Program>>();
// Set up event handlers
voiceAssistant.OnMessageAdded = (message) =>
{
Console.WriteLine($"[{message.Role}]: {message.Content}");
};
voiceAssistant.OnConnectionStatusChanged = (status) =>
{
logger.LogInformation($"Status: {status}");
};
// Get available microphones
var microphones = await voiceAssistant.GetAvailableMicrophonesAsync();
Console.WriteLine("Available microphones:");
for (int i = 0; i < microphones.Count; i++)
{
Console.WriteLine($"{i}: {microphones[i].Name} {(microphones[i].IsDefault ? "(Default)" : "")}");
}
// Configure settings
var settings = new OpenAiVoiceSettings
{
Voice = AssistantVoice.Alloy,
Instructions = "You are a helpful AI assistant. Be friendly and conversational.",
TalkingSpeed = 1.0,
Model = OpenAiRealtimeModel.Gpt4oRealtimePreview20250603
};
// Start the assistant
Console.WriteLine("Starting voice assistant... Press any key to stop.");
await voiceAssistant.StartAsync(settings);
// Wait for user to stop
Console.ReadKey();
// Stop the assistant
await voiceAssistant.StopAsync();
Console.WriteLine("Voice assistant stopped.");
}
}
Windows Forms Application
using System;
using System.Drawing; // Size, Point
using System.Windows.Forms;
using Microsoft.Extensions.DependencyInjection;
using Ai.Tlbx.VoiceAssistant;
using Ai.Tlbx.VoiceAssistant.Hardware.Windows; // WindowsAudioDevice
using Ai.Tlbx.VoiceAssistant.Models; // ChatMessage, AudioDeviceInfo
using Ai.Tlbx.VoiceAssistant.Provider.OpenAi;
using Ai.Tlbx.VoiceAssistant.Provider.OpenAi.Models;
public partial class MainForm : Form
{
private readonly VoiceAssistant _voiceAssistant;
private readonly IServiceProvider _serviceProvider;
private Button _talkButton;
private ListBox _chatHistory;
private ComboBox _microphoneCombo;
public MainForm()
{
InitializeComponent();
// Set up DI
var services = new ServiceCollection();
services.AddVoiceAssistant()
.WithOpenAi(apiKey: Environment.GetEnvironmentVariable("OPENAI_API_KEY"))
.WithHardware<WindowsAudioDevice>();
_serviceProvider = services.BuildServiceProvider();
_voiceAssistant = _serviceProvider.GetRequiredService<VoiceAssistant>();
// Set up event handlers
_voiceAssistant.OnMessageAdded = OnMessageAdded;
_voiceAssistant.OnConnectionStatusChanged = OnStatusChanged;
// Load microphones
LoadMicrophones();
}
private void InitializeComponent()
{
// Set up UI controls
_talkButton = new Button
{
Text = "Talk",
Size = new Size(100, 50),
Location = new Point(10, 10)
};
_talkButton.Click += TalkButton_Click;
_microphoneCombo = new ComboBox
{
Location = new Point(120, 20),
Size = new Size(200, 25),
DropDownStyle = ComboBoxStyle.DropDownList
};
_chatHistory = new ListBox
{
Location = new Point(10, 70),
Size = new Size(400, 300)
};
Controls.AddRange(new Control[] { _talkButton, _microphoneCombo, _chatHistory });
Text = "Voice Assistant";
Size = new Size(450, 450);
}
private async void LoadMicrophones()
{
var mics = await _voiceAssistant.GetAvailableMicrophonesAsync();
_microphoneCombo.Items.Clear();
foreach (var mic in mics)
{
_microphoneCombo.Items.Add(new MicrophoneItem
{
Info = mic,
Display = $"{mic.Name} {(mic.IsDefault ? "(Default)" : "")}"
});
}
// Select default
for (int i = 0; i < _microphoneCombo.Items.Count; i++)
{
if (((MicrophoneItem)_microphoneCombo.Items[i]).Info.IsDefault)
{
_microphoneCombo.SelectedIndex = i;
break;
}
}
}
private async void TalkButton_Click(object sender, EventArgs e)
{
if (_voiceAssistant.IsRecording)
{
await _voiceAssistant.StopAsync();
_talkButton.Text = "Talk";
}
else
{
var settings = new OpenAiVoiceSettings
{
Voice = AssistantVoice.Alloy,
Instructions = "You are a helpful assistant.",
TalkingSpeed = 1.0
};
await _voiceAssistant.StartAsync(settings);
_talkButton.Text = "Stop";
}
}
private void OnMessageAdded(ChatMessage message)
{
Invoke(new Action(() =>
{
_chatHistory.Items.Add($"[{message.Role}]: {message.Content}");
_chatHistory.SelectedIndex = _chatHistory.Items.Count - 1;
}));
}
private void OnStatusChanged(string status)
{
Invoke(new Action(() =>
{
Text = $"Voice Assistant - {status}";
}));
}
private class MicrophoneItem
{
public AudioDeviceInfo Info { get; set; }
public string Display { get; set; }
public override string ToString() => Display;
}
}
Linux Implementation
Complete Linux Console Application
using System;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
using Ai.Tlbx.VoiceAssistant;
using Ai.Tlbx.VoiceAssistant.Hardware.Linux; // LinuxAudioDevice
using Ai.Tlbx.VoiceAssistant.Models; // ChatMessage, AudioDeviceInfo
using Ai.Tlbx.VoiceAssistant.Provider.OpenAi;
using Ai.Tlbx.VoiceAssistant.Provider.OpenAi.Models;
class Program
{
static async Task Main(string[] args)
{
// Ensure ALSA is available
if (!CheckAlsaAvailable())
{
Console.WriteLine("ALSA is not available. Install with: sudo apt-get install libasound2-dev");
return;
}
// Set up dependency injection
var services = new ServiceCollection();
// Configure logging
services.AddLogging(builder =>
{
builder.SetMinimumLevel(LogLevel.Information)
.AddConsole();
});
// Configure voice assistant
services.AddVoiceAssistant()
.WithOpenAi(apiKey: Environment.GetEnvironmentVariable("OPENAI_API_KEY"))
.WithHardware<LinuxAudioDevice>();
var serviceProvider = services.BuildServiceProvider();
var voiceAssistant = serviceProvider.GetRequiredService<VoiceAssistant>();
// Set up event handlers
voiceAssistant.OnMessageAdded = (message) =>
{
Console.WriteLine($"\033[1m[{message.Role}]\033[0m: {message.Content}");
};
voiceAssistant.OnConnectionStatusChanged = (status) =>
{
Console.WriteLine($"\033[33mStatus: {status}\033[0m");
};
// List available devices
var microphones = await voiceAssistant.GetAvailableMicrophonesAsync();
Console.WriteLine("\033[36mAvailable microphones:\033[0m");
foreach (var mic in microphones)
{
Console.WriteLine($" - {mic.Name} {(mic.IsDefault ? "\033[32m(Default)\033[0m" : "")}");
}
// Configure settings
var settings = new OpenAiVoiceSettings
{
Voice = AssistantVoice.Alloy,
Instructions = "You are a helpful Linux terminal assistant.",
TalkingSpeed = 1.0
};
// Start the assistant
Console.WriteLine("\n\033[32mStarting voice assistant... Press 'q' to quit.\033[0m");
await voiceAssistant.StartAsync(settings);
// Wait for quit command
while (Console.ReadKey(true).KeyChar != 'q')
{
// Keep running
}
// Stop the assistant
await voiceAssistant.StopAsync();
Console.WriteLine("\n\033[31mVoice assistant stopped.\033[0m");
}
static bool CheckAlsaAvailable()
{
try
{
// Try to load ALSA library
return System.IO.File.Exists("/usr/lib/x86_64-linux-gnu/libasound.so.2") ||
System.IO.File.Exists("/usr/lib/libasound.so.2");
}
catch
{
return false;
}
}
}
Linux Avalonia UI Application
using System;
using System.Collections.Generic;
using System.Linq;
using Avalonia;
using Avalonia.Controls;
using Avalonia.Interactivity;
using Avalonia.Threading;
using Microsoft.Extensions.DependencyInjection;
using Ai.Tlbx.VoiceAssistant;
using Ai.Tlbx.VoiceAssistant.Hardware.Linux; // LinuxAudioDevice
using Ai.Tlbx.VoiceAssistant.Models; // ChatMessage, AudioDeviceInfo
using Ai.Tlbx.VoiceAssistant.Provider.OpenAi;
using Ai.Tlbx.VoiceAssistant.Provider.OpenAi.Models;
public class MainWindow : Window
{
private readonly VoiceAssistant _voiceAssistant;
private Button _talkButton;
private ListBox _chatHistory;
private ComboBox _microphoneCombo;
private TextBlock _statusText;
public MainWindow()
{
// Set up DI
var services = new ServiceCollection();
services.AddVoiceAssistant()
.WithOpenAi(apiKey: Environment.GetEnvironmentVariable("OPENAI_API_KEY"))
.WithHardware<LinuxAudioDevice>();
var serviceProvider = services.BuildServiceProvider();
_voiceAssistant = serviceProvider.GetRequiredService<VoiceAssistant>();
// Set up event handlers
_voiceAssistant.OnMessageAdded = OnMessageAdded;
_voiceAssistant.OnConnectionStatusChanged = OnStatusChanged;
InitializeComponent();
LoadMicrophones();
}
private void InitializeComponent()
{
Title = "Voice Assistant - Linux";
Width = 600;
Height = 500;
var panel = new StackPanel { Margin = new Thickness(10) };
// Controls row
var controlsPanel = new StackPanel
{
Orientation = Avalonia.Layout.Orientation.Horizontal,
Spacing = 10,
Margin = new Thickness(0, 0, 0, 10)
};
_talkButton = new Button
{
Content = "Talk",
Width = 100,
Height = 40
};
_talkButton.Click += TalkButton_Click;
_microphoneCombo = new ComboBox
{
Width = 300,
Height = 40
};
controlsPanel.Children.Add(_talkButton);
controlsPanel.Children.Add(_microphoneCombo);
// Status
_statusText = new TextBlock
{
Text = "Ready",
Margin = new Thickness(0, 0, 0, 10)
};
// Chat history
_chatHistory = new ListBox
{
Height = 350
};
panel.Children.Add(controlsPanel);
panel.Children.Add(_statusText);
panel.Children.Add(_chatHistory);
Content = panel;
}
private async void LoadMicrophones()
{
var mics = await _voiceAssistant.GetAvailableMicrophonesAsync();
var items = new List<MicrophoneItem>();
foreach (var mic in mics)
{
items.Add(new MicrophoneItem
{
Info = mic,
Display = $"{mic.Name} {(mic.IsDefault ? "(Default)" : "")}"
});
}
_microphoneCombo.ItemsSource = items; // Avalonia 11: Items is read-only; bind via ItemsSource
// Select default
var defaultMic = items.FirstOrDefault(m => m.Info.IsDefault);
if (defaultMic != null)
{
_microphoneCombo.SelectedItem = defaultMic;
}
}
private async void TalkButton_Click(object sender, RoutedEventArgs e)
{
if (_voiceAssistant.IsRecording)
{
await _voiceAssistant.StopAsync();
_talkButton.Content = "Talk";
}
else
{
var settings = new OpenAiVoiceSettings
{
Voice = AssistantVoice.Alloy,
Instructions = "You are a helpful Linux assistant.",
TalkingSpeed = 1.0
};
await _voiceAssistant.StartAsync(settings);
_talkButton.Content = "Stop";
}
}
private void OnMessageAdded(ChatMessage message)
{
Dispatcher.UIThread.Post(() =>
{
_chatHistory.Items.Add($"[{message.Role}]: {message.Content}");
_chatHistory.ScrollIntoView(_chatHistory.Items[_chatHistory.Items.Count - 1]);
});
}
private void OnStatusChanged(string status)
{
Dispatcher.UIThread.Post(() =>
{
_statusText.Text = $"Status: {status}";
});
}
private class MicrophoneItem
{
public AudioDeviceInfo Info { get; set; }
public string Display { get; set; }
public override string ToString() => Display;
}
}
Web/Blazor Implementation
Blazor Server Application
Program.cs
using Ai.Tlbx.VoiceAssistant;
using Ai.Tlbx.VoiceAssistant.Hardware.Web; // WebAudioAccess
using Ai.Tlbx.VoiceAssistant.Provider.OpenAi;
var builder = WebApplication.CreateBuilder(args);
// Add services
builder.Services.AddRazorPages();
builder.Services.AddServerSideBlazor();
// Configure Voice Assistant
builder.Services.AddVoiceAssistant()
.WithOpenAi(apiKey: builder.Configuration["OpenAI:ApiKey"])
.WithHardware<WebAudioAccess>();
var app = builder.Build();
// Configure pipeline
if (!app.Environment.IsDevelopment())
{
app.UseExceptionHandler("/Error");
app.UseHsts();
}
app.UseHttpsRedirection();
app.UseStaticFiles();
app.UseRouting();
app.MapBlazorHub();
app.MapFallbackToPage("/_Host");
app.Run();
Pages/VoiceChat.razor
@page "/voice-chat"
@using Ai.Tlbx.VoiceAssistant
@using Ai.Tlbx.VoiceAssistant.Provider.OpenAi
@using Ai.Tlbx.VoiceAssistant.Provider.OpenAi.Models
@using Ai.Tlbx.VoiceAssistant.WebUi.Components
@using Ai.Tlbx.VoiceAssistant.Models
@inject VoiceAssistant voiceAssistant
@implements IDisposable
<PageTitle>Voice Assistant</PageTitle>
<div class="container mt-4">
<div class="row">
<div class="col-md-4">
<div class="card">
<div class="card-header">
<h5>Controls</h5>
</div>
<div class="card-body">
<div class="mb-3">
<AiTalkControl OnStartTalking="StartSession"
OnStopTalking="StopSession"
IsTalking="@voiceAssistant.IsRecording"
Loading="@voiceAssistant.IsConnecting" />
</div>
<div class="mb-3">
<label class="form-label">Voice</label>
<VoiceSelect SelectedVoice="@selectedVoice"
SelectedVoiceChanged="OnVoiceChanged"
Disabled="@(voiceAssistant.IsConnecting || voiceAssistant.IsRecording)" />
</div>
<div class="mb-3">
<VoiceSpeedSlider SelectedSpeed="@selectedSpeed"
SelectedSpeedChanged="OnSpeedChanged"
Disabled="@(voiceAssistant.IsConnecting || voiceAssistant.IsRecording)" />
</div>
<div class="mb-3">
<label class="form-label">Microphone</label>
<MicrophoneSelect AvailableMicrophones="@availableMicrophones"
@bind-SelectedMicrophoneId="@selectedMicrophoneId"
MicPermissionGranted="@micPermissionGranted"
OnRequestPermission="RequestMicrophonePermission"
Disabled="@(voiceAssistant.IsConnecting || voiceAssistant.IsRecording)" />
</div>
<div class="mb-3">
<StatusWidget ConnectionStatus="@voiceAssistant.ConnectionStatus"
Error="@voiceAssistant.LastErrorMessage"
IsMicrophoneTesting="@voiceAssistant.IsMicrophoneTesting" />
</div>
<button class="btn btn-secondary w-100"
@onclick="ClearChat"
disabled="@voiceAssistant.IsConnecting">
Clear Chat
</button>
</div>
</div>
</div>
<div class="col-md-8">
<div class="card">
<div class="card-header">
<h5>Conversation</h5>
</div>
<div class="card-body" style="height: 500px; overflow-y: auto;">
<ChatWidget />
</div>
</div>
</div>
</div>
</div>
@code {
private string selectedVoice = "alloy";
private double selectedSpeed = 1.0;
private string selectedMicrophoneId = string.Empty;
private bool micPermissionGranted = false;
private List<MicrophoneSelect.MicrophoneInfo> availableMicrophones = new();
protected override Task OnInitializedAsync()
{
voiceAssistant.OnConnectionStatusChanged = OnConnectionStatusChanged;
voiceAssistant.OnMessageAdded = OnMessageAdded;
voiceAssistant.OnMicrophoneDevicesChanged = OnMicrophoneDevicesChanged;
return Task.CompletedTask; // nothing awaited here, so return synchronously
}
protected override async Task OnAfterRenderAsync(bool firstRender)
{
if (firstRender)
{
await CheckMicrophonePermission();
}
}
private async Task CheckMicrophonePermission()
{
try
{
var mics = await voiceAssistant.GetAvailableMicrophonesAsync();
micPermissionGranted = mics.Count > 0 &&
mics.Any(m => !string.IsNullOrEmpty(m.Name) && !m.Name.StartsWith("Microphone "));
if (mics.Count > 0)
{
availableMicrophones = mics.Select(m => new MicrophoneSelect.MicrophoneInfo
{
Id = m.Id,
Name = m.Name,
IsDefault = m.IsDefault
}).ToList();
var defaultMic = availableMicrophones.FirstOrDefault(m => m.IsDefault);
if (defaultMic != null)
{
selectedMicrophoneId = defaultMic.Id;
}
}
}
catch (Exception ex)
{
Console.WriteLine($"Error checking microphone permission: {ex.Message}");
}
await InvokeAsync(StateHasChanged);
}
private async Task RequestMicrophonePermission()
{
try
{
var devices = await voiceAssistant.GetAvailableMicrophonesAsync();
availableMicrophones = devices.Select(m => new MicrophoneSelect.MicrophoneInfo
{
Id = m.Id,
Name = m.Name,
IsDefault = m.IsDefault
}).ToList();
micPermissionGranted = devices.Count > 0 &&
devices.Any(m => !string.IsNullOrEmpty(m.Name) && !m.Name.StartsWith("Microphone "));
if (micPermissionGranted && availableMicrophones.Count > 0)
{
var defaultMic = availableMicrophones.FirstOrDefault(m => m.IsDefault);
selectedMicrophoneId = defaultMic?.Id ?? availableMicrophones[0].Id;
}
}
catch (Exception ex)
{
Console.WriteLine($"Error requesting microphone permission: {ex.Message}");
}
await InvokeAsync(StateHasChanged);
}
private async Task StartSession()
{
try
{
var settings = new OpenAiVoiceSettings
{
Instructions = "You are a helpful AI assistant. Be friendly and conversational.",
Voice = Enum.Parse<AssistantVoice>(selectedVoice, true),
TalkingSpeed = selectedSpeed,
Model = OpenAiRealtimeModel.Gpt4oRealtimePreview20250603
};
await voiceAssistant.StartAsync(settings);
}
catch (Exception ex)
{
Console.WriteLine($"Error starting session: {ex.Message}");
}
}
private async Task StopSession()
{
try
{
await voiceAssistant.StopAsync();
}
catch (Exception ex)
{
Console.WriteLine($"Error stopping session: {ex.Message}");
}
}
private void ClearChat()
{
voiceAssistant.ClearChatHistory();
InvokeAsync(StateHasChanged);
}
private async Task OnVoiceChanged(string newVoice)
{
selectedVoice = newVoice;
await Task.CompletedTask;
}
private async Task OnSpeedChanged(double newSpeed)
{
selectedSpeed = newSpeed;
await Task.CompletedTask;
}
private void OnConnectionStatusChanged(string status)
{
InvokeAsync(StateHasChanged);
}
private void OnMessageAdded(ChatMessage message)
{
InvokeAsync(StateHasChanged);
}
private void OnMicrophoneDevicesChanged(List<AudioDeviceInfo> devices)
{
InvokeAsync(() =>
{
availableMicrophones = devices.Select(m => new MicrophoneSelect.MicrophoneInfo
{
Id = m.Id,
Name = m.Name,
IsDefault = m.IsDefault
}).ToList();
StateHasChanged();
});
}
public void Dispose()
{
voiceAssistant.OnConnectionStatusChanged = null;
voiceAssistant.OnMessageAdded = null;
voiceAssistant.OnMicrophoneDevicesChanged = null;
}
}
Important Web-Specific Files
The Web implementation requires these JavaScript files to be placed in wwwroot/js/:
- webAudioAccess.js - Main audio handling module
- audio-processor.js - Audio worklet processor for real-time audio
These are included in the Ai.Tlbx.VoiceAssistant.Hardware.Web package and are copied to your project automatically.
Architecture Overview
Component Hierarchy
VoiceAssistant (Orchestrator)
├── IVoiceProvider (AI Provider Interface)
│ └── OpenAiVoiceProvider
├── IAudioHardwareAccess (Platform Interface)
│ ├── WindowsAudioAccess
│ ├── LinuxAudioAccess
│ └── WebAudioAccess
└── ChatHistoryManager (Conversation State)
Key Interfaces
IVoiceProvider
public interface IVoiceProvider : IAsyncDisposable
{
bool IsConnected { get; }
Task ConnectAsync(IVoiceSettings settings);
Task DisconnectAsync();
Task ProcessAudioAsync(string base64Audio);
Task SendInterruptAsync();
Task InjectConversationHistoryAsync(IEnumerable<ChatMessage> messages);
// Callbacks
Action<ChatMessage>? OnMessageReceived { get; set; }
Action<string>? OnAudioReceived { get; set; }
Action<string>? OnStatusChanged { get; set; }
Action<string>? OnError { get; set; }
Action? OnInterruptDetected { get; set; }
}
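To implement a custom provider, only the interface above must be satisfied. Below is a minimal sketch of a hypothetical EchoVoiceProvider that loops caller audio straight back, useful for wiring tests without a network connection; the namespaces assumed for ChatMessage and IVoiceSettings match those used elsewhere in this README.
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Ai.Tlbx.VoiceAssistant;
using Ai.Tlbx.VoiceAssistant.Models;
public sealed class EchoVoiceProvider : IVoiceProvider
{
    public bool IsConnected { get; private set; }
    public Action<ChatMessage>? OnMessageReceived { get; set; }
    public Action<string>? OnAudioReceived { get; set; }
    public Action<string>? OnStatusChanged { get; set; }
    public Action<string>? OnError { get; set; }
    public Action? OnInterruptDetected { get; set; }
    public Task ConnectAsync(IVoiceSettings settings)
    {
        IsConnected = true;
        OnStatusChanged?.Invoke("Connected (echo)");
        return Task.CompletedTask;
    }
    public Task DisconnectAsync()
    {
        IsConnected = false;
        OnStatusChanged?.Invoke("Disconnected");
        return Task.CompletedTask;
    }
    // Instead of calling an AI service, echo the caller's audio back verbatim.
    public Task ProcessAudioAsync(string base64Audio)
    {
        OnAudioReceived?.Invoke(base64Audio);
        return Task.CompletedTask;
    }
    public Task SendInterruptAsync()
    {
        OnInterruptDetected?.Invoke();
        return Task.CompletedTask;
    }
    public Task InjectConversationHistoryAsync(IEnumerable<ChatMessage> messages)
        => Task.CompletedTask; // a real provider would replay history to the service
    public ValueTask DisposeAsync() => ValueTask.CompletedTask;
}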
IAudioHardwareAccess
public interface IAudioHardwareAccess : IAsyncDisposable
{
Task InitAudio();
Task<bool> StartRecordingAudio(MicrophoneAudioReceivedEventHandler audioDataReceivedHandler);
Task<bool> StopRecordingAudio();
bool PlayAudio(string base64EncodedPcm16Audio, int sampleRate = 24000);
Task ClearAudioQueue();
Task<List<AudioDeviceInfo>> GetAvailableMicrophones();
Task<bool> SetMicrophoneDevice(string deviceId);
void SetLogAction(Action<LogLevel, string> logAction);
}
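Similarly, a do-nothing hardware implementation can stand in for real audio devices in unit tests. This is a sketch only: the exact shapes of MicrophoneAudioReceivedEventHandler and AudioDeviceInfo, and that LogLevel is Microsoft.Extensions.Logging.LogLevel, are assumptions.
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Ai.Tlbx.VoiceAssistant.Models;
using Microsoft.Extensions.Logging;
public sealed class NullAudioHardware : IAudioHardwareAccess
{
    private Action<LogLevel, string>? _log;
    public Task InitAudio() => Task.CompletedTask;
    // Accepts the handler but never raises it: no audio is ever captured.
    public Task<bool> StartRecordingAudio(MicrophoneAudioReceivedEventHandler handler)
        => Task.FromResult(true);
    public Task<bool> StopRecordingAudio() => Task.FromResult(true);
    // Report success while silently discarding the playback audio.
    public bool PlayAudio(string base64EncodedPcm16Audio, int sampleRate = 24000) => true;
    public Task ClearAudioQueue() => Task.CompletedTask;
    public Task<List<AudioDeviceInfo>> GetAvailableMicrophones()
        => Task.FromResult(new List<AudioDeviceInfo>());
    public Task<bool> SetMicrophoneDevice(string deviceId) => Task.FromResult(true);
    public void SetLogAction(Action<LogLevel, string> logAction) => _log = logAction;
    public ValueTask DisposeAsync() => ValueTask.CompletedTask;
}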
API Reference
VoiceAssistant Class
Main orchestrator for voice interactions.
Properties
- bool IsRecording - Indicates whether audio is currently being recorded
- bool IsConnecting - Indicates whether a connection to the AI provider is in progress
- bool IsMicrophoneTesting - Indicates whether a microphone test is running
- string ConnectionStatus - Current connection status message
- string LastErrorMessage - Last error message, if any
Methods
- Task StartAsync(IVoiceSettings settings) - Start a voice assistant session
- Task StopAsync() - Stop the current session
- Task InterruptAsync() - Interrupt the current AI response
- Task<List<AudioDeviceInfo>> GetAvailableMicrophonesAsync() - Get available microphones
- Task TestMicrophoneAsync() - Test the microphone with beep playback
- void ClearChatHistory() - Clear the conversation history
Events
- Action<ChatMessage> OnMessageAdded - Fired when a message is added to the chat
- Action<string> OnConnectionStatusChanged - Fired when the connection status changes
- Action<List<AudioDeviceInfo>> OnMicrophoneDevicesChanged - Fired when the microphone list changes
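Taken together, a typical call sequence over these members might look like the following sketch (assumes a configured VoiceAssistant and a settings instance as shown in Quick Start):
voiceAssistant.OnConnectionStatusChanged = status => Console.WriteLine($"Status: {status}");
voiceAssistant.OnMessageAdded = message => Console.WriteLine($"[{message.Role}]: {message.Content}");

await voiceAssistant.TestMicrophoneAsync();   // beep playback to verify the device
await voiceAssistant.StartAsync(settings);    // begin a session

if (voiceAssistant.IsRecording)
{
    await voiceAssistant.InterruptAsync();    // cut off the current AI response
}

await voiceAssistant.StopAsync();
voiceAssistant.ClearChatHistory();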
OpenAiVoiceSettings
Configuration for OpenAI provider.
public class OpenAiVoiceSettings : IVoiceSettings
{
public string Instructions { get; set; }
public AssistantVoice Voice { get; set; } = AssistantVoice.Alloy;
public double TalkingSpeed { get; set; } = 1.0;
public List<IVoiceTool> Tools { get; set; } = new();
public OpenAiRealtimeModel Model { get; set; } = OpenAiRealtimeModel.Gpt4oRealtimePreview20250603;
public double? Temperature { get; set; }
public int? MaxTokens { get; set; }
}
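For example, a fully populated configuration might look like this (values are illustrative; Temperature and MaxTokens fall back to provider defaults when left null):
var settings = new OpenAiVoiceSettings
{
    Instructions = "You are a concise technical assistant.",
    Voice = AssistantVoice.Nova,
    TalkingSpeed = 1.2,   // 20% faster than normal speech
    Model = OpenAiRealtimeModel.Gpt4oMiniRealtimePreview20241217, // lower-latency mini model
    Temperature = 0.7,
    MaxTokens = 4096
};
settings.Tools.Add(new WeatherTool()); // custom tool; see Advanced Topics below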
Voice Options
public enum AssistantVoice
{
Alloy,
Echo,
Fable,
Onyx,
Nova,
Shimmer
}
Model Options
public enum OpenAiRealtimeModel
{
Gpt4oRealtimePreview20250603 = 0, // Latest (June 2025) - Recommended
Gpt4oRealtimePreview20241217 = 1, // December 2024 - Stable
Gpt4oRealtimePreview20241001 = 2, // October 2024 - Legacy
Gpt4oMiniRealtimePreview20241217 = 3, // Mini model - Lower latency
}
Advanced Topics
Custom Tools
Implement custom tools for AI capabilities:
public class WeatherTool : IVoiceTool
{
public string Name => "get_weather";
public string Description => "Get current weather for a location";
public ToolParameterSchema GetParameterSchema()
{
return new ToolParameterSchema
{
Type = "object",
Properties = new Dictionary<string, ToolProperty>
{
["location"] = new ToolProperty
{
Type = "string",
Description = "City name"
}
},
Required = new[] { "location" }
};
}
public async Task<string> ExecuteAsync(string arguments)
{
var args = JsonSerializer.Deserialize<Dictionary<string, string>>(arguments);
var location = args["location"];
// Implement weather API call
return $"The weather in {location} is sunny and 72°F";
}
}
// Use in settings
settings.Tools.Add(new WeatherTool());
Logging Configuration
The toolkit uses a centralized logging architecture. Configure logging at the orchestrator level:
services.AddVoiceAssistant()
.WithOpenAi(apiKey: "...")
.WithHardware<WindowsAudioDevice>()
.WithLogging((level, message) =>
{
// Custom logging logic
Console.WriteLine($"[{level}] {message}");
});
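If you already use Microsoft.Extensions.Logging, the callback can forward into an ILogger. A sketch, assuming the callback's level parameter is the Microsoft.Extensions.Logging LogLevel:
using Microsoft.Extensions.Logging;

using var loggerFactory = LoggerFactory.Create(b => b.AddConsole());
var log = loggerFactory.CreateLogger("VoiceAssistant");

services.AddVoiceAssistant()
    .WithOpenAi(apiKey: "...")
    .WithHardware<WindowsAudioDevice>()
    .WithLogging((level, message) => log.Log(level, "{Message}", message));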
Conversation History
The assistant maintains conversation history across sessions:
// History is automatically injected when starting new sessions
// To manually manage history:
var messages = voiceAssistant.ChatHistory.GetMessages();
// Clear history
voiceAssistant.ClearChatHistory();
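Persisting history between runs is then a matter of serializing those messages. A minimal sketch using System.Text.Json (assumes ChatMessage round-trips through JSON; the file name is illustrative):
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Save at shutdown
var messages = voiceAssistant.ChatHistory.GetMessages();
await File.WriteAllTextAsync("history.json", JsonSerializer.Serialize(messages));

// Reload at startup; how restored messages are fed back in depends on
// the ChatHistory API surface of your toolkit version.
var restored = JsonSerializer.Deserialize<List<ChatMessage>>(
    await File.ReadAllTextAsync("history.json"));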
Error Handling
voiceAssistant.OnConnectionStatusChanged = (status) =>
{
if (status.Contains("error", StringComparison.OrdinalIgnoreCase))
{
// Handle error
var error = voiceAssistant.LastErrorMessage;
Console.WriteLine($"Error occurred: {error}");
}
};
// Also handle provider errors
try
{
await voiceAssistant.StartAsync(settings);
}
catch (InvalidOperationException ex)
{
// Handle initialization errors
Console.WriteLine($"Failed to start: {ex.Message}");
}
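For transient startup failures (network hiccups, device contention), a simple retry with exponential backoff can wrap StartAsync; the attempt count and delays here are illustrative:
for (int attempt = 1; attempt <= 3; attempt++)
{
    try
    {
        await voiceAssistant.StartAsync(settings);
        break; // success
    }
    catch (InvalidOperationException ex) when (attempt < 3)
    {
        Console.WriteLine($"Start failed ({ex.Message}); retrying...");
        await Task.Delay(TimeSpan.FromSeconds(Math.Pow(2, attempt))); // 2s, then 4s
    }
}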
Troubleshooting
Common Issues
Windows
Issue: "No microphones found"
- Solution: Check Windows privacy settings for microphone access
- Run as Administrator if needed
- Ensure audio drivers are installed
Issue: "NAudio initialization failed"
- Solution: Install Windows audio drivers
- Check Windows Audio service is running
Linux
Issue: "ALSA lib not found"
- Solution: Install ALSA libraries
sudo apt-get update
sudo apt-get install libasound2-dev
Issue: "Permission denied accessing audio device"
- Solution: Add user to audio group
sudo usermod -a -G audio $USER
# Log out and back in for the change to take effect
Web/Blazor
Issue: "Microphone permission denied"
- Solution:
- Ensure HTTPS or localhost
- Browser must support getUserMedia API
- User must grant permission when prompted
Issue: "Audio worklet failed to load"
- Solution:
- Ensure JavaScript files are in wwwroot/js/
- Check browser console for errors
- Verify HTTPS is enabled
Issue: "Bluetooth headset switches to hands-free mode"
- Solution: The toolkit now prevents this by:
- Using higher sample rates (48kHz)
- Deferring AudioContext creation
- Proper constraint configuration
Debug Logging
Enable detailed logging for troubleshooting:
services.AddLogging(builder =>
{
builder.SetMinimumLevel(LogLevel.Debug)
.AddConsole()
.AddDebug();
});
// For web hardware, enable JavaScript diagnostics
var hardware = serviceProvider.GetRequiredService<IAudioHardwareAccess>();
if (hardware is WebAudioAccess webAccess)
{
await webAccess.SetDiagnosticLevel(DiagnosticLevel.Verbose);
}
Migration from v3.x
Version 4.0 introduces breaking changes from v3.x:
1. Package Name Changes
- Old: Ai.Tlbx.RealTimeAudio.*
- New: Ai.Tlbx.VoiceAssistant.*
2. Architecture Changes
- OpenAiRealTimeApiAccess replaced by VoiceAssistant + OpenAiVoiceProvider
- Event-based callbacks replaced with Action properties
- New dependency injection pattern
3. Code Migration Example
v3.x Code:
var openAiAccess = new OpenAiRealTimeApiAccess(apiKey);
openAiAccess.MessageReceived += OnMessageReceived;
await openAiAccess.ConnectAsync();
v4.0 Code:
services.AddVoiceAssistant()
.WithOpenAi(apiKey)
.WithHardware<WindowsAudioDevice>();
var voiceAssistant = serviceProvider.GetRequiredService<VoiceAssistant>();
voiceAssistant.OnMessageAdded = OnMessageAdded;
await voiceAssistant.StartAsync(settings);
GitHub Repository
https://github.com/AiTlbx/Ai.Tlbx.VoiceAssistant
Contributing
Contributions are welcome! Please read our contributing guidelines and submit pull requests to our repository.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Support
For issues, questions, or contributions, please visit our GitHub repository.
Acknowledgments
- Built on top of NAudio for Windows audio
- Uses ALSA for Linux audio support
- Leverages Web Audio API for browser-based audio
Framework Compatibility
Product | Compatible and computed target frameworks
---|---
.NET | net9.0 is compatible. net9.0-android, net9.0-browser, net9.0-ios, net9.0-maccatalyst, net9.0-macos, net9.0-tvos, net9.0-windows, net10.0, and the corresponding net10.0 platform variants were computed.
Dependencies (net9.0)
- Microsoft.Extensions.DependencyInjection.Abstractions (>= 9.0.0)
- Microsoft.Extensions.Options (>= 9.0.0)
NuGet packages (5)
The top 5 NuGet packages that depend on Ai.Tlbx.VoiceAssistant:
- Ai.Tlbx.VoiceAssistant.Hardware.Web - Web-specific audio provider for the Voice Assistant toolkit
- Ai.Tlbx.VoiceAssistant.WebUi - Blazor UI components for the Voice Assistant toolkit
- Ai.Tlbx.VoiceAssistant.Hardware.Linux - Linux-specific hardware integration for voice assistant audio processing
- Ai.Tlbx.VoiceAssistant.Hardware.Windows - Windows-specific hardware integration for voice assistant audio processing
- Ai.Tlbx.VoiceAssistant.Provider.OpenAi - OpenAI provider implementation for the Voice Assistant toolkit; enables real-time conversation with OpenAI's GPT models over WebSocket connections