This project was forked from NaturalVoiceSAPIAdapter and is now developed at AACTools/VoiceGarden-SAPI.
A SAPI 5 text-to-speech (TTS) engine that connects 20+ cloud TTS engines and offline SherpaOnnx models to any Windows application that supports SAPI voices — including Grid 3, The Grid, Clicker, and any software using System.Speech.
| Category | Engines | Cloud? |
|---|---|---|
| Offline neural | SherpaOnnx (Kokoro, Piper, MMS, VITS, Matcha) | No — fully local |
| Microsoft | Azure Cognitive Services, Edge browser voices | Yes |
| Cloud TTS | OpenAI, Google Cloud, AWS Polly, ElevenLabs, Cartesia, Deepgram | Yes |
| More cloud | Watson, PlayHT, Wit.ai, Gemini, Hume AI, xAI Grok, Fish Audio, Mistral, Murf, Unreal Speech, Resemble, Uplift AI, Models Lab | Yes |
Any SAPI 5-compatible program can use these voices. Offline SherpaOnnx voices require no internet connection.
- Download the MSI from the Releases section.
- Run setup.exe (or the .msi directly).
- After installation, VoiceGarden.UI.exe launches automatically; this is the configuration app.
- Go to the SherpaOnnx tab to download offline models, or the Engine Config tab to enter cloud API keys.
- Click Install to SAPI to promote voices to the Windows registry.
- Restart your SAPI application (e.g., Grid 3) — the new voices appear in the voice list.
VoiceGarden.UI.exe install # registers 64-bit DLL
VoiceGarden.UI.exe install32 # registers 32-bit DLL (for 32-bit apps)
VoiceGarden.UI.exe doubles as a command-line tool:
VoiceGarden.UI.exe status # show registration status
VoiceGarden.UI.exe voices # list all SAPI voices
VoiceGarden.UI.exe models list # list downloaded SherpaOnnx models
VoiceGarden.UI.exe models download <id> # download a model
VoiceGarden.UI.exe models promote-all # promote all models to SAPI (needs admin)
VoiceGarden.UI.exe promote --engine google --voice en-US-Wavenet-D --key API_KEY
VoiceGarden.UI.exe validate --engine azure --voice en-US-JennyNeural --key KEY --region uksouth
- OS: Windows 7 SP1 or later (32-bit and 64-bit supported)
- Runtime: .NET 8.0 desktop runtime (for VoiceGarden.UI)
- For SherpaOnnx extraction: 7-Zip (for .tar.bz2 model archives)
- For cloud voices: Internet access + valid API key for the respective service
┌──────────────────────────────────────────────────────────┐
│ SAPI Application (Grid 3, System.Speech, etc.) │
│ │ SAPI 5 COM │
│ ┌─────────────────▼──────────────────────────────────┐ │
│ │ VoiceGardenSAPIAdapter.dll (C++ COM DLL) │ │
│ │ ┌──────────────┐ ┌────────────┐ ┌────────────┐ │ │
│ │ │ SherpaOnnx │ │ Azure REST │ │GenericHttp │ │ │
│ │ │ (offline) │ │ + SDK shim │ │(OpenAI etc)│ │ │
│ │ └──────────────┘ └────────────┘ └────────────┘ │ │
│ └────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ VoiceGarden.UI.exe (Avalonia .NET 8 app) │ │
│ │ • Download/promote SherpaOnnx models │ │
│ │ • Configure cloud engine credentials │ │
│ │ • Register/unregister 32-bit + 64-bit DLLs │ │
│ │ • Preview voices │ │
│ └─────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────┘
| Component | Description |
|---|---|
| VoiceGardenSAPIAdapter/ | C++ SAPI COM DLL (the actual TTS engine). Handles SherpaOnnx (offline), Azure REST/SDK, and generic HTTP TTS |
| VoiceGarden.UI/ | Avalonia UI app (main configuration tool). Replaces the old C++ Installer |
| VoiceGardenSAPIAdapter.Net/ | .NET SAPI adapter (alternative runtime, currently inactive) |
| SherpaOnnxConfig/ | CLI tool for SherpaOnnx model management |
| EngineConfig/ | CLI tool for cloud engine voice management |
| Setup/ + SetupLauncher/ | WiX MSI package + setup.exe bootstrapper |
SherpaOnnx provides fully offline neural TTS — no internet, no API keys, no cloud dependency.
- Kokoro: high-quality English models with multiple voices (voices.bin)
- Piper: fast, lightweight VITS models (many languages)
- MMS: Meta's Massively Multilingual Speech (1000+ languages)
- VITS: standard VITS models
- Matcha: matcha-TTS models with vocoder
Models are stored in %LOCALAPPDATA%\VoiceGardenSAPIAdapter\models\. Download via:
- UI: SherpaOnnx tab → select models → Download Selected
- CLI: VoiceGarden.UI.exe models download kokoro-en-en-19
The system auto-detects model type from directory contents:
- voices.bin present → Kokoro (type 2)
- vocoder.onnx present → Matcha (type 1)
- Otherwise → VITS (type 0)
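The detection rule above can be sketched in a few lines. This is a minimal illustration in Python, not the adapter's actual C++ code: the function names `detect_model_type` and `list_models` are hypothetical, and checking `voices.bin` before `vocoder.onnx` when both exist is an assumption based only on the order in which the rules are listed.

```python
import os
from pathlib import Path

# Type codes as listed in the README: 0 = VITS, 1 = Matcha, 2 = Kokoro.
VITS, MATCHA, KOKORO = 0, 1, 2


def detect_model_type(model_dir):
    """Guess the model type from the marker files in one model directory."""
    model_dir = Path(model_dir)
    # Assumption: voices.bin takes precedence if both marker files exist.
    if (model_dir / "voices.bin").is_file():
        return KOKORO
    if (model_dir / "vocoder.onnx").is_file():
        return MATCHA
    return VITS


def list_models(models_root):
    """Map each model sub-directory name to its detected type code."""
    root = Path(models_root)
    if not root.is_dir():
        return {}
    return {p.name: detect_model_type(p)
            for p in sorted(root.iterdir()) if p.is_dir()}


if __name__ == "__main__":
    # Models folder as documented: %LOCALAPPDATA%\VoiceGardenSAPIAdapter\models
    local_app_data = os.environ.get("LOCALAPPDATA")
    if local_app_data:
        root = Path(local_app_data) / "VoiceGardenSAPIAdapter" / "models"
        for name, type_code in list_models(root).items():
            print(f"{name}: type {type_code}")
```

Run on a machine with downloaded models, the script prints one line per model folder. A folder containing neither marker file is reported as VITS (type 0), matching the fallback rule.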
Grid 3 (System.Speech) only selects voices from HKLM registry tokens — in-memory enumerator voices won't work with SelectVoice. Use Install to SAPI to promote voices to HKLM.
See docs/troubleshooting-grid3-voice-activation.md for detailed Grid 3 troubleshooting.
- Get a key from Azure Portal → Speech service → Keys and Endpoint
- In VoiceGarden.UI → Engine Config → Azure: enter key and region (e.g., uksouth)
- Promote voices to SAPI
- Get an API key from Google Cloud Console → APIs & Services → Credentials
- In VoiceGarden.UI → Engine Config → Google: enter key
- Promote a specific voice (e.g., en-US-Wavenet-D)
Each cloud engine needs its API key set in the Engine Config tab. The C++ adapter handles REST calls via GenericHttpTts for OpenAI, ElevenLabs, Google, Cartesia, and Deepgram. Azure and Edge voices use the dedicated REST/WebSocket path.
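To show the kind of REST call a generic HTTP TTS path has to make, here is a minimal Python sketch against OpenAI's public speech endpoint. It is not the adapter's GenericHttpTts code, which is C++ and may build its requests differently. The helper names `build_speech_request` and `synthesize` are hypothetical, and the `tts-1` model and `alloy` voice are example values only.

```python
import json
import urllib.request

# OpenAI's public text-to-speech endpoint (assumed target for this sketch).
OPENAI_SPEECH_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(api_key, text, voice="alloy", model="tts-1"):
    """Build (but do not send) a POST request asking for synthesized speech."""
    body = json.dumps({"model": model, "input": text, "voice": voice})
    return urllib.request.Request(
        OPENAI_SPEECH_URL,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def synthesize(api_key, text, **kwargs):
    """Send the request and return the raw audio bytes from the response."""
    request = build_speech_request(api_key, text, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()
```

Calling `synthesize(key, "Hello")` with a valid key returns encoded audio bytes. A SAPI engine would then need to decode them to PCM before handing them to the host application.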
.\scripts\test-sherpa-e2e.ps1 # downloads models, promotes, speaks via System.Speech

.\scripts\cleanup-voices.ps1 # removes Sherpa/Cloud/eSpeak tokens
.\scripts\cleanup-voices.ps1 -DryRun # preview without deleting

Add-Type -AssemblyName "System.Speech"
$s = New-Object System.Speech.Synthesis.SpeechSynthesizer
$s.SelectVoice("kokoro-en-en-19") # or any voice name
$s.Speak("Hello, this is a test.")

- Visual Studio 2022 with C++ (MSVC v143)
- .NET 8.0 SDK
- WiX Toolset v3 (for MSI)
- 7-Zip (for SherpaOnnx deps extraction)
.\download-sherpa-deps.ps1 -Platforms x64,x86 # one-time: download SherpaOnnx native DLLs
.\scripts\build-release-local.ps1 -Configuration Release -Platforms x64,x86 -BuildSetup -SkipSherpaDeps -SkipSubmodules

Output: installer-output\VoiceGardenSAPIAdapter.msi
GitHub Actions workflow at .github/workflows/msbuild.yml builds all components on every push:
- C++ adapter DLL (x86, x64, ARM64)
- VoiceGarden.UI (Avalonia app)
- SherpaOnnxConfig + EngineConfig CLI tools
- .NET adapter
- MSI setup package
See the wiki page on configurable registry values for advanced settings including:
- NoEdgeVoices, NoAzureVoices, NoSherpaVoices: toggle voice categories
- AzureVoiceKey, AzureVoiceRegion: Azure credentials
- EdgeVoiceLanguages: filter Edge voices by language
- ErrorMode: control error handling behavior
- SherpaOnnx — offline neural TTS
- DotNetTtsWrapper — .NET TTS client for 20+ engines
- Avalonia UI — cross-platform .NET UI framework
- CommunityToolkit.Mvvm — MVVM framework
- Microsoft.CognitiveServices.Speech — Azure Speech SDK
- websocketpp — WebSocket client (Edge voices)
- OpenSSL — HTTPS for cloud TTS
- nlohmann/json — JSON parsing
- spdlog — logging
- YY-Thunks — Windows XP compatibility
- Detours — API hooking