From 762815a5b78e06de31d7f5274977c85ab20f22eb Mon Sep 17 00:00:00 2001
From: Labmem-Zhouyx
Date: Fri, 5 Dec 2025 23:57:43 +0800
Subject: [PATCH] Update: user guides

---
 docs/usage_guide.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/docs/usage_guide.md b/docs/usage_guide.md
index 0862d59..e902eee 100644
--- a/docs/usage_guide.md
+++ b/docs/usage_guide.md
@@ -23,8 +23,10 @@ This is the secret sauce that gives your audio its unique sound.
 
 ### 1. Cooking with a Prompt Speech (Following a Famous Recipe)
 - A prompt speech provides the desired acoustic characteristics for VoxCPM. The speaker's timbre, speaking style, and even the background sounds and ambiance will be replicated.
-- **For a Clean, Studio-Quality Voice:**
-  - ✅ Enable "Prompt Speech Enhancement". This acts like a noise filter, removing background hiss and rumble to give you a pure, clean voice clone.
+- **For a Clean, Denoising Voice:**
+  - ✅ Enable "Prompt Speech Enhancement". This acts like a noise filter, removing background hiss and rumble to give you a pure, clean voice clone. However, this will limit the audio sampling rate to 16kHz, restricting the cloning quality ceiling.
+- **For High-Quality Audio Cloning (Up to 44.1kHz):**
+  - ❌ Disable "Prompt Speech Enhancement" to preserve all original audio information, including background atmosphere, and support audio cloning up to 44.1kHz sampling rate.
 
 ### 2. Cooking au Naturel (Letting the Model Improvise)
 - If no reference is provided, VoxCPM becomes a creative chef! It will infer a fitting speaking style based on the text itself, thanks to the text-smartness of its foundation model, MiniCPM-4.
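
The "quality ceiling" that the patched text attributes to the 16kHz limit can be illustrated with the Nyquist limit: audio sampled at a given rate can only represent frequencies up to half that rate. The sketch below is illustrative arithmetic only. The function `nyquist_hz` and the constant names are hypothetical and are not part of VoxCPM; only the two sampling rates come from the guide text.

```python
def nyquist_hz(sample_rate_hz: float) -> float:
    """Return the highest frequency representable at the given sampling rate."""
    return sample_rate_hz / 2.0


# Sampling rates quoted in the guide text (names are illustrative assumptions).
ENHANCED_RATE_HZ = 16_000  # "Prompt Speech Enhancement" enabled
FULL_RATE_HZ = 44_100      # "Prompt Speech Enhancement" disabled

for label, rate in (
    ("enhancement on", ENHANCED_RATE_HZ),
    ("enhancement off", FULL_RATE_HZ),
):
    print(f"{label}: {rate} Hz sampling -> content up to {nyquist_hz(rate):.0f} Hz")
```

Under this reading, enabling enhancement caps the prompt at roughly 8 kHz of audio bandwidth, while disabling it allows up to about 22 kHz, which is why the guide frames the choice as a trade-off between a denoised prompt and full-band cloning.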