SSMD Editing
SSMD (Speech Synthesis Markdown) is an intermediate text format used by ttsforge between EPUB extraction and TTS audio generation. It allows fine-grained control over pronunciation, pacing, and emphasis in your audiobooks.
How SSMD Works
During conversion, ttsforge automatically generates .ssmd files for each chapter:
.{book_title}_chapters/
├── {book_title}_state.json
├── chapter_001_intro.ssmd # Editable text with speech markup
├── chapter_001_intro.wav
├── chapter_002_chapter1.ssmd
├── chapter_002_chapter1.wav
└── ...
When you resume a conversion, ttsforge detects edited SSMD files by comparing MD5 hashes and automatically regenerates the corresponding audio.
Basic Workflow
Start conversion:
ttsforge convert book.epub
Pause conversion (Ctrl+C when needed)
Edit SSMD files to fix pronunciation or pacing:
vim .book_chapters/chapter_001_intro.ssmd
Resume conversion - automatically detects edits:
ttsforge convert book.epub
SSMD Syntax
SSMD uses a simple markdown-like syntax for speech control.
Structural Breaks
Control pauses between text segments:
...p # Paragraph break (0.5-1.0s pause)
...s # Sentence break (0.1-0.3s pause)
...c # Clause break (shorter pause)
Example:
This is the first paragraph. ...s
It has multiple sentences. ...p
This is a second paragraph. ...s
Emphasis
Add vocal emphasis to words or phrases:
*text* # Moderate emphasis
**text** # Strong emphasis
Example:
Harry was a *highly unusual* boy. ...s
He **hated** the summer holidays. ...s
Custom Phonemes
Override pronunciation using IPA phonemes:
[word]{ph="phoneme"}
Examples:
[Hermione]{ph="hɝmˈIni"} Granger was Harry's best friend. ...s
The [API]{ph="ˌeɪpiˈaɪ"} supports [JSON]{ph="dʒˈeɪsɑn"}. ...s
[Kubernetes]{ph="kubɚnˈɛtɪs"} is a container orchestrator. ...s
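When reviewing an edited chapter, it can help to list every phoneme override it contains. The following Python sketch does that with a regular expression for the `[word]{ph="phoneme"}` form shown above. The helper names are hypothetical and are not part of ttsforge.

```python
import re

# Matches the [word]{ph="phoneme"} form described in this section.
PHONEME_RE = re.compile(r'\[([^\]]+)\]\{ph="([^"]*)"\}')


def find_phoneme_overrides(text: str) -> list[tuple[str, str]]:
    """Return (word, phoneme) pairs in order of appearance."""
    return PHONEME_RE.findall(text)


def strip_phoneme_markup(text: str) -> str:
    """Replace each annotation with its plain word, e.g. for proofreading."""
    return PHONEME_RE.sub(r"\1", text)


line = 'The [API]{ph="ˌeɪpiˈaɪ"} supports [JSON]{ph="dʒˈeɪsɑn"}. ...s'
print(find_phoneme_overrides(line))
print(strip_phoneme_markup(line))
```

Running the first helper over all chapters is one way to spot a word that was given two different phoneme strings.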
Language Switching (Planned)
Mark text as a different language. This syntax is a placeholder for a planned feature and currently has no effect:
[Bonjour]{lang="fr"} # French text
[Hola]{lang="es"} # Spanish text
Complete Example
Here’s a complete SSMD file example:
Chapter One ...p
[Harry]{ph="hæɹi"} Potter was a *highly unusual* boy in many ways. ...s
For one thing, he **hated** the summer holidays more than any other
time of year. ...s For another, he really wanted to do his homework,
but was forced to do it in secret, in the dead of the night. ...p
And he also happened to be a wizard. ...p
The [Dursleys]{ph="dɝzliz"} had everything they wanted, but they
also had a secret. ...s And their greatest fear was that somebody
would discover it. ...p
Automatic Features
SSMD files are automatically enhanced with:
Phoneme Dictionary Injection
If you use --phoneme-dict, all phoneme substitutions are automatically
injected into the SSMD:
ttsforge convert book.epub --phoneme-dict custom_phonemes.json
The generated SSMD will include:
[Hermione]{ph="hɝmˈIni"} loved reading books. ...s
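The injection step can be pictured with the sketch below. It assumes the dictionary is a flat mapping from word to phoneme string; the actual layout of `custom_phonemes.json` is not shown here, and `inject_phonemes` is a hypothetical name rather than the ttsforge implementation.

```python
import re


def inject_phonemes(text: str, phoneme_dict: dict[str, str]) -> str:
    """Wrap each dictionary word in [word]{ph="..."} markup."""
    if not phoneme_dict:
        return text
    # Longest entries first so longer names win over shorter ones they contain
    words = sorted(phoneme_dict, key=len, reverse=True)
    alternatives = "|".join(re.escape(word) for word in words)
    # The lookbehind/lookahead skip words that already sit at the edge of [ ... ]
    pattern = re.compile(r"(?<!\[)\b(" + alternatives + r")\b(?!\])")

    def annotate(match: re.Match) -> str:
        word = match.group(1)
        return "[" + word + ']{ph="' + phoneme_dict[word] + '"}'

    return pattern.sub(annotate, text)


phonemes = {"Hermione": "hɝmˈIni"}  # assumed flat word -> phoneme mapping
print(inject_phonemes("Hermione loved reading books. ...s", phonemes))
```

The bracket checks are a simple guard against annotating the same word twice; a real implementation may handle more cases, such as case-insensitive matching.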
HTML Emphasis Detection
Emphasis from the original EPUB HTML is automatically converted:
<em>text</em> → *text*
<strong>text</strong> → **text**
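As a rough illustration of this mapping, the sketch below uses Python's standard `html.parser` to turn `<em>` and `<strong>` into SSMD emphasis markers and drop every other tag. It is a simplified stand-in, not the converter ttsforge uses, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class EmphasisConverter(HTMLParser):
    """Collect text, replacing <em>/<strong> tags with * and ** markers."""

    MARKERS = {"em": "*", "strong": "**"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        # Tags other than em/strong contribute nothing to the output
        self.parts.append(self.MARKERS.get(tag, ""))

    def handle_endtag(self, tag):
        self.parts.append(self.MARKERS.get(tag, ""))

    def handle_data(self, data):
        self.parts.append(data)


def html_to_ssmd(html: str) -> str:
    converter = EmphasisConverter()
    converter.feed(html)
    converter.close()
    return "".join(converter.parts)


print(html_to_ssmd("<p>He <strong>hated</strong> the <em>summer</em> holidays.</p>"))
```

Using a parser rather than string replacement means attributes such as `class` on the tags do not get in the way.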
Structural Break Preservation
ttsforge preserves paragraph structure but does not insert explicit
...p or ...s markers. Sentence detection is handled internally by
pykokoro at synthesis time. Use manual break markers only when you need
precise control over pauses.
Use Cases
When to Edit SSMD
Pronunciation issues: Character names, technical terms, foreign words
Pacing problems: Adjust paragraph and sentence breaks for better flow
Emphasis corrections: Add or remove emphasis on specific words
Consistency: Ensure consistent pronunciation across chapters
Combining with Phoneme Dictionary
For best results, use both features together:
Create a phoneme dictionary for common names/terms
Let ttsforge auto-inject the dictionary entries into SSMD
Edit SSMD files for chapter-specific tweaks
# 1. Extract and review names
ttsforge extract-names book.epub
vim custom_phonemes.json
# 2. Start conversion (phonemes auto-injected into SSMD)
ttsforge convert book.epub --phoneme-dict custom_phonemes.json
# 3. Edit specific SSMD files as needed
vim .book_chapters/chapter_005.ssmd
# 4. Resume (regenerates edited chapters)
ttsforge convert book.epub
Tips and Best Practices
Start with phoneme dictionary: Create a global dictionary first, then use SSMD for chapter-specific overrides
Test edits incrementally: Edit one chapter, let it regenerate, listen to verify before editing more
Use emphasis sparingly: Too much emphasis can sound unnatural
Keep backups: ttsforge preserves manual edits in existing SSMD files, but a missing file is regenerated, so back up any file you have edited
Consistent phonemes: Use the same IPA notation for a given word in every chapter
Technical Details
Hash-Based Change Detection
ttsforge tracks SSMD file changes using MD5 hashes (12 characters) stored in the state file. When you resume:
Current SSMD file is hashed
Compared with saved hash in state
If different, audio is regenerated
New hash is saved
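The four steps above can be sketched in Python as follows. Two details are assumptions made for illustration: that the 12 characters are the first 12 hex digits of the MD5 digest, and that saved hashes are kept in a mapping keyed by file name. The real layout of the state file is not described here, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path

HASH_LENGTH = 12  # assumption: the stored hash is a shortened hex digest


def ssmd_hash(path: Path) -> str:
    """Return a shortened MD5 hex digest of the file's bytes."""
    return hashlib.md5(path.read_bytes()).hexdigest()[:HASH_LENGTH]


def needs_regeneration(path: Path, saved_hashes: dict[str, str]) -> bool:
    """Hash the file, compare with the saved hash, and record the new hash.

    Returns True when the file changed and its audio should be regenerated.
    """
    current = ssmd_hash(path)
    changed = saved_hashes.get(path.name) != current
    saved_hashes[path.name] = current
    return changed
```

Because the comparison uses file contents rather than modification times, opening and saving a file without changing it does not trigger regeneration.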
File Format
SSMD files are plain text UTF-8 files with the .ssmd extension.
They can be edited with any text editor.
Error Handling
If SSMD generation fails, ttsforge falls back to plain text conversion and logs a warning. The conversion continues without SSMD features.
Validation
SSMD is not automatically validated during conversion. For manual checks,
use the validate_ssmd helper from ttsforge.ssmd_generator to get
warnings about unbalanced markers before you synthesize.
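The signature of validate_ssmd is not documented here, so the sketch below does not call it. Instead it shows, under stated assumptions, what a check for unbalanced markers might look like: a hypothetical `check_balanced_markers` function that works line by line and therefore assumes that no annotation or emphasis span crosses a line break.

```python
def check_balanced_markers(text: str) -> list[str]:
    """Return human-readable warnings for lines with unbalanced SSMD markers."""
    warnings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Every [ needs a ] and every { needs a } on the same line
        for open_ch, close_ch in (("[", "]"), ("{", "}")):
            if line.count(open_ch) != line.count(close_ch):
                warnings.append(f"line {lineno}: unbalanced {open_ch}{close_ch}")
        # *text* and **text** always contribute an even number of asterisks
        if line.count("*") % 2 != 0:
            warnings.append(f"line {lineno}: odd number of '*' emphasis markers")
    return warnings


sample = 'He **hated* the summer holidays. ...s\n[Harry]{ph="hæɹi" Potter'
for warning in check_balanced_markers(sample):
    print(warning)
```

Counting characters is a coarse heuristic: it catches a missing closing marker but not markers that are balanced yet misplaced.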
Limitations
Language switching is not yet implemented (planned feature)
Phoneme syntax must use valid IPA characters
Very long lines may be truncated in some editors
Hash detection only works with resumable conversions
See Also
Quick Start Guide - Getting started with ttsforge
CLI Reference - Complete command reference
Configuration - Configuration options
For more SSMD examples and a quick reference, see SSMD_QUICKSTART.md
in the repository root.