Abogen – Audiobook Generator for EPUB, PDF, and Text

Category: Deep Learning
License: MIT
Model Type: Speech Synthesis

Abogen converts e-books (EPUB, PDF) and plain text into natural-sounding audio with synchronized subtitles. Designed for speed and ease of use, it relies on the Kokoro-82M text-to-speech (TTS) model to generate both the audio and the matching subtitle output.

Key Features

  • Supports multiple input formats: EPUB, PDF, and plain text (with built-in editor).
  • Subtitle synchronization: Generates time-aligned captions at sentence, word, or fixed-word intervals.
  • Voice and speed customization: Offers voice selection (e.g. male/female, language variants) and adjustable speech speed (0.1×–2.0×).
  • Audio export options: Outputs WAV, FLAC, MP3, OPUS, and M4B (with chapters).
  • Chapter handling: Detects chapters in e-books or uses chapter markers in text files; can export chapters individually.
  • Voice mixing: Blend multiple voice models with adjustable weights to create custom voice profiles.
  • Cross-platform installers and Docker support: Includes installer scripts for Windows, Linux, and macOS, plus Docker deployment.
  • Demo video creation: Integrates FFmpeg for easy export of subtitle-overlaid video.
  • Multilingual voices: Supports English (US/UK), Spanish, French, Hindi, Italian, Japanese, Portuguese, and Chinese.
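
The voice mixing feature listed above blends several voices using adjustable weights. The sketch below illustrates the general idea only, under the assumption that each voice can be represented as a fixed-length numeric vector and that a blend is a normalized weighted average. It is not Abogen's actual code: the function names `normalize_weights` and `blend_voices`, the voice names, and the vector values are all hypothetical.

```python
from typing import Dict, List


def normalize_weights(weights: Dict[str, float]) -> Dict[str, float]:
    """Scale non-negative weights so that they sum to 1."""
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return {name: w / total for name, w in weights.items()}


def blend_voices(
    voices: Dict[str, List[float]], weights: Dict[str, float]
) -> List[float]:
    """Return the weighted average of the named voice vectors.

    Assumption: every voice is a plain list of floats of equal length.
    """
    norm = normalize_weights(weights)
    missing = [name for name in norm if name not in voices]
    if missing:
        raise KeyError(f"unknown voices: {missing}")
    sizes = {len(voices[name]) for name in norm}
    if len(sizes) != 1:
        raise ValueError("all voice vectors must have the same length")
    mixed = [0.0] * sizes.pop()
    for name, weight in norm.items():
        for i, value in enumerate(voices[name]):
            mixed[i] += weight * value
    return mixed


# Hypothetical two-dimensional voice vectors, for illustration only.
voices = {"voice_a": [1.0, 0.0], "voice_b": [0.0, 1.0]}
profile = blend_voices(voices, {"voice_a": 3, "voice_b": 1})
print(profile)
```

Normalizing the weights first means users can enter any positive numbers (for example 3 and 1) without having to make them sum to one, and the resulting profile stays on the same scale as the input voices.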