This project integrates the AudioLDM model into a web interface powered by ChatGPT and Gradio. Users enter a text prompt and receive generated audio (speech, music, or sound effects) in return. It is designed to streamline text-to-audio generation in a conversational, browser-based environment.
Key Features
ChatGPT-style conversational interface for text-to-audio generation
Generates both voice and musical/sound effect audio
Built on an AudioLDM backend
Supports launching via a local Gradio web server (see the sketch after this list)
Includes example scripts and a notebook for experimentation
Easy setup: install the environment, run the launch script, and open the interface in a browser
Stores generated outputs (e.g., WAV or MP3 files) in local folders
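As a rough illustration of how the Gradio launch and local output storage fit together, the sketch below wires an AudioLDM pipeline into a minimal text-to-audio web app. The pipeline class and checkpoint (the diffusers `AudioLDMPipeline` with `cvssp/audioldm-s-full-v2`) and the `outputs/` folder are illustrative assumptions, not necessarily the project's exact code.

```python
# Minimal sketch of a Gradio text-to-audio app around AudioLDM.
# Checkpoint name, output folder, and parameters are assumptions for illustration.
from pathlib import Path
import time

import gradio as gr
import scipy.io.wavfile
from diffusers import AudioLDMPipeline

OUTPUT_DIR = Path("outputs")        # local folder for generated audio files
OUTPUT_DIR.mkdir(exist_ok=True)

# Load an AudioLDM checkpoint once at startup (assumed checkpoint name).
pipe = AudioLDMPipeline.from_pretrained("cvssp/audioldm-s-full-v2")


def generate_audio(prompt: str):
    """Generate a short clip from a text prompt and save it as a WAV file."""
    result = pipe(prompt, num_inference_steps=25, audio_length_in_s=5.0)
    audio = result.audios[0]                     # mono waveform at 16 kHz
    path = OUTPUT_DIR / f"audio_{int(time.time())}.wav"
    scipy.io.wavfile.write(str(path), rate=16000, data=audio)
    return 16000, audio                          # (sample_rate, waveform) for gr.Audio


demo = gr.Interface(
    fn=generate_audio,
    inputs=gr.Textbox(label="Prompt", placeholder="e.g. rain falling on a tin roof"),
    outputs=gr.Audio(label="Generated audio"),
    title="Text-to-audio demo",
)

if __name__ == "__main__":
    demo.launch()  # serves a local web UI, typically at http://127.0.0.1:7860
```

Running the script starts a local server; opening the printed URL in a browser gives the prompt box and audio player, while each generation is also written to the local output folder.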