Text‑To‑Audio ChatGPT

Category: Other
License: MIT
Model Type: Speech Synthesis
This project integrates the AudioLDM model into a web interface powered by ChatGPT and Gradio. Users enter a text prompt and receive generated audio (speech or sound effects) in return. It is designed to streamline text-to-audio generation in a conversational, browser-based environment.

Key Features

  • ChatGPT-style conversational interface for text-to-audio generation
  • Generates speech as well as music and sound-effect audio
  • Built on AudioLDM backend
  • Supports launching via local web server (Gradio-based)
  • Includes example scripts and notebook for experimentation
  • Easy setup: install the environment, run the script, and open the interface in a browser
  • Saves generated audio (for example, MP3 or WAV files) to local folders
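
As a rough illustration of the last feature, the sketch below writes generated audio to a local folder as a WAV file using only the Python standard library. It is not the project's own code: `placeholder_generate`, `save_wav`, the `outputs` folder name, and the 16 kHz sample rate are all assumptions made for this example, and the placeholder simply produces a sine tone where the real application would call the AudioLDM backend.

```python
import math
import struct
import wave
from pathlib import Path


def placeholder_generate(prompt: str, seconds: float = 1.0,
                         sample_rate: int = 16000) -> list[int]:
    """Hypothetical stand-in for the model call.

    Returns a sine tone as signed 16-bit samples. The pitch depends only on
    the prompt length, so different prompts give different output; no real
    synthesis happens here.
    """
    freq = 220.0 + 10.0 * (len(prompt) % 40)
    n_samples = int(seconds * sample_rate)
    return [
        int(32767 * 0.3 * math.sin(2 * math.pi * freq * i / sample_rate))
        for i in range(n_samples)
    ]


def save_wav(samples: list[int], out_dir: str = "outputs",
             name: str = "sample.wav", sample_rate: int = 16000) -> Path:
    """Write mono 16-bit PCM samples to out_dir/name, creating the folder."""
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / name
    with wave.open(str(path), "wb") as wf:
        wf.setnchannels(1)        # mono
        wf.setsampwidth(2)        # 2 bytes per sample = 16-bit
        wf.setframerate(sample_rate)
        wf.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return path


if __name__ == "__main__":
    audio = placeholder_generate("a dog barking in the rain")
    print("saved to", save_wav(audio))
```

In a web interface, a function like `save_wav` would typically be called from the handler that receives the user's prompt, with the returned file path passed back so the audio can be played in the browser.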