MMAudio Web UI

Category: Other
License: MIT
Model Type: Speech Synthesis
A simple web user interface built on top of the MMAudio library. It provides both video-to-audio and text-to-audio generation through an interactive browser interface, and is designed for easy local deployment and experimentation in Python environments.

Key Features

  • Video‑to‑audio conversion using MMAudio
  • Text‑to‑audio generation using MMAudio
  • Minimalistic browser-based UI built with Gradio or Flask
  • Automatic download of required model weights (MMAudio, CLIP, Motion encoder, BigVGAN vocoder)
  • Quick startup through a single script
  • Runs in a standard Python 3 environment, including virtual environments
  • Lightweight dependency list via requirements file
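
The automatic weight download listed above can be pictured as a "fetch only what is missing" check at startup. The following is a minimal sketch of that idea, not the project's actual code: the file names, the example.com URLs, and the helpers `fetch_url` and `ensure_weights` are all hypothetical and chosen only for illustration.

```python
from pathlib import Path
from typing import Callable, Dict, List
import urllib.request

# Hypothetical file names and URLs, for illustration only.
# The real project may use different names, hosts, and layout.
WEIGHTS: Dict[str, str] = {
    "mmaudio.pth": "https://example.com/weights/mmaudio.pth",
    "vocoder.pth": "https://example.com/weights/vocoder.pth",
}


def fetch_url(url: str, dest: Path) -> None:
    """Download url to dest through a temporary '.part' file.

    Renaming only after the download finishes keeps an interrupted
    download from being mistaken for a complete weight file.
    """
    tmp = dest.with_name(dest.name + ".part")
    urllib.request.urlretrieve(url, str(tmp))
    tmp.replace(dest)


def ensure_weights(
    weights_dir: Path,
    weights: Dict[str, str] = WEIGHTS,
    fetch: Callable[[str, Path], None] = fetch_url,
) -> List[Path]:
    """Download every weight file that is not yet present.

    Returns the list of files fetched during this call, so a second
    call on the same directory returns an empty list.
    """
    weights_dir.mkdir(parents=True, exist_ok=True)
    downloaded: List[Path] = []
    for name, url in weights.items():
        dest = weights_dir / name
        if dest.exists():
            continue  # already downloaded on an earlier run
        fetch(url, dest)
        downloaded.append(dest)
    return downloaded
```

A startup script could call `ensure_weights(Path("weights"))` before launching the UI. Passing the download function as a parameter is a design choice made here so the check can be exercised without network access.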
