VoxServe v0.0.1 Documentation

VoxServe Documentation

VoxServe is a streaming-centric serving system for Speech Language Models (SpeechLMs), supporting both text-to-speech (TTS) and speech-to-speech (STS) workloads.

Getting Started

  • Quickstart
    • Prerequisites
    • Install
    • Run the server
    • Send a request
    • Notes
  • Core Concepts
    • Overview
    • Motivation
    • Innovation

Usage Guides

  • Usage Guides
    • Text-to-Speech (TTS) Models
    • Speech-to-Speech (STS) Models

Reference

  • API Reference
    • Base URL
    • POST /generate
    • GET /health
  • Python API
    • Package overview
    • Key modules
  • CLI Reference
    • Usage
    • Arguments
  • Supported Models
    • Text-to-speech (TTS)
    • Speech-to-speech (STS)
    • Notes
  • Architecture
    • High-level components
    • Repository map
    • Execution flow (simplified)

Contributing

  • Development

By VoxServe Team

© Copyright 2025-2026, VoxServe Team.

Last updated on Apr 13, 2026.