Documentation

OuteAI API & SDK

Everything you need to generate studio-quality speech — from the browser with Studio, or programmatically with the API.

Introduction

OuteAI provides high-quality text-to-speech generation through two interfaces:

  • Studio: a browser-based interface for generating and managing audio. No code required. Great for one-off generations, content production, and voice experimentation.
  • API / SDK: a REST API with an official Python SDK for building applications, automating workflows, or integrating TTS into your product.

Both interfaces use the same underlying model, voice library, and credits balance. Generation is billed at 0.001 credits per second of generated audio.
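
As a rough illustration of the per-second rate, the sketch below estimates the cost of a generation from its audio length. The function name is hypothetical and not part of the SDK; only the 0.001 credits / second rate comes from this page.

```python
from decimal import Decimal

# Rate quoted in this documentation: 0.001 credits per second of generated audio.
CREDITS_PER_SECOND = Decimal("0.001")


def estimate_credits(audio_seconds: float) -> Decimal:
    """Return the credits charged for a given duration of generated audio."""
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    # Convert through str so float inputs such as 12.5 stay exact.
    return CREDITS_PER_SECOND * Decimal(str(audio_seconds))


# One hour of audio is 3,600 seconds.
one_hour_cost = estimate_credits(3600)
```

Decimal is used instead of float so that small per-second amounts add up without rounding drift.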

GPU Cold Starts

Your first request may be slow. OuteAI runs on GPU infrastructure that scales to zero when idle. If no generation has happened recently, the first request will wait for the GPU to spin up — this typically adds 10–40 seconds of latency before audio begins.

This applies to both Studio and the API. Subsequent requests within the same active window are fast.

In practice: if you are building a latency-sensitive application, consider sending a short warm-up request (e.g., a single word) when your application starts, then making real requests immediately after. You will still be billed normally for the warm-up audio.
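
A minimal sketch of such a warm-up step, assuming a client object that exposes generate_speech(text=..., voice_id=...) as the SDK client does in the Quick Start. The helper name warm_up is hypothetical.

```python
import time


def warm_up(client, voice_id: str, text: str = "Hi") -> float:
    """Send one short request and return how long it took, in seconds.

    `client` is any object exposing generate_speech(text=..., voice_id=...),
    such as the SDK's OuteAI client shown in the Quick Start.
    """
    started = time.monotonic()
    # This request is billed like any other generation.
    client.generate_speech(text=text, voice_id=voice_id)
    return time.monotonic() - started
```

Calling this once at application start, and logging the returned duration, gives a quick indication of whether the GPU was cold.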

Authentication

All API requests are authenticated with a Bearer token. You can create and manage API tokens from your Account page.

Authorization: Bearer oute_xxxxxxxxxxxxxxxxxxxxxxxx

API tokens are account-scoped. Keep them secret. You can revoke any token at any time from your Account page.
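
One way to keep the token out of source code is to read it from the environment and build the header at request time. This is a sketch only: the helper names and the OUTEAI_API_TOKEN variable name are assumptions, not something the SDK defines.

```python
import os


def auth_headers(token: str) -> dict:
    """Build the Authorization header for an OuteAI API request."""
    if not token:
        raise ValueError("an API token is required")
    return {"Authorization": f"Bearer {token}"}


def token_from_env(var_name: str = "OUTEAI_API_TOKEN") -> str:
    """Read the token from the environment; the variable name is hypothetical."""
    token = os.environ.get(var_name, "")
    if not token:
        raise RuntimeError(f"set {var_name} to your API token")
    return token
```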

Studio

Using Studio

Studio is a browser-based interface at /studio. Sign in to get started — no setup required.

1. Select a voice

Choose from built-in voices or any voice clone you have created. Voices are shown in the composer at the bottom of the page.

2. Type your text

Enter your text in the composer. Studio supports up to 5,000 characters per generation. Press Enter to generate, or Shift+Enter for a new line.

3. Play, download, or continue

Generated audio appears in your session feed with an inline waveform player. Sessions persist — come back any time to continue generating or download previous audio.

The first generation in a cold session may take noticeably longer due to GPU cold starts. See GPU Cold Starts above.
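
For scripts longer than the 5,000-character limit mentioned in step 2, the text has to be split before it is pasted into Studio. The sketch below is one possible approach that prefers sentence boundaries. The function name is hypothetical, and this page does not state whether the API enforces the same limit.

```python
import re

MAX_CHARS = 5000  # Studio's per-generation limit, as stated above


def split_text(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-wrap any single sentence that is longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be generated separately, in order.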

Voice Cloning

Create a custom voice from a reference audio sample in the Voice Clone section of Studio.

Audio requirements

Requirement    Value
Format         WAV or MP3
Duration       5–10 seconds
Max file size  10 MB
Speaker        Single speaker, clean audio, minimal background noise

Once cloned, your voice appears in Studio's voice selector and is available via the API using its UUID. Clones can be deleted at any time from the Voice Clone page.

Voice cloning is billed at 0.025 credits / voice. Clones are stored until you delete them.
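
Because each clone is billed, it can be worth checking a sample locally before uploading it. The following sketch checks the duration and file-size requirements for WAV files using only the Python standard library. It does not handle MP3, the helper name is hypothetical, and treating 10 MB as 10 × 1024 × 1024 bytes is an assumption.

```python
import os
import wave

MAX_BYTES = 10 * 1024 * 1024  # "10 MB", interpreted here as 10 MiB (an assumption)
MIN_SECONDS, MAX_SECONDS = 5.0, 10.0


def check_wav_sample(path: str) -> list[str]:
    """Return a list of problems found; an empty list means the file looks acceptable."""
    problems = []
    if os.path.getsize(path) > MAX_BYTES:
        problems.append("file is larger than 10 MB")
    with wave.open(path, "rb") as wav:
        # Duration = number of audio frames / frames per second.
        seconds = wav.getnframes() / wav.getframerate()
    if not MIN_SECONDS <= seconds <= MAX_SECONDS:
        problems.append(f"duration is {seconds:.1f} s, expected 5 to 10 s")
    return problems
```

The speaker and noise requirements cannot be verified this way and still need a listen.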

Python SDK

Installation

Install the outeai package from PyPI. Python 3.9+ is required.

# Synchronous client only
pip install outeai

# Synchronous + async client (installs httpx)
pip install "outeai[async]"

Quick Start

List available voices, generate speech, and save it to a file.

from outeai import OuteAI

# Initialise the client — use a context manager to auto-close
with OuteAI("oute_xxxxxxxxxxxxxxxxxxxxxxxx") as client:

    # List all voices available on your account
    voices = client.list_voices()
    for v in voices:
        print(v["voice_id"], v["speaker_name"])

    # Generate speech (non-streaming, returns full WAV)
    audio = client.generate_speech(
        text="Hello, world! This is OuteAI.",
        voice_id=voices[0]["voice_id"],
    )

    # Save to a WAV file
    audio.save("hello.wav")

Generate and save in one call

client.generate_speech_to_file(
    "output.wav",
    text="Saving directly to a file.",
    voice_id="your-voice-uuid",
)

SpeechResult object

generate_speech() returns a SpeechResult with the following attributes:

Attribute     Type   Description
wav_bytes     bytes  Raw WAV file bytes
content_type  str    Always audio/wav
sample_rate   int    24000 (Hz)
channels      int    1 (mono)
sample_width  int    2 (bytes per sample, 16-bit PCM)
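
Since wav_bytes holds a complete WAV file, it can be inspected with Python's standard wave module. The sketch below computes the playback length, which is also the quantity that billing is based on. The helper name is hypothetical and not part of the SDK.

```python
import io
import wave


def wav_duration_seconds(wav_bytes: bytes) -> float:
    """Return the playback length of a complete in-memory WAV file."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        # Duration = number of audio frames / frames per second.
        return wav.getnframes() / wav.getframerate()
```

With a SpeechResult named audio, this would be called as wav_duration_seconds(audio.wav_bytes).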

Streaming

Use stream_speech() to receive audio chunks as they are generated. This is useful for real-time playback or piping audio to another process.

with client.stream_speech(
    text="Streaming audio chunk by chunk.",
    voice_id="your-voice-uuid",
) as stream:
    # Iterate over raw WAV chunks
    for chunk in stream:
        my_audio_pipe.write(chunk)

Stream directly to a file

with client.stream_speech(
    text="Saving a stream.",
    voice_id="your-voice-uuid",
) as stream:
    stream.save("streamed.wav")

# Or use the one-liner shorthand
client.stream_speech_to_file(
    "streamed.wav",
    text="Saving a stream.",
    voice_id="your-voice-uuid",
)

Async Client

Install with async support and use AsyncOuteAI for non-blocking generation in async applications.

pip install "outeai[async]"

import asyncio
from outeai import AsyncOuteAI

async def main():
    async with AsyncOuteAI("oute_xxxxxxxxxxxxxxxxxxxxxxxx") as client:
        voices = await client.list_voices()

        audio = await client.generate_speech(
            text="Hello from async!",
            voice_id=voices[0]["voice_id"],
        )
        audio.save("async-output.wav")

asyncio.run(main())

Async streaming

# Run inside an async function, with `client` an open AsyncOuteAI instance
async with client.stream_speech(
    text="Async streaming example.",
    voice_id="your-voice-uuid",
) as stream:
    async for chunk in stream:
        my_pipe.write(chunk)

# Or save directly
await client.stream_speech_to_file(
    "streamed.wav",
    text="Async streaming to file.",
    voice_id="your-voice-uuid",
)
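
The async client makes it straightforward to generate several clips concurrently. The sketch below caps the number of in-flight requests with a semaphore, which may help stay under the per-minute rate limit. The helper name and the default of two concurrent requests are assumptions; the actual rate limit is not documented here.

```python
import asyncio


async def generate_many(client, texts, voice_id, max_concurrent=2):
    """Generate speech for several texts, running at most `max_concurrent` at once.

    `client` is assumed to be an open AsyncOuteAI instance (or any object with
    an awaitable generate_speech(text=..., voice_id=...) method).
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def one(text):
        async with semaphore:
            return await client.generate_speech(text=text, voice_id=voice_id)

    # gather() returns results in the same order as the input texts.
    return await asyncio.gather(*(one(t) for t in texts))
```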

Error Handling

The SDK raises two exception types:

from outeai import OuteAI, OuteAIError, OuteAIAPIError

try:
    audio = client.generate_speech(text="Hello", voice_id="...")
except OuteAIAPIError as e:
    # HTTP error returned by the API
    print(e.status_code)  # e.g. 402, 429, 400
    print(e.code)         # e.g. "INSUFFICIENT_CREDITS"
    print(e.message)      # human-readable description
except OuteAIError as e:
    # Network error or SDK-level validation failure
    print(e)

Exception       When raised
OuteAIAPIError  The API returned a non-2xx response. Has .status_code, .code, and .message.
OuteAIError     Base class. Raised for network failures, invalid arguments, or missing optional dependencies.

HTTP API Reference

Base URL: https://outeai.com. All endpoints require Authorization: Bearer <token>.

List Voices

GET /api/v1/voices Returns all voices available on your account

Returns built-in voices plus any voice clones you have created. Each voice includes a voice_id (UUID) used when generating speech.

Example response
{
    "success": true,
    "data": {
        "voices": [
            {
                "voice_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
                "speaker_name": "My Cloned Voice",
                "is_clone": true
            }
        ]
    }
}
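
As an illustration of working with this response shape, the sketch below extracts the IDs of voice clones from a raw response body. It relies only on the fields shown in the example response, and the function name is hypothetical.

```python
import json


def clone_voice_ids(response_text: str) -> list[str]:
    """Return the voice_id of every voice clone in a List Voices response body."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("response does not indicate success")
    # Built-in voices are skipped; only entries flagged is_clone are kept.
    return [v["voice_id"] for v in payload["data"]["voices"] if v.get("is_clone")]
```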

Generate Speech

POST /api/v1/tts_non_stream Returns a complete WAV file

Generates speech and returns the complete WAV file once synthesis is done. Suitable for short texts or when you need the full file before playback.

Request body (JSON)
Field            Required  Type    Notes
text             yes       string  Text to synthesise. Also accepts input (OpenAI-compatible).
voice_id         yes       string  UUID of the voice from GET /api/v1/voices. Also accepts voice.
model_name       yes       string  Model identifier. Currently: outetts_1_pro. Also accepts model.
response_format  no        string  Must be wav. Default: wav.

Example request
curl -X POST https://outeai.com/api/v1/tts_non_stream \
  -H "Authorization: Bearer oute_xxxx" \
  -H "Content-Type: application/json" \
  -H "Accept: audio/wav" \
  -d '{
    "text": "Hello, world!",
    "voice_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
    "model_name": "outetts_1_pro"
  }' \
  --output hello.wav

On success, the response body is binary WAV data with Content-Type: audio/wav.
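
For callers who prefer not to use the SDK, the same request can be made from Python's standard library. This is a sketch under the assumptions of this page (base URL, endpoint path, and field names as documented above); the helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://outeai.com"


def build_tts_request(token: str, text: str, voice_id: str,
                      model_name: str = "outetts_1_pro") -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/v1/tts_non_stream."""
    body = json.dumps({"text": text, "voice_id": voice_id, "model_name": model_name})
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/tts_non_stream",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "audio/wav",
        },
    )


def synthesise_to_file(request: urllib.request.Request, path: str) -> None:
    """Send the request and write the returned WAV bytes to `path`."""
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
```

Splitting request construction from sending keeps the first function easy to test without network access.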

Streaming Speech

POST /api/v1/tts_stream Returns audio as a chunked WAV stream

Same request body as tts_non_stream. The response is a chunked HTTP transfer where audio data arrives incrementally. The stream begins with a WAV header followed by PCM chunks as they are generated.

Example request
curl -X POST https://outeai.com/api/v1/tts_stream \
  -H "Authorization: Bearer oute_xxxx" \
  -H "Content-Type: application/json" \
  -H "Accept: audio/wav" \
  -d '{
    "text": "Streaming audio in real time.",
    "voice_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
    "model_name": "outetts_1_pro"
  }' \
  --output streamed.wav

The first chunk of audio may be delayed on cold GPU starts. See GPU Cold Starts.
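
On the client side, a chunked response can be consumed incrementally by reading fixed-size blocks until the body is exhausted. The sketch below shows that loop for any file-like object with a read(n) method, which includes the response returned by urllib.request.urlopen. The function name and the 4096-byte chunk size are arbitrary choices, and an in-memory buffer stands in for a real response.

```python
import io


def copy_stream(source, destination, chunk_size: int = 4096) -> int:
    """Copy a file-like response to `destination` chunk by chunk; return bytes copied."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty bytes means the sender has finished
            break
        destination.write(chunk)
        total += len(chunk)
    return total


# Stand-in for an HTTP response: any object with read(n) works.
demo_source = io.BytesIO(b"RIFF" + b"\x00" * 10000)
copied = copy_stream(demo_source, io.BytesIO())
```

In a real application, destination could be a file, a pipe to an audio player, or a socket.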

Voice Clones

GET /api/v1/voice-clones List your voice clones

Returns all voice clones on your account, plus quota information.

POST /api/v1/voice-clones Create a new voice clone

Accepts either multipart/form-data (file upload) or JSON (base64-encoded audio).

Multipart form (recommended)
Field         Required  Type    Notes
speaker_name  yes       string  2–80 characters. Display name for this voice.
audio         yes       file    WAV or MP3, 5–10 seconds, max 10 MB.

curl -X POST https://outeai.com/api/v1/voice-clones \
  -H "Authorization: Bearer oute_xxxx" \
  -F "speaker_name=My Voice" \
  -F "audio=@/path/to/sample.wav"

JSON (base64)
curl -X POST https://outeai.com/api/v1/voice-clones \
  -H "Authorization: Bearer oute_xxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "speaker_name": "My Voice",
    "audio_base64": "<base64-encoded-audio>"
  }'
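
The JSON body for the base64 variant can be assembled in Python as sketched below. The field names come from the curl example; the helper name is hypothetical, and the length check simply mirrors the documented 2–80 character rule.

```python
import base64
import json


def build_clone_payload(speaker_name: str, audio_bytes: bytes) -> str:
    """Return the JSON body for POST /api/v1/voice-clones (base64 variant)."""
    if not 2 <= len(speaker_name) <= 80:
        raise ValueError("speaker_name must be 2 to 80 characters long")
    return json.dumps({
        "speaker_name": speaker_name,
        # Base64 output is plain ASCII, so it can be embedded in JSON directly.
        "audio_base64": base64.b64encode(audio_bytes).decode("ascii"),
    })
```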

DELETE /api/v1/voice-clones/:voice_uuid Delete a voice clone

Permanently removes the voice clone. Audio previously generated with this voice is not affected.

API Tokens

GET /api/v1/tokens List API tokens

Returns all active tokens on your account (secrets are masked).

POST /api/v1/tokens Create a new API token

Creates a new token. The full secret is returned only once — store it securely.

DELETE /api/v1/tokens/:token_uuid Revoke a specific token

Immediately revokes the token. Subsequent requests using this token will receive a 401 response.

You can also manage tokens from your Account page.

Reference

Models

ID             Display name   Sample rate  Channels
outetts_1_pro  OuteTTS-1-Pro  24000 Hz     1 (mono)

Additional models may be added in the future. Always pass the model ID string in your requests.

Error Codes

All error responses share the same JSON envelope:

{
    "success": false,
    "code": "INSUFFICIENT_CREDITS",
    "message": "You do not have enough credits to start this request."
}

HTTP  Code                      Meaning
400   VALIDATION_ERROR          Missing or invalid request field
401   UNAUTHORIZED              Missing or invalid API token
402   INSUFFICIENT_CREDITS      Not enough credits to generate audio. Top up in your account.
404   NOT_FOUND                 Resource (voice, session) not found
429   TTS_RATE_LIMITED          Too many requests per minute. Wait and retry.
502   TTS_PROVIDER_UNAVAILABLE  Upstream GPU provider is unavailable. Retry after a short wait.
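
The two retryable statuses (429 and 502) can be handled with a small backoff wrapper. The sketch below is one possible approach, not SDK behaviour: it retries any exception that carries a matching status_code attribute, as OuteAIAPIError does, and re-raises everything else. The function name, attempt count, and delays are assumptions.

```python
import time

# Statuses the error table describes as worth retrying.
RETRYABLE_STATUSES = {429, 502}


def call_with_retries(func, attempts: int = 4, base_delay: float = 1.0, sleep=time.sleep):
    """Call func(); on a retryable status, wait with exponential backoff and try again."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception as error:
            status = getattr(error, "status_code", None)
            last_attempt = attempt == attempts - 1
            # Non-retryable errors (e.g. 400, 401, 402) are raised immediately.
            if status not in RETRYABLE_STATUSES or last_attempt:
                raise
            sleep(base_delay * 2 ** attempt)  # 1 s, 2 s, 4 s, ...
```

A call would look like call_with_retries(lambda: client.generate_speech(text="Hello", voice_id="...")). The sleep parameter exists so the delay can be replaced in tests.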
