API Documentation

Welcome to the OuteAI API. Our platform provides powerful generative models. This documentation will guide you through integrating our Text-to-Speech (TTS) capabilities into your applications.

To get started, you'll need an API token, which authenticates your requests. You can generate and manage your tokens from your Account page.

Authentication

All API requests must be authenticated using an API token. The token should be passed as a URL query parameter named `token`.

Requests made without a valid token will fail with a 401 Unauthorized error.

# Example of an authenticated request URL
https://outeai.com/api/v1/audio?token=YOUR_API_TOKEN
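
As a minimal sketch, the authenticated URL can be assembled with the Python standard library so that the token is URL-encoded and read from the environment rather than hard-coded. The helper name `build_audio_url` is hypothetical; `OUTEAI_API_TOKEN` is the variable name the SDK example reads.

```python
import os
from urllib.parse import urlencode

BASE_URL = "https://outeai.com/api/v1/audio"


def build_audio_url(token: str) -> str:
    """Return the endpoint URL with the token as the `token` query parameter."""
    if not token:
        raise ValueError("an API token is required")
    # urlencode escapes characters that are not safe inside a query string
    return f"{BASE_URL}?{urlencode({'token': token})}"


if __name__ == "__main__":
    # Falls back to the placeholder when the environment variable is unset
    token = os.environ.get("OUTEAI_API_TOKEN", "YOUR_API_TOKEN")
    print(build_audio_url(token))
```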

API Endpoints

Text-to-Speech

POST /api/v1/audio

This endpoint converts text into spoken audio. It supports various models, voices, and languages, and streams the response back.

Request Body

The request body must be a JSON object containing the parameters for audio generation.

Parameter	Required	Type	Description
`model`	Yes	string	The model to use for generation. Currently available: `"OuteTTS-1-Pro"`.
`prompt`	Yes	string	The text to be synthesized into speech. The maximum character length is determined by your subscription plan (e.g., 200 characters for the API plan).
`speaker_type`	Yes	string	Specifies the type of voice to use. Must be either `"default"` for our pre-made voices or `"custom"` for your own voice clones.
`speaker_id`	Yes	string	The unique identifier for the voice. For `custom` speakers, this is the UUID, which can be copied from the Voice Cloning Studio. For `default` speakers, this is the name of the pre-made voice.
`temperature`	No	float	Controls the randomness and expressiveness of the output. Higher values (e.g., `0.7`) are more expressive, while lower values (e.g., `0.2`) are more deterministic. Must be between `0.1` and `1.0`. Defaults to `0.4`.
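
The constraints in the table can be checked on the client before a request is sent. The following is a sketch only: `build_tts_payload` and its `max_chars` argument are hypothetical names, and the server remains the authority on what it accepts.

```python
from typing import Any, Dict, Optional

ALLOWED_SPEAKER_TYPES = {"default", "custom"}


def build_tts_payload(
    prompt: str,
    speaker_id: str,
    speaker_type: str = "default",
    model: str = "OuteTTS-1-Pro",
    temperature: float = 0.4,
    max_chars: Optional[int] = None,  # hypothetical: your plan's character limit
) -> Dict[str, Any]:
    """Validate the documented parameters and return the JSON request body."""
    if not prompt:
        raise ValueError("prompt must not be empty")
    if max_chars is not None and len(prompt) > max_chars:
        raise ValueError(f"prompt exceeds {max_chars} characters")
    if speaker_type not in ALLOWED_SPEAKER_TYPES:
        raise ValueError('speaker_type must be "default" or "custom"')
    if not speaker_id:
        raise ValueError("speaker_id must not be empty")
    if not 0.1 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.1 and 1.0")
    return {
        "model": model,
        "prompt": prompt,
        "speaker_type": speaker_type,
        "speaker_id": speaker_id,
        "temperature": temperature,
    }
```

For example, `build_tts_payload("Hello, world!", "EN-FEMALE-1-NEUTRAL", max_chars=200)` returns a dictionary that can be serialized with `json.dumps` and sent as the request body.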

Example Request

The three examples below are, in order: Python (synchronous SDK), Python (asynchronous SDK), and cURL.

from outeai import OuteAI, APIError

try:
    # It is recommended to set the token as an environment variable
    # Reads from OUTEAI_API_TOKEN automatically
    client = OuteAI()

    # Or initialize the client with the token directly
    # client = OuteAI(token="your_access_token_here")

    print("Generating audio...")
    output = client.generate.audio(
        model="OuteTTS-1-Pro",
        text="Hello, world! This is a test of the text to speech API.",
        speaker_id="EN-FEMALE-1-NEUTRAL",
        temperature=0.4
    )

    # The output object contains the audio data and can be saved directly
    output.save("generated_audio.mp3")

    print(f"Audio generation complete. Duration: {output.duration:.2f}s")

except APIError as e:
    print(f"An API error occurred: {e}")
except Exception as e:
    print(f"An unexpected error occurred: {e}")
finally:
    # It's good practice to close the client
    if 'client' in locals():
        client.close()

import asyncio
from outeai import AsyncOuteAI, APIError

async def main():
    # Using the client as an async context manager handles setup and teardown
    async with AsyncOuteAI() as client:
        try:
            print("Generating audio asynchronously...")
            output = await client.generate.audio(
                model="OuteTTS-1-Pro",
                text="This is an asynchronous test of the API.",
                speaker_id="EN-FEMALE-1-NEUTRAL",
                temperature=0.5
            )

            output.save("generated_audio_async.mp3")
            print(f"Async audio generation complete. Duration: {output.duration:.2f}s")

        except APIError as e:
            print(f"An API error occurred: {e}")
        except Exception as e:
            print(f"An unexpected error occurred: {e}")

if __name__ == "__main__":
    asyncio.run(main())

curl -X POST "https://outeai.com/api/v1/audio?token=YOUR_API_TOKEN" \
     -H "Content-Type: application/json" \
     -d '{
       "model": "OuteTTS-1-Pro",
       "prompt": "Hello, world!",
       "speaker_type": "default",
       "speaker_id": "EN-FEMALE-1-NEUTRAL",
       "temperature": 0.4
     }' \
     --output response.ndjson
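
For environments where neither the SDK nor cURL is available, the same request can be sketched with the Python standard library. The function names `make_tts_request` and `save_stream` are hypothetical; the URL, header, and body mirror the cURL command above.

```python
import json
import urllib.request
from urllib.parse import urlencode


def make_tts_request(token: str, payload: dict) -> urllib.request.Request:
    """Build a POST request equivalent to the cURL command."""
    url = "https://outeai.com/api/v1/audio?" + urlencode({"token": token})
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_stream(request: urllib.request.Request, path: str = "response.ndjson") -> None:
    """Send the request and write the streamed response to disk line by line."""
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        for line in response:
            out.write(line)
```

Calling `save_stream(make_tts_request(token, payload))` performs the network request; `urlopen` raises `urllib.error.HTTPError` for non-2xx responses, so callers may want to catch it.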

SDKs

Python SDK

We provide a convenient Python SDK, outeai, to simplify interactions with our API. We highly recommend using it for a seamless development experience.

Installation

pip install outeai

Handling Streams

The Text-to-Speech endpoint returns a streaming response with the content type application/x-ndjson (newline-delimited JSON). This allows you to receive real-time progress updates while the audio is being generated, with the complete audio file delivered in the final message of the stream.

Each line in the stream is a separate JSON object. While generation is in progress, you receive one or more objects whose `audio_data` is an empty string; their `audio_token_count` field can be used to track progress. Only the last object in the stream contains the complete audio, as a base64-encoded string in `audio_data`.

Streamed Object Structure

{
  "audio_data": "",
  "audio_token_count": 128
}
{
  "audio_data": "",
  "audio_token_count": 256
}
{
  "audio_data": "UklGRi...CwA=",
  "audio_token_count": 256
}
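
A stream like this can be consumed by decoding one JSON object per line and keeping the audio from the message that carries it. This is a sketch that assumes each object arrives on its own line, as the NDJSON content type implies; `read_tts_stream` is a hypothetical helper, not part of the SDK.

```python
import base64
import json
from typing import Iterable, Tuple


def read_tts_stream(lines: Iterable[bytes]) -> Tuple[bytes, int]:
    """Return (decoded audio bytes, last reported audio_token_count)."""
    audio = b""
    token_count = 0
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank keep-alive lines, if any
        message = json.loads(raw)
        token_count = message.get("audio_token_count", token_count)
        # Progress messages carry an empty string; only the final one has data
        if message.get("audio_data"):
            audio = base64.b64decode(message["audio_data"])
    if not audio:
        raise ValueError("stream ended without audio data")
    return audio, token_count
```

The function accepts any iterable of byte lines, so it works with an open HTTP response or with a saved file opened in binary mode, such as the `response.ndjson` written by the cURL example.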

Error Codes

The API uses standard HTTP status codes to indicate the success or failure of a request. A JSON body with an error message may also be returned.

Status Code	Code Name	Description
`400`	Bad Request	The request was malformed. This can be caused by missing a required parameter, providing a value in the wrong format (e.g., string for temperature), or exceeding a character limit for the prompt.
`401`	Unauthorized	The provided API `token` is missing, invalid, or expired. Please check your token on the Account page.
`402`	Insufficient Balance	Your account does not have enough credits to perform this request. Please top up your balance or upgrade your plan.
`429`	Too Many Requests	You have exceeded the rate limit for your current plan. Please wait before making new requests.
`500`	Internal Server Error	An unexpected error occurred on our servers. We are automatically notified of these issues. Please try again later.
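
One way a client might act on these codes is sketched below. Treating `429` and `500` as retryable is an interpretation of the "wait" and "try again later" guidance in the table, and the exponential backoff schedule is an assumption, not a documented policy; all names are hypothetical.

```python
from typing import List

# Names taken from the status code table
ERROR_NAMES = {
    400: "Bad Request",
    401: "Unauthorized",
    402: "Insufficient Balance",
    429: "Too Many Requests",
    500: "Internal Server Error",
}

# Assumption: only these codes are worth retrying without changing the request
RETRYABLE_STATUSES = {429, 500}


def describe_status(status: int) -> str:
    """Map a status code to its documented name."""
    return ERROR_NAMES.get(status, "Unknown status")


def should_retry(status: int) -> bool:
    """Return True when repeating the same request later may succeed."""
    return status in RETRYABLE_STATUSES


def backoff_delays(max_attempts: int, base_seconds: float = 1.0) -> List[float]:
    """Exponential backoff schedule: base, 2*base, 4*base, ..."""
    return [base_seconds * 2 ** attempt for attempt in range(max_attempts)]
```

Errors such as `400`, `401`, and `402` require a change to the request, token, or account balance, so retrying them unchanged is unlikely to help.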

Default Speaker IDs

When using the "default" speaker type, you can use any of the following pre-made voices. Simply copy the desired Speaker ID and use it in your API request.

Description	Speaker ID
🇬🇧 Olivia (Neutral)	`EN-FEMALE-1-NEUTRAL`
🇺🇸 James (Neutral)	`EN-MALE-1-NEUTRAL`
🇩🇪 Hans (Neutral)	`DE-MALE-1-NEUTRAL`
🇫🇷 Pierre (Neutral)	`FR-MALE-1-NEUTRAL`
🇭🇺 István (Neutral)	`HU-MALE-1-NEUTRAL`
🇮🇹 Marco (Neutral)	`IT-MALE-1-NEUTRAL`
🇯🇵 Yuki (Neutral)	`JP-FEMALE-1-NEUTRAL`
🇯🇵 Hiroshi (Neutral)	`JP-MALE-1-NEUTRAL`
🇰🇷 Hana (Neutral)	`KO-FEMALE-1-NEUTRAL`
🇱🇻 Jānis (Neutral)	`LV-MALE-1-NEUTRAL`
🇳🇱 Anna (Neutral)	`NL-FEMALE-1-NEUTRAL`
🇵🇱 Katarzyna (Neutral)	`PL-FEMALE-1-NEUTRAL`
🇷🇺 Dmitri (Neutral)	`RU-MALE-1-NEUTRAL`
🇨🇳 Jide (Neutral)	`ZH-FEMALE-1-NEUTRAL`