API Documentation

Welcome to the OuteAI API. This documentation explains how to integrate OuteAI Text-to-Speech (TTS) generation into your applications.

To get started, you'll need an API token, which authenticates your requests. You can generate and manage your tokens from your Account page.

Authentication

All API requests must be authenticated with an API token. Pass the token as a URL query parameter named "token".

Requests made without a valid token will fail with a 401 Unauthorized error.

# Example of an authenticated request URL
https://outeai.com/api/v1/audio?token=YOUR_API_TOKEN
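
If you build the URL in code rather than by hand, it is worth percent-encoding the token. The following is a minimal sketch using only the Python standard library. The helper name build_audio_url is hypothetical and is not part of the outeai SDK; only the endpoint URL and the "token" parameter name come from this page.

```python
from urllib.parse import urlencode

# Endpoint URL as shown in this documentation.
AUDIO_ENDPOINT = "https://outeai.com/api/v1/audio"


def build_audio_url(token: str) -> str:
    """Return the endpoint URL with the token as a query parameter.

    Hypothetical helper for illustration; not part of the outeai SDK.
    """
    if not token:
        raise ValueError("an API token is required")
    # urlencode percent-encodes characters that are unsafe in a query string.
    return f"{AUDIO_ENDPOINT}?{urlencode({'token': token})}"


print(build_audio_url("YOUR_API_TOKEN"))
```

Because the token travels in the URL, it can end up in logs and shell history, so reading it from an environment variable instead of hard-coding it is usually the safer choice.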

API Endpoints

Text-to-Speech

POST /api/v1/audio

This endpoint converts text into spoken audio. You choose the model, voice, and language through the request parameters, and the response is streamed back as it is generated.

Request Body

The request body must be a JSON object containing the parameters for audio generation.

Parameter | Type | Required | Description
model | string | Yes | The model to use for generation. Currently available: "OuteTTS-1-Pro".
prompt | string | Yes | The text to be synthesized into speech. The maximum character length is determined by your subscription plan (e.g., 200 characters for the API plan).
speaker_type | string | Yes | Specifies the type of voice to use. Must be either "default" for our pre-made voices or "custom" for your own voice clones.
speaker_id | string | Yes | The unique identifier for the voice. For custom speakers, this is the UUID, which can be copied from the Voice Cloning Studio. For default speakers, this is the name of the pre-made voice.
temperature | float | No | Controls the randomness and expressiveness of the output. Higher values (e.g., 0.7) are more expressive, while lower values (e.g., 0.2) are more deterministic. Must be between 0.1 and 1.0. Defaults to 0.4.
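
The constraints in the table above can be checked on the client before a request is sent, which avoids a round trip that would end in a 400 error. The sketch below is one way to do that. build_tts_body is a hypothetical helper, not part of the outeai SDK, and the non-empty prompt check is an assumption rather than a documented rule.

```python
import json


def build_tts_body(model, prompt, speaker_type, speaker_id, temperature=0.4):
    """Validate the documented constraints and return the request body as a dict.

    Hypothetical helper for illustration; not part of the outeai SDK.
    """
    if speaker_type not in ("default", "custom"):
        raise ValueError('speaker_type must be "default" or "custom"')
    if not 0.1 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.1 and 1.0")
    if not prompt:
        # Assumption: an empty prompt is rejected; the docs only state a maximum length.
        raise ValueError("prompt must not be empty")
    return {
        "model": model,
        "prompt": prompt,
        "speaker_type": speaker_type,
        "speaker_id": speaker_id,
        "temperature": temperature,
    }


body = build_tts_body(
    model="OuteTTS-1-Pro",
    prompt="Hello, world!",
    speaker_type="default",
    speaker_id="EN-FEMALE-1-NEUTRAL",
)
# Serialize to the JSON text that goes in the POST body.
print(json.dumps(body, indent=2))
```

The per-plan character limit is not checked here because it depends on your subscription.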

Example Request

Examples follow for the Python synchronous SDK, the Python asynchronous SDK, and cURL, in that order.

Python (Synchronous SDK)

from outeai import OuteAI, APIError

try:
    # It is recommended to set the token as an environment variable
    # Reads from OUTEAI_API_TOKEN automatically
    client = OuteAI()

    # Or initialize the client with the token directly
    # client = OuteAI(token="your_access_token_here")

    print("Generating audio...")
    output = client.generate.audio(
        model="OuteTTS-1-Pro",
        text="Hello, world! This is a test of the text to speech API.",
        speaker_id="EN-FEMALE-1-NEUTRAL",
        temperature=0.4
    )

    # The output object contains the audio data and can be saved directly
    output.save("generated_audio.mp3")

    print(f"Audio generation complete. Duration: {output.duration:.2f}s")

except APIError as e:
    print(f"An API error occurred: {e}")
except Exception as e:
    print(f"An unexpected error occurred: {e}")
finally:
    # It's good practice to close the client
    if 'client' in locals():
        client.close()

Python (Asynchronous SDK)

import asyncio
from outeai import AsyncOuteAI, APIError

async def main():
    # Using the client as an async context manager handles setup and teardown
    async with AsyncOuteAI() as client:
        try:
            print("Generating audio asynchronously...")
            output = await client.generate.audio(
                model="OuteTTS-1-Pro",
                text="This is an asynchronous test of the API.",
                speaker_id="EN-FEMALE-1-NEUTRAL",
                temperature=0.5
            )

            output.save("generated_audio_async.mp3")
            print(f"Async audio generation complete. Duration: {output.duration:.2f}s")

        except APIError as e:
            print(f"An API error occurred: {e}")
        except Exception as e:
            print(f"An unexpected error occurred: {e}")

if __name__ == "__main__":
    asyncio.run(main())

cURL

curl -X POST "https://outeai.com/api/v1/audio?token=YOUR_API_TOKEN" \
     -H "Content-Type: application/json" \
     -d '{
       "model": "OuteTTS-1-Pro",
       "prompt": "Hello, world!",
       "speaker_type": "default",
       "speaker_id": "EN-FEMALE-1-NEUTRAL",
       "temperature": 0.4
     }' \
     --output response.ndjson

SDKs

Python SDK

We provide a convenient Python SDK, outeai, to simplify interactions with our API. We highly recommend using it for a seamless development experience.

Installation

pip install outeai

Handling Streams

The Text-to-Speech endpoint returns a streaming response with the content type application/x-ndjson (newline-delimited JSON). This allows you to receive real-time progress updates while the audio is being generated, with the complete audio file delivered in the final message of the stream.

Each line in the stream is a separate JSON object. While generation is in progress, you will receive one or more objects where audio_data is an empty string. These objects can be used to track progress via the audio_token_count field. The very last object in the stream is the only one that will contain the complete audio as a base64-encoded string in the audio_data field.

Streamed Object Structure

{"audio_data": "", "audio_token_count": 128}
{"audio_data": "", "audio_token_count": 256}
{"audio_data": "UklGRi...CwA=", "audio_token_count": 256}
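
If you call the endpoint without the SDK, you need to consume the stream described above yourself. The following is a minimal sketch of that: it reads one JSON object per line, reports progress from audio_token_count, and base64-decodes the audio_data of the final message. read_audio_stream and its on_progress callback are hypothetical names, and skipping blank lines is an assumption made for robustness.

```python
import base64
import json


def read_audio_stream(lines, on_progress=None):
    """Consume NDJSON lines and return the decoded audio bytes.

    `lines` is any iterable of str or bytes holding one JSON object per line,
    for example an open file or an HTTP response iterated line by line.
    Hypothetical helper for illustration; not part of the outeai SDK.
    """
    audio = None
    for raw in lines:
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        raw = raw.strip()
        if not raw:
            continue  # Assumption: blank lines carry no data and can be skipped.
        message = json.loads(raw)
        if on_progress is not None:
            on_progress(message["audio_token_count"])
        if message["audio_data"]:
            # Only the final message carries the base64-encoded audio.
            audio = base64.b64decode(message["audio_data"])
    if audio is None:
        raise ValueError("stream ended without audio_data")
    return audio


# Tiny stand-in stream; a real audio_data payload is far longer.
sample = [
    '{"audio_data": "", "audio_token_count": 128}',
    '{"audio_data": "", "audio_token_count": 256}',
    '{"audio_data": "UklGRg==", "audio_token_count": 256}',
]
counts = []
audio = read_audio_stream(sample, counts.append)
print(counts, len(audio))
```

The same function works on the response.ndjson file written by the cURL example: open it in binary mode and pass the file object as `lines`.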

Error Codes

The API uses standard HTTP status codes to indicate the success or failure of a request. A JSON body with an error message may also be returned.

Status Code | Code Name | Description
400 | Bad Request | The request was malformed. This can be caused by a missing required parameter, a value in the wrong format (e.g., a string for temperature), or a prompt that exceeds the character limit.
401 | Unauthorized | The provided API token is missing, invalid, or expired. Please check your token on the Account page.
402 | Insufficient Balance | Your account does not have enough credits to perform this request. Please top up your balance or upgrade your plan.
429 | Too Many Requests | You have exceeded the rate limit for your current plan. Please wait before making new requests.
500 | Internal Server Error | An unexpected error occurred on our servers. We are automatically notified of these issues. Please try again later.
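
A client can use these codes to decide whether a failed request is worth repeating. The sketch below treats 429 and 500 as retryable, since their descriptions suggest waiting and trying again, and treats the others as errors the caller must fix first. That split, the exponential backoff schedule, and the helper names are assumptions for illustration; the API does not document a retry policy.

```python
# Status codes and names as listed in the table above.
STATUS_NAMES = {
    400: "Bad Request",
    401: "Unauthorized",
    402: "Insufficient Balance",
    429: "Too Many Requests",
    500: "Internal Server Error",
}

# Assumption: only rate-limit and server-side errors may succeed on a retry.
RETRYABLE_STATUSES = {429, 500}


def should_retry(status_code: int) -> bool:
    """Return True if repeating the same request later may succeed."""
    return status_code in RETRYABLE_STATUSES


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based), doubling each time up to `cap`."""
    return min(cap, base * (2 ** attempt))


for code, name in STATUS_NAMES.items():
    action = "retry with backoff" if should_retry(code) else "fix the request or account"
    print(f"{code} {name}: {action}")
```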

Default Speaker IDs

When using the "default" speaker type, you can use any of the following pre-made voices. Simply copy the desired Speaker ID and use it in your API request.

Description | Speaker ID
🇬🇧 Olivia (Neutral) | EN-FEMALE-1-NEUTRAL
🇺🇸 James (Neutral) | EN-MALE-1-NEUTRAL
🇩🇪 Hans (Neutral) | DE-MALE-1-NEUTRAL
🇫🇷 Pierre (Neutral) | FR-MALE-1-NEUTRAL
🇭🇺 István (Neutral) | HU-MALE-1-NEUTRAL
🇮🇹 Marco (Neutral) | IT-MALE-1-NEUTRAL
🇯🇵 Yuki (Neutral) | JP-FEMALE-1-NEUTRAL
🇯🇵 Hiroshi (Neutral) | JP-MALE-1-NEUTRAL
🇰🇷 Hana (Neutral) | KO-FEMALE-1-NEUTRAL
🇱🇻 Jānis (Neutral) | LV-MALE-1-NEUTRAL
🇳🇱 Anna (Neutral) | NL-FEMALE-1-NEUTRAL
🇵🇱 Katarzyna (Neutral) | PL-FEMALE-1-NEUTRAL
🇷🇺 Dmitri (Neutral) | RU-MALE-1-NEUTRAL
🇨🇳 Jide (Neutral) | ZH-FEMALE-1-NEUTRAL