{"openapi":"3.1.0","info":{"title":"x402-voice","version":"0.1.0","summary":"Give Your Agent A Voice with pay-per-call x402 speech.","description":"Give your AI agent a voice. x402-voice turns agent text into WAV audio with per-call USDC payments on Base.\n\nStarts at $0.001 for short base-voice speech and $0.005 for Custom persona voices, with no API keys, subscriptions, dashboards, or monthly commitments. Built for autonomous agents that need to speak inside Discord, Telegram, browsers, games, monitoring workflows, support flows, and content pipelines.\n\nUse the OpenAI-shaped `/v1/audio/speech` endpoint when you want a familiar request format for the 10 base voices, or use `/v1/tts/custom` for Custom-only persona voices like Storyteller, Velvet, Narrator, Announcer, Assistant, and Urgent. Pro adds speed and quality controls for base voices. Custom unlocks persona voices plus speed and quality controls.\n\nExpression controls: `speed` is available on Pro and Custom routes from 0.7-2.0 with presets slow=0.7, normal=1.0, fast=1.3, rapid=1.6. `steps` controls quality from 1-100 with presets draft=4, standard=8, high=16, ultra=24. Persona voices are prebuilt expressive blends and require Custom tier.\n\nA micropayment-priced alternative to API-key TTS services like OpenAI TTS, ElevenLabs, Cartesia, and PlayHT.","contact":{"email":"clawdbotworker@gmail.com"},"x-guidance":"Agents should call a paid endpoint without payment first to receive the x402 challenge, then retry with a valid payment header. Use /v1/audio/speech for OpenAI-shaped requests with the 10 base voices, /v1/tts/base for lowest-cost base voices, /v1/tts/pro for speed/quality controls on base voices, /v1/tts/custom for persona voices, and /v1/tts/batch for up to 20 base-voice texts. Use short endpoints for 1-500 characters and -long endpoints for 501-2000 characters. Base, Pro, Batch, and OpenAI-shaped routes do not accept persona voices; persona voices require Custom tier.","x-voice-tier-rules":{"base_voices":["M1","M2","M3","M4","M5","F1","F2","F3","F4","F5"],"standard_voices":["M1","M2","M3","M4","M5","F1","F2","F3","F4","F5"],"persona_voices":["Storyteller","Narrator","Announcer","Assistant","Urgent","Sage","Spark","Anchor","Velvet","Echo"],"persona_endpoint":"/v1/tts/custom","openai_endpoint_standard_voices_only":true},"x-expressive-controls":{"speed":{"tier":"pro_or_custom","parameter":"speed","range":[0.7,2],"default":1.05,"presets":{"slow":0.7,"normal":1,"fast":1.3,"rapid":1.6}},"quality":{"tier":"pro_or_custom","parameter":"steps","range":[1,100],"default":8,"presets":{"draft":4,"standard":8,"high":16,"ultra":24}},"persona_voices":{"tier":"custom","endpoint":"/v1/tts/custom","note":"Persona voices are prebuilt expressive blends. Use the persona voice name in the voice field.","recipes":{"Storyteller":"warm measured narration, M1+F3 blend, slower pace","Narrator":"neutral documentary narration, M3+F3 blend","Announcer":"crisp authoritative announcement voice, M2+M5 blend","Assistant":"friendly clear assistant voice, F1+F4 blend","Urgent":"sharp alert voice, M4+F2 blend, faster pace","Sage":"deep calm wellness voice","Spark":"energetic youthful voice","Anchor":"authoritative report voice","Velvet":"warm rich premium voice","Echo":"androgynous neutral accessibility voice"}},"route_rules":{"base":"10 base voices only (M1-M5, F1-F5); no speed, steps, or personas","pro":"10 base voices only (M1-M5, F1-F5) with speed and steps; no personas","custom":"all 20 voices: base voices plus Custom-only persona voices with speed and steps","openai":"OpenAI-shaped request for the 10 base voices only; no personas","batch":"10 base voices only (M1-M5, F1-F5); no speed, steps, or personas"}}},"servers":[{"url":"https://voice.forgemesh.io"}],"paths":{"/v1/tts/base":{"post":{"operationId":"generateStandardVoice","summary":"Generate Standard Voice ≤500 chars","x-payment-info":{"price":{"mode":"fixed","currency":"USD","amount":"0.001"},"protocols":[{"x402":{}}]},"requestBody":{"required":true,"content":{"application/json":{"schema":{"type":"object","required":["text"],"properties":{"text":{"type":"string","description":"Text to synthesize"},"voice":{"type":"string","description":"Standard voice M1-M5/F1-F5, or persona voice on Custom tier"},"lang":{"type":"string","description":"Language code; 31 languages supported"},"speed":{"type":"number","description":"Pro/Custom expressive control, 0.7-2.0. Presets: slow 0.7, normal 1.0, fast 1.3, rapid 1.6"},"steps":{"type":"integer","description":"Pro/Custom quality control, 1-100. Presets: draft 4, standard 8, high 16, ultra 24"}}}}}},"responses":{"200":{"description":"Successful synthesis response","content":{"audio/wav":{}}},"400":{"description":"Invalid request or wrong pricing bucket"},"402":{"description":"Payment Required"}}}},"/v1/tts/base-long":{"post":{"operationId":"generateStandardVoiceLong","summary":"Generate Standard Voice 501-2000 chars","x-payment-info":{"price":{"mode":"fixed","currency":"USD","amount":"0.003"},"protocols":[{"x402":{}}]},"requestBody":{"required":true,"content":{"application/json":{"schema":{"type":"object","required":["text"],"properties":{"text":{"type":"string","description":"Text to synthesize"},"voice":{"type":"string","description":"Standard voice M1-M5/F1-F5, or persona voice on Custom tier"},"lang":{"type":"string","description":"Language code; 31 languages supported"},"speed":{"type":"number","description":"Pro/Custom expressive control, 0.7-2.0. Presets: slow 0.7, normal 1.0, fast 1.3, rapid 1.6"},"steps":{"type":"integer","description":"Pro/Custom quality control, 1-100. Presets: draft 4, standard 8, high 16, ultra 24"}}}}}},"responses":{"200":{"description":"Successful synthesis response","content":{"audio/wav":{}}},"400":{"description":"Invalid request or wrong pricing bucket"},"402":{"description":"Payment Required"}}}},"/v1/tts/pro":{"post":{"operationId":"generateControlledVoice","summary":"Generate Controlled Voice ≤500 chars","x-payment-info":{"price":{"mode":"fixed","currency":"USD","amount":"0.003"},"protocols":[{"x402":{}}]},"requestBody":{"required":true,"content":{"application/json":{"schema":{"type":"object","required":["text"],"properties":{"text":{"type":"string","description":"Text to synthesize"},"voice":{"type":"string","description":"Standard voice M1-M5/F1-F5, or persona voice on Custom tier"},"lang":{"type":"string","description":"Language code; 31 languages supported"},"speed":{"type":"number","description":"Pro/Custom expressive control, 0.7-2.0. Presets: slow 0.7, normal 1.0, fast 1.3, rapid 1.6"},"steps":{"type":"integer","description":"Pro/Custom quality control, 1-100. Presets: draft 4, standard 8, high 16, ultra 24"}}}}}},"responses":{"200":{"description":"Successful synthesis response","content":{"audio/wav":{}}},"400":{"description":"Invalid request or wrong pricing bucket"},"402":{"description":"Payment Required"}}}},"/v1/tts/pro-long":{"post":{"operationId":"generateControlledVoiceLong","summary":"Generate Controlled Voice 501-2000 chars","x-payment-info":{"price":{"mode":"fixed","currency":"USD","amount":"0.006"},"protocols":[{"x402":{}}]},"requestBody":{"required":true,"content":{"application/json":{"schema":{"type":"object","required":["text"],"properties":{"text":{"type":"string","description":"Text to synthesize"},"voice":{"type":"string","description":"Standard voice M1-M5/F1-F5, or persona voice on Custom tier"},"lang":{"type":"string","description":"Language code; 31 languages supported"},"speed":{"type":"number","description":"Pro/Custom expressive control, 0.7-2.0. Presets: slow 0.7, normal 1.0, fast 1.3, rapid 1.6"},"steps":{"type":"integer","description":"Pro/Custom quality control, 1-100. Presets: draft 4, standard 8, high 16, ultra 24"}}}}}},"responses":{"200":{"description":"Successful synthesis response","content":{"audio/wav":{}}},"400":{"description":"Invalid request or wrong pricing bucket"},"402":{"description":"Payment Required"}}}},"/v1/tts/custom":{"post":{"operationId":"generatePersonaVoice","summary":"Generate Persona Voice ≤500 chars","x-payment-info":{"price":{"mode":"fixed","currency":"USD","amount":"0.005"},"protocols":[{"x402":{}}]},"requestBody":{"required":true,"content":{"application/json":{"schema":{"type":"object","required":["text"],"properties":{"text":{"type":"string","description":"Text to synthesize"},"voice":{"type":"string","description":"Standard voice M1-M5/F1-F5, or persona voice on Custom tier"},"lang":{"type":"string","description":"Language code; 31 languages supported"},"speed":{"type":"number","description":"Pro/Custom expressive control, 0.7-2.0. Presets: slow 0.7, normal 1.0, fast 1.3, rapid 1.6"},"steps":{"type":"integer","description":"Pro/Custom quality control, 1-100. Presets: draft 4, standard 8, high 16, ultra 24"}}}}}},"responses":{"200":{"description":"Successful synthesis response","content":{"audio/wav":{}}},"400":{"description":"Invalid request or wrong pricing bucket"},"402":{"description":"Payment Required"}}}},"/v1/tts/custom-long":{"post":{"operationId":"generatePersonaVoiceLong","summary":"Generate Persona Voice 501-2000 chars","x-payment-info":{"price":{"mode":"fixed","currency":"USD","amount":"0.01"},"protocols":[{"x402":{}}]},"requestBody":{"required":true,"content":{"application/json":{"schema":{"type":"object","required":["text"],"properties":{"text":{"type":"string","description":"Text to synthesize"},"voice":{"type":"string","description":"Standard voice M1-M5/F1-F5, or persona voice on Custom tier"},"lang":{"type":"string","description":"Language code; 31 languages supported"},"speed":{"type":"number","description":"Pro/Custom expressive control, 0.7-2.0. Presets: slow 0.7, normal 1.0, fast 1.3, rapid 1.6"},"steps":{"type":"integer","description":"Pro/Custom quality control, 1-100. Presets: draft 4, standard 8, high 16, ultra 24"}}}}}},"responses":{"200":{"description":"Successful synthesis response","content":{"audio/wav":{}}},"400":{"description":"Invalid request or wrong pricing bucket"},"402":{"description":"Payment Required"}}}},"/v1/audio/speech":{"post":{"operationId":"generateOpenAiCompatibleVoice","summary":"Generate OpenAI-Compatible Voice ≤500 chars","x-payment-info":{"price":{"mode":"fixed","currency":"USD","amount":"0.001"},"protocols":[{"x402":{}}]},"requestBody":{"required":true,"content":{"application/json":{"schema":{"type":"object","required":["input"],"properties":{"input":{"type":"string"},"voice":{"type":"string"},"model":{"type":"string"},"response_format":{"type":"string"}}}}}},"responses":{"200":{"description":"Successful synthesis response","content":{"audio/wav":{}}},"400":{"description":"Invalid request or wrong pricing bucket"},"402":{"description":"Payment Required"}}}},"/v1/audio/speech-long":{"post":{"operationId":"generateOpenAiCompatibleVoiceLong","summary":"Generate OpenAI-Compatible Voice 501-2000 chars","x-payment-info":{"price":{"mode":"fixed","currency":"USD","amount":"0.003"},"protocols":[{"x402":{}}]},"requestBody":{"required":true,"content":{"application/json":{"schema":{"type":"object","required":["input"],"properties":{"input":{"type":"string"},"voice":{"type":"string"},"model":{"type":"string"},"response_format":{"type":"string"}}}}}},"responses":{"200":{"description":"Successful synthesis response","content":{"audio/wav":{}}},"400":{"description":"Invalid request or wrong pricing bucket"},"402":{"description":"Payment Required"}}}},"/v1/tts/batch":{"post":{"operationId":"generateBatchVoices","summary":"Generate Batch Voices ≤500 total chars","x-payment-info":{"price":{"mode":"fixed","currency":"USD","amount":"0.002"},"protocols":[{"x402":{}}]},"requestBody":{"required":true,"content":{"application/json":{"schema":{"type":"object","required":["items"],"properties":{"items":{"type":"array"},"defaults":{"type":"object"}}}}}},"responses":{"200":{"description":"Successful synthesis response","content":{"application/json":{}}},"400":{"description":"Invalid request or wrong pricing bucket"},"402":{"description":"Payment Required"}}}},"/v1/tts/batch-long":{"post":{"operationId":"generateBatchVoicesLong","summary":"Generate Batch Voices 501-2000 total chars","x-payment-info":{"price":{"mode":"fixed","currency":"USD","amount":"0.005"},"protocols":[{"x402":{}}]},"requestBody":{"required":true,"content":{"application/json":{"schema":{"type":"object","required":["items"],"properties":{"items":{"type":"array"},"defaults":{"type":"object"}}}}}},"responses":{"200":{"description":"Successful synthesis response","content":{"application/json":{}}},"400":{"description":"Invalid request or wrong pricing bucket"},"402":{"description":"Payment Required"}}}},"/health":{"get":{"operationId":"health","summary":"Health check","security":[],"responses":{"200":{"description":"OK"}}}},"/metrics":{"get":{"operationId":"metrics","summary":"Queue/cache/saturation metrics","security":[],"responses":{"200":{"description":"OK"}}}},"/v1/voices":{"get":{"operationId":"listVoices","summary":"Voice catalog","security":[],"responses":{"200":{"description":"OK"}}}},"/.well-known/x402.json":{"get":{"operationId":"x402Discovery","summary":"x402 discovery","security":[],"responses":{"200":{"description":"OK"}}}},"/openapi.json":{"get":{"operationId":"openapi","summary":"This spec","security":[],"responses":{"200":{"description":"OK"}}}},"/llms.txt":{"get":{"operationId":"llmsTxt","summary":"LLM discoverability","security":[],"responses":{"200":{"description":"OK"}}}}}}