Speech to Text
Transcribe audio with a single tool, choosing the action that matches the length of the uploaded audio.
Tool Call Format
{
  "action": "get_instructions"
}

{
  "action": "transcribe_quick",
  "file_id": "FILE_ID",
  "language_code": "en-US",
  "output_format": "text"
}

{
  "action": "transcribe_standard",
  "public_url": "https://example.com/meeting.m4a",
  "output_format": "vtt",
  "enable_word_timestamps": true,
  "enable_diarization": true
}

{
  "action": "transcribe_extended",
  "public_url": "https://example.com/interview.webm",
  "output_format": "json",
  "max_alternatives": 2
}
Actions
- transcribe_quick: audio up to 15 minutes. Price: 100 credits.
- transcribe_standard: audio up to 30 minutes. Price: 150 credits.
- transcribe_extended: audio up to 60 minutes. Price: 200 credits.
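As a rough illustration of picking an action from the audio length, the sketch below chooses the cheapest tier that fits. The limits and prices come from the list above; the helper name choose_action, the ACTION_TIERS table, and the assumption that each limit is inclusive ("up to 15 minutes" includes exactly 15) are hypothetical and not part of the tool itself.

```python
# Limits and prices copied from the Actions list: (action, max minutes, credits).
# Treating each limit as inclusive is an assumption, not documented behavior.
ACTION_TIERS = [
    ("transcribe_quick", 15, 100),
    ("transcribe_standard", 30, 150),
    ("transcribe_extended", 60, 200),
]


def choose_action(duration_minutes: float) -> tuple[str, int]:
    """Return (action, credits) for the cheapest tier that fits the audio."""
    if duration_minutes <= 0:
        raise ValueError("duration must be positive")
    for action, max_minutes, credits in ACTION_TIERS:
        if duration_minutes <= max_minutes:
            return action, credits
    raise ValueError("audio longer than 60 minutes is not covered by any action")
```

For example, choose_action(22) would select transcribe_standard at 150 credits, while anything over 60 minutes raises an error instead of guessing at a fallback.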
Notes
- Provide either file_id or public_url.
- public_url must be an HTTPS URL and cannot point to private or internal network addresses.
- If language_code is omitted, the tool defaults to en-US.
- Supported output formats: text, srt, vtt, json.
- Optional controls: enable_diarization, enable_word_timestamps, enable_profanity_filter, max_alternatives.
- Subtitle responses include inline subtitle content and may also include stored file links during normal platform invocations.
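The notes above can be checked on the client before a call is made. The following is a minimal sketch, not the tool's own validation: build_request and _check_public_url are hypothetical helpers, the "exactly one of file_id or public_url" reading is an assumption, and the address check only catches literal IPs and localhost because it does not resolve DNS names.

```python
import ipaddress
from urllib.parse import urlparse

SUPPORTED_FORMATS = {"text", "srt", "vtt", "json"}
OPTIONAL_CONTROLS = {
    "enable_diarization",
    "enable_word_timestamps",
    "enable_profanity_filter",
    "max_alternatives",
}


def _check_public_url(url: str) -> None:
    """Reject non-HTTPS URLs and obviously private or internal hosts."""
    parsed = urlparse(url)
    host = parsed.hostname
    if parsed.scheme != "https" or not host:
        raise ValueError("public_url must be an HTTPS URL")
    if host == "localhost":
        raise ValueError("public_url cannot point to an internal address")
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return  # a DNS name; this sketch does not resolve it
    if ip.is_private or ip.is_loopback or ip.is_link_local:
        raise ValueError("public_url cannot point to a private address")


def build_request(action, *, output_format, file_id=None, public_url=None,
                  language_code=None, **controls):
    """Assemble a tool-call payload after applying the checks from the notes."""
    # Assumption: "either" means exactly one of the two sources.
    if (file_id is None) == (public_url is None):
        raise ValueError("provide exactly one of file_id or public_url")
    if output_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format}")
    unknown = set(controls) - OPTIONAL_CONTROLS
    if unknown:
        raise ValueError(f"unknown controls: {sorted(unknown)}")

    request = {"action": action, "output_format": output_format}
    if file_id is not None:
        request["file_id"] = file_id
    else:
        _check_public_url(public_url)
        request["public_url"] = public_url
    if language_code is not None:
        # When omitted, the tool itself defaults to en-US.
        request["language_code"] = language_code
    request.update(controls)
    return request
```

A call such as build_request("transcribe_standard", public_url="https://example.com/meeting.m4a", output_format="vtt", enable_diarization=True) yields a dictionary that can be serialized with json.dumps into the format shown under Tool Call Format.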







