STT (Speech-to-Text)

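NUWA provides a unified speech-to-text (STT) API that is compatible with the OpenAI specification and supports both real-time streaming transcription and file transcription. It suits scenarios such as call recording, meeting minutes, and customer-service quality inspection.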

Interface Specification

from openai import OpenAI

client = OpenAI(
    api_key="sk-***",  # key generated in the NUWA console
    base_url="https://api.nuwaapi.com/v1"
)

# Open the audio in binary mode and send it for transcription.
with open("/path/to/audio.wav", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

print(transcript.text)
  • Response format: JSON, containing the transcribed text and optional segment information

Request Parameters (File Transcription)

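Parameter | Type | Required | Description
file | file | | Audio file. Common formats: mp3, mp4, mpeg, mpga, m4a, wav, webm
model | string | | Model ID; defaults to whisper-1
language | string | | Hint for the audio language, e.g. zh or en; can improve accuracy
temperature | number | | Sampling temperature, which controls randomness; defaults to 0
prompt | string | | Prior text or vocabulary hints that help resolve ambiguity
response_format | string | | Output format; defaults to json, and also supports text, srt, verbose_json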
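The parameters in the table can be assembled and checked locally before a request is sent. The following is a minimal sketch, not part of NUWA or the OpenAI SDK: build_transcription_kwargs and the two constants are hypothetical names, and the accepted values are simply the ones listed in the table above.

```python
from pathlib import Path

# Values copied from the parameter table; adjust if the service accepts more.
SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}
RESPONSE_FORMATS = {"json", "text", "srt", "verbose_json"}


def build_transcription_kwargs(path, model="whisper-1", language=None,
                               prompt=None, temperature=0,
                               response_format="json"):
    """Validate options locally and return keyword arguments for
    client.audio.transcriptions.create (the file is passed separately)."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {suffix or '(none)'}")
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format}")

    kwargs = {
        "model": model,
        "temperature": temperature,
        "response_format": response_format,
    }
    # Optional hints are only included when the caller provides them.
    if language:
        kwargs["language"] = language
    if prompt:
        kwargs["prompt"] = prompt
    return kwargs
```

With a client configured as in the examples on this page, the result could be unpacked into the call, for instance client.audio.transcriptions.create(file=audio_file, **build_transcription_kwargs("/path/to/audio.wav", language="zh")).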

Usage Examples

Basic Transcription

from openai import OpenAI

client = OpenAI(
    api_key="sk-***",
    base_url="https://api.nuwaapi.com/v1"
)

with open("/path/to/audio.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

print(transcript.text)

Specifying a Language and Prompt

from openai import OpenAI

client = OpenAI(
    api_key="sk-***",
    base_url="https://api.nuwaapi.com/v1"
)

with open("/path/to/meeting.wav", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
        language="zh",  # the audio is in Chinese
        # Prompt in the audio's language:
        # "This meeting discussed API integration and billing issues"
        prompt="本次会议讨论了 API 集成和结算问题",
    )

print(transcript.text)
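When the vocabulary hints come from a glossary rather than a hand-written sentence, the prompt string can be assembled programmatically. This is only an illustrative sketch: build_prompt is a hypothetical helper, and the max_chars cap is an arbitrary assumption rather than a documented NUWA limit.

```python
def build_prompt(terms, context="", max_chars=200):
    """Join a context sentence and domain terms into one prompt string.

    max_chars is an illustrative cap, not a documented service limit.
    """
    parts = [context.strip()] if context.strip() else []
    seen = set()
    for term in terms:
        term = term.strip()
        if term and term not in seen:  # skip blanks and duplicates
            seen.add(term)
            parts.append(term)

    prompt = ""
    for part in parts:
        candidate = part if not prompt else f"{prompt}, {part}"
        if len(candidate) > max_chars:
            break  # stop before exceeding the cap
        prompt = candidate
    return prompt
```

The returned string would be passed as the prompt argument, ideally written in the same language as the audio.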

Getting Subtitle Output (SRT)

from openai import OpenAI

client = OpenAI(
    api_key="sk-***",
    base_url="https://api.nuwaapi.com/v1"
)

with open("/path/to/video_audio.m4a", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
        response_format="srt",
    )

print(transcript)  # returns the SRT subtitle string
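The SRT string can be written straight to a .srt file, or parsed when the timings are needed in code. The sketch below assumes the response follows the standard SRT layout (index line, timing line, text, blank line); parse_srt is a hypothetical helper, not an SDK function.

```python
import re

# Matches a cue timing line such as "00:00:01,000 --> 00:00:03,500".
_TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def _to_seconds(hours, minutes, seconds, millis):
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_srt(srt_text):
    """Return a list of (start_seconds, end_seconds, text) tuples."""
    cues = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = _TIMING.search(line)
            if match:
                groups = match.groups()
                start = _to_seconds(*groups[:4])
                end = _to_seconds(*groups[4:])
                # Everything after the timing line is the cue text.
                text = "\n".join(lines[i + 1:]).strip()
                cues.append((start, end, text))
                break
    return cues
```

Calling parse_srt(transcript) on the string from the example above would give one tuple per subtitle cue, which is convenient for searching or re-timing.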

Response Example (Simplified)

{
    "text": "欢迎使用 NUWA 语音转文本接口。"
}
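When response_format is verbose_json, the response may also carry segment information. The sketch below formats such a response as timestamped lines. It assumes each segment has start, end, and text fields, as in OpenAI's verbose_json output; whether NUWA returns exactly this shape is not confirmed here, and format_segments is a hypothetical helper.

```python
def format_timestamp(seconds):
    """Render a duration in seconds as MM:SS.ss."""
    minutes, secs = divmod(float(seconds), 60)
    return f"{int(minutes):02d}:{secs:05.2f}"


def format_segments(response):
    """Return one line per segment from a verbose_json-style dict.

    Assumes each segment carries "start", "end", and "text" keys;
    falls back to the plain "text" field when no segments are present.
    """
    segments = response.get("segments") or []
    if not segments:
        return [response.get("text", "")]
    return [
        f"[{format_timestamp(s['start'])} - {format_timestamp(s['end'])}] "
        f"{s['text'].strip()}"
        for s in segments
    ]
```

The helper works on plain dictionaries. If the SDK returns a pydantic model object, it would first need converting, for example with transcript.model_dump().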