Creating Audio Files with OpenAI

November 11, 2025

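I had been happily using GPT without even knowing that OpenAI is the company behind ChatGPT.
For certain reasons I needed to turn some English conversations into audio files, and by asking GPT how to do it along the way, I somehow managed to create them. Here are my notes on the method.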

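I should say up front that this is a brute-force approach: you write the code in Python and generate the output yourself.
First, if you already use GPT, sign in to the OpenAI page with that same ID and password.
There you obtain an API key, which is the key needed to use OpenAI from your own code. On the same page, you pay the OpenAI usage fee.
Billing is pay-as-you-go, so once you prepay some amount (I paid $5), you can keep using the service until that credit runs out.
This is a good option if you worry about a subscription charging you without your noticing.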
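One practical note about that key: anyone who has it can spend your prepaid credit, so it may be safer to keep it out of the script itself. Below is a minimal sketch, assuming the key has been stored in an environment variable named OPENAI_API_KEY (the name the official openai package also looks for by default). The helper name load_api_key is my own and is not part of any library.

```python
import os


def load_api_key(env=os.environ):
    """Return the API key stored in OPENAI_API_KEY, or raise a clear error."""
    # Hypothetical helper: reads the key from a mapping (the process
    # environment by default) instead of hardcoding it in the script.
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("Set the OPENAI_API_KEY environment variable first.")
    return key


# Demonstration with a stand-in mapping rather than the real environment.
print(load_api_key({"OPENAI_API_KEY": "sk-example"}))
```

With this in place, the client could be created as OpenAI(api_key=load_api_key()) and the script can be shared without exposing the key.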

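After that, just write Python code like the following. Running it produces the mp3 files. (The script needs the openai and pydub packages, and pydub relies on ffmpeg to read and write mp3.) If you check your usage on the OpenAI page afterward, you can see that the balance has gone down by the amount used.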

from openai import OpenAI
from pydub import AudioSegment

# Placeholder key: replace it with your own API key
client = OpenAI(api_key="sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx")

VOICE_MALE = "alloy"    # shop clerk, waiter, male friend, etc.
VOICE_FEMALE = "verse"  # Miki, Emily, etc.

# Each key is a question ID; each value is a list of (speaker, text) lines.
dialogues = {
# --- Q1: Miki and Shop Clerk ---
"q1": [
    ("male",   "Hello. Can I help you?"),
    ("female", "Yes. I want to buy a present for my friend's birthday."),
    ("male",   "How about this red cap?"),
    ("female", "It looks nice, but my friend doesn't like red. Oh, this blue one is perfect! I'll take it."),
    ("male",   "Great choice! Thank you."),
],

# --- Q2: Ken and Waiter ---
"q2": [
    ("male",   "Welcome. What would you like for lunch, Ken?"),
    ("male",   "We have curry rice, sandwiches, coffee, and orange juice today."),  # added a bit more explanation from the waiter
    ("male",   "They are all good."),
    ("female", "I'm hungry, but I only have six hundred yen. I'll take a sandwich and a coffee, please."),
    # Note: to give Ken a male voice, change the line above to ("male", ...)
],

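# --- Q3: Tom tells a friend about tomorrow's plans ---
# Original content: tennis on Sunday morning → homework after lunch → dinner out with grandparents in the evening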
"q3": [
    ("female", "Hey Tom, what will you do tomorrow?"),
    ("male",   "Tomorrow is Sunday. In the morning, I will play tennis with my father."),
    ("female", "Sounds nice. What will you do after lunch?"),
    ("male",   "After lunch, I will do my homework."),
    ("female", "Will you stay home in the evening?"),
    ("male",   "No. In the evening, I will have dinner with my grandparents at a restaurant.")
],

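# --- Q4: Emily tells a friend about her Saturday plans ---
# Original content: library in the morning → lunch with Kate at 1 p.m. → shopping in the afternoon → studying English at night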
"q4": [
    ("male",   "Emily, what are you going to do on Saturday?"),
    ("female", "On Saturday, I will go to the library in the morning to return some books."),
    ("male",   "I see. Will you meet someone?"),
    ("female", "Yes. I will meet my friend Kate at one p.m. to have lunch."),
    ("male",   "What will you do after lunch?"),
    ("female", "In the afternoon, we will go shopping together. At night, I will study English at home.")
]
}

def tts_to_file(text: str, voice: str, filename: str):
    """Convert one line of dialogue to speech and save it as an mp3."""
    resp = client.audio.speech.create(
        model="gpt-4o-mini-tts",
        voice=voice,
        input=text,
        response_format="mp3",
    )
    with open(filename, "wb") as f:
        f.write(resp.read())

for qname, lines in dialogues.items():
    print(f"🔊 generating {qname} …")

    segments = []

    for i, (speaker, text) in enumerate(lines):
        voice = VOICE_MALE if speaker == "male" else VOICE_FEMALE
        part_file = f"{qname}_part_{i}.mp3"

        # Generate one line at a time
        tts_to_file(text, voice, part_file)

        # Load it and append 0.5 seconds of silence
        seg = AudioSegment.from_mp3(part_file)
        seg = seg + AudioSegment.silent(duration=500)
        segments.append(seg)

    # Concatenate the lines into one full pass of the conversation
    one_round = segments[0]
    for seg in segments[1:]:
        one_round += seg

    # "Play once → pause 1 second → play again"
    final_audio = one_round + AudioSegment.silent(duration=1000) + one_round

    # Export as mp3
    out_file = f"{qname}.mp3"
    final_audio.export(out_file, format="mp3")
    print(f" saved: {out_file}")
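Every entry in dialogues has to be a list of (speaker, text) pairs, because the loop unpacks them that way. A malformed entry only fails when the loop reaches it, possibly after earlier API calls have already been billed. As a rough sketch of a check that could be run before any request is sent, here is a small validator. The function name validate_dialogues and its messages are my own invention for illustration; they are not part of openai or pydub.

```python
def validate_dialogues(dialogues):
    """Return a list of problems found; an empty list means the data looks usable."""
    problems = []
    for qname, lines in dialogues.items():
        # Each question must map to a list of lines.
        if not isinstance(lines, list):
            problems.append(f"{qname}: expected a list of (speaker, text) pairs")
            continue
        for i, entry in enumerate(lines):
            # Each line must be a 2-tuple so it can be unpacked in the loop.
            if not (isinstance(entry, tuple) and len(entry) == 2):
                problems.append(f"{qname}[{i}]: expected a (speaker, text) pair")
                continue
            speaker, text = entry
            if speaker not in ("male", "female"):
                problems.append(f"{qname}[{i}]: unknown speaker {speaker!r}")
            if not isinstance(text, str) or not text.strip():
                problems.append(f"{qname}[{i}]: text is empty")
    return problems


# Small stand-in data set with two deliberate mistakes.
sample = {
    "ok": [("male", "Hello."), ("female", "Hi.")],
    "bad": [("robot", "Beep."), ("female", "")],
}
for problem in validate_dialogues(sample):
    print(problem)
```

In the script above, this could be called right after dialogues is defined, stopping early if the returned list is not empty.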
