Using AI to Generate Subtitles for Your Videos

How to Use OpenAI and FFmpeg to Generate Subtitles for Any Video

This is a behind-the-scenes look at how to use the OpenAI API to create subtitles and how to burn them directly into a video.

This functionality is a core part of Camera and Me, the Manatee, and in this tutorial, I’ll walk you through building it yourself. You’ll need FFmpeg installed, an OpenAI account with an API key, and a Node.js environment.

Core Operations with FFmpeg

FFmpeg is a command-line tool for converting audio and video from one format to another. In this tutorial, we will first use FFmpeg to extract the audio from a video file. Then, after obtaining subtitles from OpenAI, we will use FFmpeg again to burn the subtitles into the video.

Extracting Audio from Video

Assuming you have a video file named myvideo.webm, the first step is to extract the audio. The OpenAI API limits the size of uploaded files, so compress the audio as much as is practical.

ffmpeg -i myvideo.webm -ac 1 -b:a 16k -map a output.webm

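FFmpeg parameters explained:

-i myvideo.webm: the input file.
-ac 1: downmix the audio to a single (mono) channel.
-b:a 16k: set the audio bitrate to 16 kbit/s.
-map a: select only the audio streams, so the video track is dropped.
output.webm: the output file.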

The result is a compressed audio file named output.webm, ready for subtitle generation.
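To get a feel for how much the 16k bitrate helps, here is a rough arithmetic sketch. It assumes an upload cap of 25 MB, which is my reading of OpenAI's published limit and may change, so check the current documentation. The function names are made up for this example, and the estimate ignores container overhead and the fact that the encoder treats the bitrate as a target rather than an exact value.

```javascript
// Assumption: 25 MB upload cap (verify against OpenAI's current documentation).
const MAX_UPLOAD_BYTES = 25 * 1024 * 1024;

// Estimated size in bytes of audio encoded at a constant bitrate.
function estimateAudioBytes(bitrateKbps, durationSeconds) {
  return (bitrateKbps * 1000 * durationSeconds) / 8;
}

// Longest duration in whole seconds that stays under the cap at a given bitrate.
function maxDurationSeconds(bitrateKbps, maxBytes = MAX_UPLOAD_BYTES) {
  return Math.floor((maxBytes * 8) / (bitrateKbps * 1000));
}

const tenMinutes = 10 * 60;
console.log(estimateAudioBytes(16, tenMinutes)); // 1200000 bytes, about 1.2 MB
console.log(maxDurationSeconds(16)); // 13107 seconds, a bit over 3.6 hours
```

Under these assumptions, a ten-minute recording is nowhere near the cap, and only recordings of several hours would need to be split before upload.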

Generating Subtitles with OpenAI

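The OpenAI API offers a straightforward endpoint for transcribing audio. Although there are libraries for several programming languages, here's an example in Node.js. It uses import syntax, so install the openai package and run the script as an ES module (for example, by saving it with an .mjs extension):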

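import OpenAI, { toFile } from 'openai';
import fs from 'fs';

const openai = new OpenAI({
  apiKey: process.env['OPENAI_API_KEY'],  // This is the default and can be omitted
});

async function main() {
  // readFileSync is synchronous and returns a Buffer, so no await is needed
  const file = fs.readFileSync("output.webm");
  const result = await openai.audio.transcriptions.create({
    model: 'whisper-1',
    response_format: 'vtt',
    // toFile returns a promise, so it must be awaited before being passed in
    file: await toFile(file, "output.webm")
  });
  // With response_format 'vtt', the result is the subtitle text itself
  fs.writeFileSync("subtitles.vtt", result); // Save subtitles
}

main().catch(console.error); // Report API or file errors instead of failing silently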

This script reads the audio file, sends it to OpenAI, and retrieves the subtitles in VTT format, which are then saved to subtitles.vtt.
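Before burning the subtitles in, it can be useful to sanity-check the file. The following is a minimal sketch of a VTT cue parser, assuming the simple layout of a WEBVTT header followed by cues separated by blank lines. The parseVtt function and the sample text are hypothetical, written for this example, and the parser is not a full implementation of the WebVTT format.

```javascript
// Parse VTT text into an array of { start, end, text } cues.
function parseVtt(vttText) {
  // Normalize line endings, then split into blocks separated by blank lines.
  const blocks = vttText.replace(/\r\n/g, '\n').trim().split(/\n{2,}/);
  const cues = [];
  for (const block of blocks) {
    const lines = block.split('\n');
    const timingIndex = lines.findIndex((line) => line.includes('-->'));
    if (timingIndex === -1) continue; // Skip the WEBVTT header and other non-cue blocks
    const [start, rest] = lines[timingIndex].split('-->').map((part) => part.trim());
    const end = rest.split(/\s+/)[0]; // Drop any cue settings after the end time
    cues.push({ start, end, text: lines.slice(timingIndex + 1).join('\n') });
  }
  return cues;
}

// Hypothetical sample, not real API output.
const sample = `WEBVTT

00:00:00.000 --> 00:00:02.500
Hello and welcome.

00:00:02.500 --> 00:00:05.000
This is a test.
`;

const cues = parseVtt(sample);
console.log(cues.length); // 2
console.log(cues[0].text); // Hello and welcome.
```

A quick check such as confirming that the cue count is greater than zero catches an empty or malformed transcription before the slower re-encoding step.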

Burning Subtitles into the Video

To burn the subtitles into the video, use the FFmpeg subtitles filter, which requires an FFmpeg build with libass enabled:

ffmpeg -i myvideo.webm -vf "subtitles=subtitles.vtt" myvideo-withsubtitles.webm

This process re-encodes the video with the subtitles drawn into the picture, which may take some time.
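If you would rather drive both FFmpeg steps from Node.js than from the shell, one possible sketch follows. The helper names are hypothetical, and the argument lists simply mirror the two commands used in this tutorial. It assumes ffmpeg is on the PATH and that the subtitle file name contains no characters that need escaping inside an FFmpeg filter.

```javascript
// Arguments for extracting mono, low-bitrate audio (mirrors the extraction command).
function extractAudioArgs(input, output) {
  return ['-i', input, '-ac', '1', '-b:a', '16k', '-map', 'a', output];
}

// Arguments for burning a subtitle file into the video (mirrors the burn-in command).
function burnSubtitlesArgs(input, subtitles, output) {
  return ['-i', input, '-vf', `subtitles=${subtitles}`, output];
}

// Run ffmpeg with the given arguments; rejects if ffmpeg exits with an error.
// Dynamic import keeps this snippet usable from both ES modules and CommonJS.
async function runFfmpeg(args) {
  const { execFile } = await import('node:child_process');
  const { promisify } = await import('node:util');
  return promisify(execFile)('ffmpeg', args);
}

// Print the argument list without running anything.
console.log(burnSubtitlesArgs('myvideo.webm', 'subtitles.vtt', 'myvideo-withsubtitles.webm').join(' '));

// Example call (requires ffmpeg and the input files to exist):
// await runFfmpeg(extractAudioArgs('myvideo.webm', 'output.webm'));
```

Passing the arguments as an array to execFile avoids shell quoting, which is why the filter string needs no surrounding quotes here.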

Taking a Shortcut with Camera and Me, the Manatee

All these steps can be streamlined using Camera and Me, the Manatee. Simply record a new video or upload an existing one to the Manatee cloud. Then click the "create subtitles" icon under the video to generate subtitles automatically. Finally, click the "mp4" icon and select "burn subtitles" to embed them directly into your video.