Generate subtitles and captions for videos using each::sense AI. Create auto-generated subtitles, multi-language captions, animated TikTok-style text, SRT/VTT exports, speaker diarization, and burned-in subtitles.
---
name: subtitle-generation
description: Generate subtitles and captions for videos using each::sense AI. Create auto-generated subtitles, multi-language captions, animated TikTok-style text, SRT/VTT exports, speaker diarization, and burned-in subtitles.
metadata:
author: eachlabs
version: "2.0"
---
# Subtitle Generation
Generate professional subtitles and captions for videos using each::sense. This skill creates accurate transcriptions, multi-language subtitles, animated captions, and various export formats optimized for social media, video production, and accessibility.
## Features
- **Auto-Generated Subtitles**: Automatic speech-to-text transcription with accurate timing
- **Multi-Language Generation**: Generate subtitles in multiple languages from audio
- **Animated Captions**: TikTok/Instagram-style animated word-by-word captions
- **SRT/VTT Export**: Standard subtitle formats for editing and distribution
- **Speaker Diarization**: Identify and label different speakers in conversations
- **Subtitle Translation**: Translate existing subtitles to other languages
- **Burned-In Subtitles**: Render subtitles directly into video (hardcoded)
- **Karaoke Style**: Word-by-word highlighting for music and lyric videos
- **Timing Adjustment**: Fine-tune subtitle timing and synchronization
- **Batch Processing**: Generate subtitles for multiple videos at once
## Quick Start
```bash
curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \
-H "Content-Type: application/json" \
-H "X-API-Key: $EACHLABS_API_KEY" \
-H "Accept: text/event-stream" \
-d '{
"messages": [{"role": "user", "content": "Generate subtitles for this video with accurate timestamps"}],
"model": "eachsense/beta",
"stream": true,
"mode": "max",
"file_urls": ["https://example.com/my-video.mp4"]
}'
```
## Subtitle Formats & Outputs
| Format | Extension | Use Case |
|--------|-----------|----------|
| SRT | .srt | Universal, most video players and editors |
| VTT | .vtt | Web video, HTML5 players, YouTube |
| Burned-In | .mp4 | Social media, no player support needed |
| JSON | .json | Custom applications, programmatic access |
| ASS/SSA | .ass | Advanced styling, anime subtitles |
## Use Case Examples
### 1. Auto-Generate Subtitles from Video
Automatically transcribe speech from a video file with accurate word-level timestamps.
```bash
curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \
-H "Content-Type: application/json" \
-H "X-API-Key: $EACHLABS_API_KEY" \
-H "Accept: text/event-stream" \
-d '{
"messages": [{"role": "user", "content": "Transcribe this video and generate subtitles with accurate timestamps. Output as SRT format. The video contains English speech."}],
"model": "eachsense/beta",
"stream": true,
"mode": "max",
"file_urls": ["https://example.com/interview-video.mp4"]
}'
```
### 2. Multi-Language Subtitle Generation
Generate subtitles in multiple languages directly from the audio.
```bash
curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \
-H "Content-Type: application/json" \
-H "X-API-Key: $EACHLABS_API_KEY" \
-H "Accept: text/event-stream" \
-d '{
"messages": [{"role": "user", "content": "Generate subtitles for this video in English, Spanish, and French. Provide separate SRT files for each language. The original audio is in English."}],
"model": "eachsense/beta",
"stream": true,
"mode": "max",
"file_urls": ["https://example.com/product-demo.mp4"]
}'
```
### 3. Animated/Styled Captions (TikTok Style)
Create eye-catching animated captions with word-by-word highlighting, popular on TikTok and Instagram Reels.
```bash
curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \
-H "Content-Type: application/json" \
-H "X-API-Key: $EACHLABS_API_KEY" \
-H "Accept: text/event-stream" \
-d '{
"messages": [{"role": "user", "content": "Add TikTok-style animated captions to this video. Use bold white text with black outline, word-by-word pop animation, centered at the bottom third of the screen. Make it trendy and engaging."}],
"model": "eachsense/beta",
"stream": true,
"mode": "max",
"file_urls": ["https://example.com/short-form-content.mp4"]
}'
```
### 4. SRT/VTT Export
Generate clean subtitle files in standard formats for use in video editors or streaming platforms.
```bash
curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \
-H "Content-Type: application/json" \
-H "X-API-Key: $EACHLABS_API_KEY" \
-H "Accept: text/event-stream" \
-d '{
"messages": [{"role": "user", "content": "Transcribe this video and export subtitles in both SRT and VTT formats. Ensure proper line breaks (max 42 characters per line, 2 lines max). Include timestamps accurate to milliseconds."}],
"model": "eachsense/beta",
"stream": true,
"mode": "max",
"file_urls": ["https://example.com/documentary.mp4"]
}'
```
### 5. Speaker Diarization (Identify Speakers)
Generate subtitles that identify and label different speakers in conversations, interviews, or podcasts.
```bash
curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \
-H "Content-Type: application/json" \
-H "X-API-Key: $EACHLABS_API_KEY" \
-H "Accept: text/event-stream" \
-d '{
"messages": [{"role": "user", "content": "Generate subtitles for this podcast with speaker diarization. There are 2 speakers - identify them as Speaker 1 and Speaker 2 (or Host and Guest if you can determine roles). Format each line with the speaker label."}],
"model": "eachsense/beta",
"stream": true,
"mode": "max",
"file_urls": ["https://example.com/podcast-episode.mp4"]
}'
```
### 6. Subtitle Translation
Translate existing subtitles from one language to another while preserving timing.
```bash
curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \
-H "Content-Type: application/json" \
-H "X-API-Key: $EACHLABS_API_KEY" \
-H "Accept: text/event-stream" \
-d '{
"messages": [{"role": "user", "content": "Translate these English subtitles to Japanese. Preserve the original timing and format. Ensure natural Japanese phrasing rather than literal translation."}],
"model": "eachsense/beta",
"stream": true,
"mode": "max",
"file_urls": ["https://example.com/original-subtitles.srt"]
}'
```
### 7. Burned-In Subtitles
Render subtitles directly into the video file (hardcoded/embedded) so they appear without needing player support.
```bash
curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \
-H "Content-Type: application/json" \
-H "X-API-Key: $EACHLABS_API_KEY" \
-H "Accept: text/event-stream" \
-d '{
"messages": [{"role": "user", "content": "Generate subtitles for this video and burn them directly into the video. Use white text with black background box, Arial font, positioned at bottom center. Output a new video file with embedded subtitles."}],
"model": "eachsense/beta",
"stream": true,
"mode": "max",
"file_urls": ["https://example.com/social-media-clip.mp4"]
}'
```
### 8. Word-by-Word Karaoke Style
Create karaoke-style subtitles with word-by-word highlighting, perfect for music videos and lyric content.
```bash
curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \
-H "Content-Type: application/json" \
-H "X-API-Key: $EACHLABS_API_KEY" \
-H "Accept: text/event-stream" \
-d '{
"messages": [{"role": "user", "content": "Create karaoke-style subtitles for this music video. Display lyrics with word-by-word highlighting as they are sung. Use a gradient color change effect (from white to yellow) for the currently sung word. Center the text on screen."}],
"model": "eachsense/beta",
"stream": true,
"mode": "max",
"file_urls": ["https://example.com/music-video.mp4"]
}'
```
### 9. Subtitle Timing Adjustment
Fine-tune subtitle timing for better synchronization with audio.
```bash
# First, upload video and generate initial subtitles
curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \
-H "Content-Type: application/json" \
-H "X-API-Key: $EACHLABS_API_KEY" \
-H "Accept: text/event-stream" \
-d '{
"messages": [{"role": "user", "content": "Generate subtitles for this video"}],
"model": "eachsense/beta",
"stream": true,
"mode": "max",
"session_id": "subtitle-timing-project",
"file_urls": ["https://example.com/video-with-delay.mp4"]
}'
# Then adjust timing in the same session
curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \
-H "Content-Type: application/json" \
-H "X-API-Key: $EACHLABS_API_KEY" \
-H "Accept: text/event-stream" \
-d '{
"messages": [{"role": "user", "content": "The subtitles are appearing 500 milliseconds too early. Shift all subtitle timings forward by 500ms and regenerate the SRT file."}],
"model": "eachsense/beta",
"stream": true,
"session_id": "subtitle-timing-project"
}'
```
### 10. Batch Subtitle Generation
Generate subtitles for multiple videos in a single workflow.
```bash
curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \
-H "Content-Type: application/json" \
-H "X-API-Key: $EACHLABS_API_KEY" \
-H "Accept: text/event-stream" \
-d '{
"messages": [{"role": "user", "content": "Generate English subtitles for all these videos. Output SRT files for each. Use consistent formatting across all videos: max 2 lines, 42 characters per line, minimum 1 second display time per subtitle."}],
"model": "eachsense/beta",
"stream": true,
"mode": "max",
"file_urls": [
"https://example.com/episode-01.mp4",
"https://example.com/episode-02.mp4",
"https://example.com/episode-03.mp4"
]
}'
```
## Best Practices
### Transcription Quality
- **Clear Audio**: Best results with clear speech and minimal background noise
- **Language Hint**: Specify the source language for better accuracy
- **Speaker Count**: Mention number of speakers for better diarization
- **Context**: Provide context about the content (technical terms, names) for accuracy
### Subtitle Formatting
- **Line Length**: Keep lines under 42 characters for readability
- **Duration**: Each subtitle should display for 1-7 seconds
- **Lines Per Subtitle**: Maximum 2 lines per subtitle block
- **Reading Speed**: Target 150-180 words per minute for comfortable reading
### Animated Captions
- **Font Choice**: Bold, sans-serif fonts work best for short-form content
- **Contrast**: Use outlines or shadows for visibility on any background
- **Position**: Keep safe zones clear for platform UI elements
- **Animation**: Subtle animations are more readable than dramatic effects
### Translation
- **Cultural Adaptation**: Request localization, not just translation
- **Timing Flexibility**: Some languages need more time to read
- **Character Limits**: CJK languages often need fewer characters per line
## Prompt Tips for Subtitle Generation
When requesting subtitles, include these details in your prompt:
1. **Source Language**: What language is spoken in the video?
2. **Target Format**: SRT, VTT, burned-in, or animated?
3. **Style Requirements**: Font, color, position, animation
4. **Speaker Info**: Number of speakers, roles if known
5. **Special Terms**: Technical vocabulary, names, brands
6. **Output Languages**: Single language or multiple translations
### Example Prompt Structure
```
"Generate [format] subtitles for this video.
Source language: [language].
[Number] speakers: [roles if known].
Style: [font, color, position requirements].
Special terms to recognize: [names, technical words].
Additional requirements: [line length, timing, etc.]"
```
## Mode Selection
Ask your users before generating:
**"Do you want fast & cheap, or high quality?"**
| Mode | Best For | Speed | Quality |
|------|----------|-------|---------|
| `max` | Final subtitles, professional content, accuracy-critical | Slower | Highest |
| `eco` | Quick drafts, review copies, bulk processing | Faster | Good |
## Multi-Turn Subtitle Refinement
Use `session_id` to iterate on subtitle generation:
```bash
# Initial subtitle generation
curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \
-H "Content-Type: application/json" \
-H "X-API-Key: $EACHLABS_API_KEY" \
-H "Accept: text/event-stream" \
-d '{
"messages": [{"role": "user", "content": "Generate subtitles for this video with speaker identification"}],
"model": "eachsense/beta",
"stream": true,
"session_id": "subtitle-project-001",
"file_urls": ["https://example.com/interview.mp4"]
}'
# Refine based on feedback
curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \
-H "Content-Type: application/json" \
-H "X-API-Key: $EACHLABS_API_KEY" \
-H "Accept: text/event-stream" \
-d '{
"messages": [{"role": "user", "content": "Change Speaker 1 label to John and Speaker 2 to Sarah. Also fix the spelling of TensorFlow wherever it appears."}],
"model": "eachsense/beta",
"stream": true,
"session_id": "subtitle-project-001"
}'
# Add styling and export
curl -X POST https://eachsense-agent.core.eachlabs.run/v1/chat/completions \
-H "Content-Type: application/json" \
-H "X-API-Key: $EACHLABS_API_KEY" \
-H "Accept: text/event-stream" \
-d '{
"messages": [{"role": "user", "content": "Now create a burned-in version with the corrected subtitles. Use yellow text for John and cyan for Sarah."}],
"model": "eachsense/beta",
"stream": true,
"session_id": "subtitle-project-001"
}'
```
## Language Support
each::sense supports subtitle generation in 50+ languages including:
| Language | Code | Notes |
|----------|------|-------|
| English | en | US, UK, AU variants |
| Spanish | es | Latin American and European |
| French | fr | France and Canadian |
| German | de | |
| Japanese | ja | |
| Korean | ko | |
| Chinese | zh | Simplified and Traditional |
| Arabic | ar | RTL support |
| Hindi | hi | |
| Portuguese | pt | Brazilian and European |
## Error Handling
| Error | Cause | Solution |
|-------|-------|----------|
| `Failed to create prediction: HTTP 422` | Insufficient balance | Top up at eachlabs.ai |
| Transcription quality low | Poor audio quality | Provide cleaner audio source |
| Language detection failed | Mixed languages or unclear speech | Specify source language explicitly |
| Timeout | Long video or complex processing | Set client timeout to minimum 10 minutes |
## Client Configuration
**Important**: Subtitle generation can take significant time for long videos.
- **Minimum timeout**: 10 minutes (600 seconds)
- **Recommended**: Set timeout based on video length (2-3 minutes per minute of video)
- **Streaming**: Use SSE event handling to show progress
## Related Skills
- `each-sense` - Core API documentation
- `video-generation` - Generate videos with built-in captions
- `voice-audio` - Audio processing and speech synthesis
- `video-edit` - Video editing and post-production
Creator's repository · eachlabs/skills