ElevenLabs speech-to-text with Scribe models and forced alignment via the inference.sh CLI. Models: Scribe v1/v2 (98%+ accuracy, 90+ languages). Capabilities: transcription, speaker diarization, audio event tagging, word-level timestamps, forced alignment, subtitle generation. Use f…
Best for: Extracting searchable transcripts from recordings, interviews, or meetings without manual cleanup.
Creator's repository · inference-sh/skills