Both Descript and CapCut can auto-transcribe your course video and generate styled captions in minutes. You upload your video, the AI produces a transcript, you review and correct it, then export the video with captions burned in or as a separate SRT file. The whole process takes less time than manually typing a transcript for a single lesson.
What you’ll walk away with:
- Accurately captioned course videos ready to upload
- SRT files for toggleable subtitles on your course platform
- A custom vocabulary list for your field’s terminology
- Accessible content that works for every student
Why add captions to your course videos
Captions are not optional decoration. They are an accessibility requirement for a meaningful share of your students: anyone who is deaf or hard of hearing, anyone learning in a second language, and anyone watching somewhere noisy or somewhere quiet where they cannot play audio out loud. Industry surveys consistently find that over 80% of people who use captions are not deaf or hard of hearing; they use them by choice for comprehension and convenience.
The learning impact is measurable. Research on multimedia learning consistently shows that adding text alongside audio improves retention and comprehension by 25-40%, particularly when the content includes unfamiliar vocabulary or complex concepts — which describes most course material. If you teach anything where terminology matters, captions give your students a second channel to process what you are saying.
There is also a practical reality: many of your students watch video without sound. On a commute. During a lunch break at work. While their kids are sleeping. If your course videos have no captions, those students get nothing. Captions turn a silent video into a usable lesson.
Adding captions with Descript
Descript is a video and audio editor built around transcription. When you import a video, it automatically generates a transcript and lets you edit the video by editing the text. That same transcript becomes the foundation for your captions.
Import your video and auto-transcribe
Open Descript, create a new project, and drag your video file in. Descript will automatically transcribe the audio. For a 10-minute video, this usually takes under a minute. The transcript appears in the editor as editable text synchronized to the video timeline.
Review and correct the transcript
Play through the video and read along with the transcript. Fix any words the AI got wrong — technical terms, proper nouns, and acronyms are the most common errors. Descript highlights low-confidence words, which helps you focus your review on the parts most likely to need correction. This is the most important step. Do not skip it.
Style your captions
In Descript's captions panel, choose a visual style for how the text appears on screen. You can adjust font, size, color, background opacity, and position. For course videos, readability matters more than aesthetics — a clean sans-serif font at a generous size with a semi-transparent dark background is usually the right call.
Choose burned-in or SRT export
Descript gives you two options. You can export the video with captions permanently embedded in the image (burned-in), or you can export a separate SRT or VTT subtitle file that platforms can display as a toggleable overlay. For course platforms, the SRT route is usually better — it lets students turn captions on or off.
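For reference, an SRT file is plain text: numbered cues, each with a start and end timestamp followed by one or more lines of caption text, separated by blank lines. The sketch below reads that layout in Python. The sample cues and the `parse_srt` helper are illustrative assumptions, not something Descript produces or requires.

```python
import re
from dataclasses import dataclass

# Illustrative sample only; a real export contains your own text and timings.
SAMPLE_SRT = """\
1
00:00:00,000 --> 00:00:03,200
Welcome to lesson one.

2
00:00:03,200 --> 00:00:07,500
Today we cover ujjayi breath
and why it matters.
"""

# SRT timestamps look like HH:MM:SS,mmm (comma before the milliseconds).
TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})")


@dataclass
class Cue:
    index: int
    start: str
    end: str
    text: str


def parse_srt(srt_text: str) -> list[Cue]:
    """Split SRT text into cues. Assumes blocks are separated by blank lines."""
    cues = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip blocks without index, timing, and text
        match = TIMING.match(lines[1].strip())
        if not match:
            continue  # skip blocks whose second line is not a timing line
        cues.append(
            Cue(
                index=int(lines[0].strip()),
                start=match.group(1),
                end=match.group(2),
                text="\n".join(lines[2:]),
            )
        )
    return cues


cues = parse_srt(SAMPLE_SRT)
for cue in cues:
    print(cue.index, cue.start, cue.end, repr(cue.text))
```

Because the timing lines and the text lines are separate, caption wording can be changed later without affecting when each caption appears.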
Export and upload
Export your video (with or without burned-in captions) and your subtitle file. Upload the video to your course platform and attach the SRT file if your platform supports it. If you are uploading to YouTube or Vimeo and embedding in your course, both platforms accept SRT uploads in their subtitle settings.
Adding captions with CapCut
CapCut is a free video editor from ByteDance (the company behind TikTok) that includes built-in auto-captioning. It is simpler than Descript — fewer editing features, but a fast path from video to captioned export.
Import your video and generate auto-captions
Open CapCut's desktop app, create a new project, and import your video. In the text panel, select "Auto captions" and choose your language. CapCut transcribes the audio and places caption text blocks on the timeline, synced to your speech.
Review and correct
Click through the caption blocks and fix errors. CapCut's caption editor shows each text segment alongside its timecode, so you can read sequentially and catch mistakes. As with Descript, pay particular attention to specialized vocabulary — the AI was not trained on your field's jargon.
Choose a style and adjust
CapCut offers preset caption styles ranging from minimal to animated. For course content, pick something clean and readable. You can customize font, size, color, outline, and shadow. Avoid animated word-by-word highlights unless your course specifically benefits from that style — for most educational content, static block captions are easier to read.
Export
CapCut exports video with captions burned in. If you need a separate SRT file, CapCut does not natively export one — you would need to use a third-party tool or switch to Descript for that. For videos where burned-in captions are fine (social clips, standalone lessons), CapCut handles the job cleanly and for free.
The human layer
Auto-transcription gets you 90% of the way, but the last 10% requires your eyes and ears. AI transcription models are trained mostly on general speech, so they are unreliable with your field's terminology, your students' names, the specific frameworks you teach, and the branded language you use in your course.
A yoga instructor teaching about "ujjayi breath" will see it transcribed as "OG eye breath" or "you jai breath." A therapist discussing "EMDR" might get "EM dear." A business coach who mentions "Ruzuku" will almost certainly see it mangled. These errors are not edge cases — they happen in every course video that uses specialized language.
The fix is simple: watch your video with the transcript open and correct what the AI got wrong. Budget about 1.5 times the video length for the full pass, so a 10-minute lesson takes roughly 15 minutes to caption, review, and export. It is not glamorous work, but it is the difference between captions that help your students and captions that confuse them.
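To plan a whole course, the 1.5x rule of thumb can be turned into quick arithmetic. This is a minimal Python sketch; the lesson names and lengths are hypothetical, and the factor is only the rough estimate given above, not a measured value.

```python
def review_budget_minutes(video_minutes: float, factor: float = 1.5) -> float:
    """Estimate captioning time as a multiple of the video length."""
    return video_minutes * factor


# Hypothetical lesson lengths in minutes.
lessons = {"Lesson 1": 10, "Lesson 2": 8, "Lesson 3": 14}

budget = {name: review_budget_minutes(minutes) for name, minutes in lessons.items()}
total_minutes = sum(budget.values())

for name, minutes in budget.items():
    print(f"{name}: about {minutes:.0f} minutes")
print(f"Whole course: about {total_minutes:.0f} minutes")
```

If your first few lessons take noticeably longer or shorter than the estimate, adjust `factor` to match your own pace.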
Course creator tips
Build a custom vocabulary list
Before you start captioning, write down every specialized term, proper noun, and acronym that appears in your course. Keep this list open while reviewing transcripts. Descript lets you add custom vocabulary to improve future transcriptions — if you teach a multi-lesson course, the AI gets better as you go. CapCut does not have this feature, so you will need to correct the same terms manually each time.
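If you end up with caption text outside the editor (for example an exported SRT file or a plain-text transcript), the same vocabulary list can drive a find-and-replace pass. The following is a rough Python sketch under that assumption. The `CORRECTIONS` map and the `apply_vocabulary` helper are hypothetical, and you would fill the map with the mis-transcriptions you actually see.

```python
import re

# Hypothetical correction map: what the AI wrote -> what you actually said.
CORRECTIONS = {
    "OG eye breath": "ujjayi breath",
    "you jai breath": "ujjayi breath",
    "EM dear": "EMDR",
    "poly vagal": "polyvagal",
}


def apply_vocabulary(text: str, corrections: dict[str, str]) -> str:
    """Replace each known mis-transcription, ignoring case, on word boundaries."""
    for wrong, right in corrections.items():
        pattern = re.compile(r"\b" + re.escape(wrong) + r"\b", re.IGNORECASE)
        # A function replacement inserts the term literally, with no escape handling.
        text = pattern.sub(lambda match, term=right: term, text)
    return text


fixed = apply_vocabulary(
    "Start with og eye breath, then we discuss EM dear.", CORRECTIONS
)
print(fixed)
```

A pass like this only catches errors you have already seen, so it shortens the review rather than replacing it.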
Use a readable font at a generous size
Your students may be watching on phones, tablets, or small laptop screens. Captions that look fine on your 27-inch monitor can be unreadable on a phone. Test your captioned video on the smallest screen your students are likely to use. A minimum of 24-point equivalent with a contrasting background is a good baseline.
Know the burned-in vs. SRT tradeoff
Burned-in captions are permanent — if you find a typo after exporting, you have to re-export the entire video. SRT files can be edited in any text editor without touching the video. If your course content is likely to be updated, or if you want to offer captions in multiple languages later, SRT is the more flexible choice. If you are creating short social clips where universal visibility matters more than editability, burned-in is simpler.
What it gets wrong
Both Descript and CapCut struggle with the same categories of speech. Knowing what to watch for makes your review faster.
- Technical terms and jargon: any word outside common English vocabulary will likely be wrong on first pass. "Polyvagal" becomes "poly vagal" or "polyva goal." The yoga term "asana" becomes "Asana" (the project management tool) or just "a sauna."
- Proper nouns and names: student names, book titles, researcher names, and brand names are frequently garbled. "Vygotsky" becomes "vuh GOT ski." Your own name may not survive intact.
- Acronyms: "ADHD" usually works. "EMDR," "CBT," "IFS," and niche acronyms often do not. The AI may try to spell them as words instead of letters.
- Fast speech and overlapping audio: if you tend to speak quickly or if there is background noise, transcription accuracy drops noticeably. Recording in a quiet space at a moderate pace improves both the AI's accuracy and your students' comprehension.
- Accented English: transcription models still perform better on American and British English than on other accents. If you or your guest speakers have accents that the model handles poorly, plan for extra review time.
Frequently asked questions
Should I burn captions into the video or use a separate SRT file?
It depends on where your video lives. Burned-in captions appear on every platform and device without extra setup, which makes them reliable for social media and downloaded files. SRT files give students the option to toggle captions on or off and let you update text without re-exporting. For course lessons, SRT is usually the better choice. For promotional clips, burned-in is more practical.
How accurate is AI auto-transcription for course videos?
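Roughly 90-95% accurate on clear, single-speaker English audio. Accuracy drops with technical jargon, proper nouns, fast speech, and background noise. Always review the transcript before publishing. On a 10-minute video, expect about 3-5 minutes of actual correction on top of the time it takes to watch the lesson through, which is consistent with budgeting about 1.5 times the video length overall.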
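If you want a number for your own recordings, one common way to quantify transcription accuracy is word error rate (WER): the word-level edit distance between the AI transcript and your corrected version, divided by the number of words in the corrected version. The Python sketch below assumes that definition; the figures quoted above may have been measured differently, and the sample sentences are invented for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] = edits needed to turn the first j hypothesis words into
    # the reference words processed so far (classic Levenshtein table, row by row).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # reference word missing from the AI transcript
                    curr[j - 1] + 1,     # extra word in the AI transcript
                    prev[j - 1] + cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# Invented example, punctuation removed so words compare cleanly.
corrected = "begin with ujjayi breath and relax your shoulders"
ai_output = "begin with you jai breath and relax your shoulders"

wer = word_error_rate(corrected, ai_output)
print(f"WER: {wer:.2f}, word accuracy: {1 - wer:.2f}")
```

One mangled term in a short sentence has a large effect here; over a full lesson the same handful of errors is spread across far more words.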
Do I need to pay for Descript or CapCut to add captions?
CapCut's desktop app includes auto-captions for free. Descript's free plan gives you one hour of transcription per month. For a full course, Descript's Hobbyist plan at $24/month adds unlimited transcription. If cost is the deciding factor, start with CapCut.
Your captioned videos need a home
Once your videos have clean, accurate captions, the next question is where students will actually watch them. Ruzuku includes built-in video hosting, so you upload your captioned files directly into lessons without managing a separate Vimeo or Wistia account. Your videos live alongside discussion prompts, exercises, and supporting materials — everything in one place.
If you are just getting started, our step-by-step guide walks you through building your first course from scratch, including how to structure video lessons for maximum impact.
Related guides
- How to Edit Course Videos Using Descript's AI Features — full video editing workflow beyond captions
- How to Edit Course Videos with CapCut — CapCut's broader editing tools for course creators
- How to Record Course Videos with Descript — recording and screen capture before editing
- How to Create Your First Online Course — complete guide from idea to launch
- Ruzuku Course Builder — upload videos directly into lessons with built-in hosting