Every short-form video platform runs on the same physics. The first three seconds either earn the next thirty or they do not. There is no second chance, no slow build, and no setup that pays off later, because by then the viewer is already gone. Meta's 2024 internal retention data on a sample of 8.4 billion Reels, summarized in their public Creator Insights report, showed that 71 percent of swipe-aways happen inside the first three seconds and 84 percent inside the first five. TikTok's audience network reports show similar numbers. YouTube Shorts is slightly more forgiving but still front-loaded.

That changes everything about how a short-form video should be constructed. The first three seconds are not an introduction. They are the entire reason the rest of the video gets a chance. Every other production choice is downstream of the hook.

A good hook does three things at once. It tells the viewer what the video is about in one sentence or one visual. It creates a specific tension or curiosity gap that demands resolution. And it earns enough trust that the viewer believes the resolution is worth waiting for. Most creators try to do all three with copy and ignore the visual. That is a mistake. The hook is the headline plus the cover image plus the first beat of action, all running in parallel.

There are five hook patterns that consistently outperform on retention data from 2025 and 2026 testing.

The contrarian claim opens with a statement that contradicts what most viewers believe. Something like "rest days are slowing you down" or "tipping is making restaurants worse." The viewer's brain has to keep watching to find out if you can defend it.

The list promise opens with a numbered structure the viewer can mentally place. "Three things every founder gets wrong in year one." The brain commits to the structure and waits for the items.

The before-after opens with the result the viewer wants. "I went from 16 percent body fat to 9 percent in 14 weeks. Here is what I cut." The viewer is anchored on the destination before the journey starts.

The named enemy opens by identifying a specific obstacle. "Your morning routine is breaking your testosterone." The pronoun "your" forces a self-check.

The asked question opens with a question the viewer has not been able to articulate. "Why does your coffee taste flat after 10 days?" The brain cannot resist closure on a real question.

What does not work: vague opens, slow zooms, music intros, logos, talking-head close-ups with no movement, long company introductions, and anything else that delays the actual content for more than 1.5 seconds. The watch-time data is unforgiving on this point.

The visual layer matters as much as the words. The first frame should already be in motion or already at peak interest. If the talent is on camera, they should be mid-sentence and mid-gesture, not staring at the lens waiting for the count-in. Cut the count-in in post. Start the file at the first audible word. If the video has a B-roll opener, the B-roll itself should carry tension. A static shot of a gym is dead. A pull-up rep at the top of the bar is alive.

Captions on screen should match the spoken words within 200 milliseconds. Viewers often read faster than they process audio, and when sound is off the captions carry the entire message. Sound is off for 85 percent of feed views on Instagram and 67 percent on TikTok, according to 2025 platform-published data. The caption font should be high contrast against the background: sans-serif, at a minimum height of 80 pixels for vertical 1080 by 1920 video. Text should sit in the upper third or center of the frame, not at the bottom where the platform UI covers it.
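The caption rules above can be checked mechanically before export. The following is a minimal sketch, not a tool any platform provides: the `Caption` fields and the `caption_problems` function are hypothetical names, and reading "upper third or center" as "text centered no lower than two thirds of the way down the frame" is an assumption.

```python
from dataclasses import dataclass

FRAME_HEIGHT = 1920   # vertical 1080 by 1920 video, as in the text
MAX_SYNC_MS = 200     # caption must appear within 200 ms of the spoken words
MIN_TEXT_PX = 80      # minimum caption text height in pixels
# Assumption: "upper third or center" is read as "not in the bottom third".
LOWEST_CENTER_Y = FRAME_HEIGHT * 2 // 3


@dataclass
class Caption:
    spoken_ms: int   # when the words are spoken
    shown_ms: int    # when the caption appears on screen
    height_px: int   # rendered text height
    center_y: int    # vertical center of the text, 0 = top of frame


def caption_problems(caption: Caption) -> list:
    """Return a list of problems; an empty list means the caption passes."""
    problems = []
    if abs(caption.shown_ms - caption.spoken_ms) > MAX_SYNC_MS:
        problems.append("out of sync")
    if caption.height_px < MIN_TEXT_PX:
        problems.append("text too small")
    if caption.center_y > LOWEST_CENTER_Y:
        problems.append("too low in frame")
    return problems


good = Caption(spoken_ms=1000, shown_ms=1150, height_px=90, center_y=600)
bad = Caption(spoken_ms=1000, shown_ms=1300, height_px=60, center_y=1700)
print(caption_problems(good))
print(caption_problems(bad))
```

Contrast against the background is left out on purpose, since it depends on the footage behind each caption and is easier to judge by eye.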

For Wesley's Lumina podcast clips, the hook test is simple. Open the file. Watch the first three seconds with sound off. If you cannot tell what the video is about and you do not feel pulled to watch the next ten seconds, the hook does not work. Recut. The most common fix is to scrub forward into the source clip, starting 10 to 30 seconds in, and find a punchier line that can serve as the cold open. The original opening sentence is rarely the best three seconds in the conversation. The best line is often buried deeper, around 90 seconds in.

A practical workflow looks like this. Pull the raw clip into Descript, CapCut, or Premiere. Skim the transcript and mark the three or four punchiest lines. Build the cold open from one of those, then cut back to the chronological order of the conversation for the rest of the clip. The viewer enters at peak interest, gets oriented in seconds two through five, and stays through the explanation because they are already invested.
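The reordering step in this workflow can be written down as a simple cut list. This is a sketch under stated assumptions: `build_cut_list` is a hypothetical helper, not a feature of Descript, CapCut, or Premiere, times are in seconds from the start of the source clip, and the hook line is assumed to be dropped from its original position so it is not heard twice.

```python
def build_cut_list(clip_start, clip_end, hook_start, hook_end):
    """Return (start, end) segments in playback order: cold open first,
    then the rest of the clip in chronological order."""
    if not (clip_start <= hook_start < hook_end <= clip_end):
        raise ValueError("hook must lie inside the clip")

    cuts = [(hook_start, hook_end)]            # cold open
    if clip_start < hook_start:
        cuts.append((clip_start, hook_start))  # conversation before the hook
    if hook_end < clip_end:
        cuts.append((hook_end, clip_end))      # conversation after the hook
    return cuts


# Example: a 120-second clip whose punchiest line runs from 90 s to 94 s.
cuts = build_cut_list(0, 120, 90, 94)
total = sum(end - start for start, end in cuts)
print(cuts, total)
```

Choosing which marked line becomes the hook stays a judgment call; the sketch only captures the ordering once that choice is made.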

Test 10 hooks per week for a month. Track 3-second, 5-second, and 30-second retention separately in the platform analytics. The hooks that still hold more than 60 percent of viewers at the 30-second mark are your templates. Build the next 90 days of content around them. Stop guessing. The data is there.
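The tracking itself needs nothing more than a spreadsheet, but the selection rule is easy to sketch in code. The example below assumes you have copied raw viewer counts out of the platform analytics by hand. The `HookResult` record, its field names, and the sample numbers are all hypothetical, and real dashboards may report percentages directly instead of counts.

```python
from dataclasses import dataclass


@dataclass
class HookResult:
    name: str
    views: int        # viewers who started the video
    watched_3s: int   # viewers still watching at 3 seconds
    watched_5s: int   # viewers still watching at 5 seconds
    watched_30s: int  # viewers still watching at 30 seconds


def retention(result: HookResult) -> dict:
    """Retention at each checkpoint as a fraction of starting views."""
    return {
        "3s": result.watched_3s / result.views,
        "5s": result.watched_5s / result.views,
        "30s": result.watched_30s / result.views,
    }


def template_hooks(results, threshold=0.60):
    """Names of hooks whose 30-second retention is above the threshold."""
    return [
        r.name
        for r in results
        if r.views > 0 and r.watched_30s / r.views > threshold
    ]


# Made-up sample numbers for illustration only.
week = [
    HookResult("contrarian claim", 1000, 820, 760, 650),
    HookResult("list promise", 1000, 700, 610, 600),
    HookResult("asked question", 1000, 540, 430, 210),
]
print(template_hooks(week))
```

Keeping the three checkpoints separate matters because a hook can win the first three seconds and still lose the viewer before the payoff.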