TLDW — Too Long; Didn't Watch

An AI content intelligence pipeline that watches 51 YouTube channels so I don't have to. Daily-driven. The video at the top of this site is one of its outputs.

By the numbers

51 YouTube channels tracked
783+ videos processed
3 artifact types per video (transcript, summary, detailed)
Daily cron-driven pipeline

What it does

Every day, a scheduled job polls the YouTube Data API for new uploads across a curated set of strategy/AI/engineering channels. New videos are transcribed (Whisper, GPU-accelerated), classified, and run through two Claude models in sequence: Haiku for a fast TLDW summary, then Sonnet for a deeper analytical writeup. Output is committed to a git repo so every artifact is versioned, diffable, and immediately queryable by other Claude sessions.
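A minimal sketch of the "new uploads" step, assuming the poller returns a list of dicts with a `video_id` key and that processed IDs live in a small JSON state file. The function names, the dict shape, and the state file are illustrative assumptions, not the pipeline's actual code.

```python
import json
from pathlib import Path


def load_seen(state_path: Path) -> set[str]:
    """Return the IDs already processed (empty set on the first run)."""
    if not state_path.exists():
        return set()
    return set(json.loads(state_path.read_text(encoding="utf-8")))


def save_seen(state_path: Path, seen: set[str]) -> None:
    """Write to a temp file, then rename, so a crash cannot leave a partial file."""
    tmp = state_path.with_suffix(".tmp")
    tmp.write_text(json.dumps(sorted(seen)), encoding="utf-8")
    tmp.replace(state_path)


def select_new(uploads: list[dict], seen: set[str]) -> list[dict]:
    """Keep uploads whose video_id is unseen, preserving order and dropping repeats."""
    fresh: list[dict] = []
    picked: set[str] = set()
    for upload in uploads:
        vid = upload["video_id"]  # hypothetical field name
        if vid not in seen and vid not in picked:
            fresh.append(upload)
            picked.add(vid)
    return fresh
```

Because the state is only updated after a video is fully processed, re-running the daily job after a failure would pick the same video up again rather than silently skipping it.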

Pipeline

YouTube Data API (polling)
    ↓ new video
Whisper transcription (faster-whisper, CUDA)
    ↓ transcript.md
Claude Haiku — fast summarization, watch recommendation, action items
    ↓ summary.md
Claude Sonnet — detailed analysis, themes, creator perspective, quotes
    ↓ detailed.md
Git commit per artifact  ·  cross-channel search via Claude Code skill
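The stage chain above can be sketched as a small runner that skips any artifact already on disk, which is one way to make each step idempotent. The stage callables here are stand-ins for the real Whisper and Claude calls, and which earlier artifacts each model actually reads is an assumption; the git commit step is left out.

```python
from pathlib import Path
from typing import Callable

# A stage is (output filename, function from artifacts-so-far to new text).
Stage = tuple[str, Callable[[dict[str, str]], str]]


def run_stages(video_dir: Path, source: str, stages: list[Stage]) -> list[str]:
    """Run stages in order; return the filenames written during this run."""
    video_dir.mkdir(parents=True, exist_ok=True)
    artifacts = {"source": source}  # e.g. a video URL or ID
    written: list[str] = []
    for filename, produce in stages:
        out = video_dir / filename
        if out.exists():
            # Already produced on an earlier run: reuse it, do not recompute.
            artifacts[filename] = out.read_text(encoding="utf-8")
            continue
        artifacts[filename] = produce(artifacts)
        out.write_text(artifacts[filename], encoding="utf-8")
        written.append(filename)
    return written
```

With this shape, a run that fails at the analysis stage would resume from the existing transcript.md and summary.md on the next invocation instead of re-transcribing.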

Output structure

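For every video, three markdown files land in a per-channel directory: transcript.md (the Whisper transcript), summary.md (the fast Haiku summary), and detailed.md (the deeper Sonnet analysis).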

The directory layout is deliberately filesystem-native (channels/{channel-slug}/{date}_{video-id}/) so any Claude Code session can grep, glob, or summarize across the corpus without hitting a database.
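As a rough illustration of why the layout needs no database, the sketch below builds a video directory from the pattern above and searches one artifact type across all channels with a glob. The helper names are hypothetical, and the ISO date format in the usage comment is an assumption; the source only specifies {date}.

```python
from pathlib import Path


def video_dir(root: Path, channel_slug: str, date: str, video_id: str) -> Path:
    """Build channels/{channel-slug}/{date}_{video-id}/ under root."""
    return root / "channels" / channel_slug / f"{date}_{video_id}"


def search_corpus(root: Path, term: str, artifact: str = "summary.md") -> list[Path]:
    """Return artifact files containing term (case-insensitive), sorted by path."""
    needle = term.lower()
    hits: list[Path] = []
    for path in sorted(root.glob(f"channels/*/*/{artifact}")):
        if needle in path.read_text(encoding="utf-8").lower():
            hits.append(path)
    return hits


# Usage (paths are illustrative):
# search_corpus(Path("."), "agent legibility")
# search_corpus(Path("."), "agent legibility", artifact="detailed.md")
```

The same query is a one-line grep from a shell; the point is that the directory names carry the channel, date, and video ID, so results are self-describing.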

Why it matters (the wedge)

Most "AI summarizer" tools are one-shot consumer apps. TLDW is a production pipeline: it has retry/backoff, state persistence, dual-model cost routing (Haiku for cheap summarization, Sonnet for analysis), and a cross-channel search skill that lets me ask "what did the strategy channels say about agent legibility this month?" without re-watching 50 videos.
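A hedged sketch of two of those pieces, retry with exponential backoff and per-task model routing. The tier labels "haiku" and "sonnet" are placeholders rather than real model identifiers, and the attempt count, delays, and function names are assumptions chosen for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Placeholder tier labels; real model identifiers would go here.
MODEL_FOR_TASK = {"summary": "haiku", "detailed": "sonnet"}


def pick_model(task: str) -> str:
    """Route cheap summarization and deeper analysis to different tiers."""
    return MODEL_FOR_TASK[task]


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 60.0) -> list[float]:
    """Delays before each retry: base, 2*base, 4*base, ... capped at cap."""
    return [min(cap, base * 2 ** i) for i in range(attempts - 1)]


def with_retry(
    fn: Callable[[], T],
    attempts: int = 4,
    base: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn until it succeeds; re-raise the last error when attempts run out."""
    delays = backoff_delays(attempts, base)
    for i in range(attempts):
        try:
            return fn()
        except Exception:  # a real pipeline would catch only transient errors
            if i == attempts - 1:
                raise
            sleep(delays[i])
    raise ValueError("attempts must be >= 1")
```

Passing `sleep` as a parameter keeps the helper testable without real waiting, and re-raising on the final attempt keeps the failure observable instead of swallowing it.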

This is the same operational discipline that managed 100+ Windows Mobile builds — branch isolation, idempotent steps, observable failures — applied to LLM workflows.

Tech