Each competency is backed by one concrete example from production. No aspirational claims.
01 · Specification Precision
Production AI systems are only as good as their specs. I write mega-prompts the way I used to write build scripts: every escalation path, every output contract, every anti-pattern documented in-line.
Example
Five production mega-prompts of 230+ lines each, with anti-drift safeguards, explicit escalation tiers, and output format contracts. A 13-category meeting classification taxonomy with boundary-detection rules for multi-meeting transcripts. Copilot agent instruction sets with enforced naming conventions, directory organization, and documented anti-patterns.
02 · Evaluation & Quality Judgment
You don't ship an AI system without an eval. Eval frameworks live alongside the prompts, not in a forgotten notebook.
Example
Structured PowerShell security audit framework covering 6 risk categories: credential exposure, exfiltration, dynamic code execution, Base64 obfuscation, temp file races, and network man-in-the-middle (MITM) exposure. A 27-point quality checklist for AI-generated visual artifacts with mandatory render-validate-fix loops. README accuracy policies requiring review after every substantive code change.
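A minimal sketch of how a category-based script audit can be structured, not the framework itself. The six category names come from the list above; every regex below is an illustrative assumption, and a real audit would need far more patterns per category.

```python
import re

# One illustrative pattern per risk category. These regexes are assumptions
# chosen for the sketch; they are not the framework's actual rules.
RISK_PATTERNS = {
    "credential_exposure": re.compile(r"ConvertTo-SecureString\s+.*-AsPlainText", re.I),
    "exfiltration": re.compile(r"Invoke-(WebRequest|RestMethod)\b.*-Method\s+Post", re.I),
    "dynamic_code_execution": re.compile(r"\b(Invoke-Expression|iex)\b", re.I),
    "base64_obfuscation": re.compile(r"FromBase64String|-EncodedCommand", re.I),
    "temp_file_race": re.compile(r"\$env:TEMP\\[\w.-]+", re.I),
    "network_mitm": re.compile(
        r"ServerCertificateValidationCallback|-SkipCertificateCheck", re.I
    ),
}


def audit_script(text):
    """Return (line_number, category) for every pattern hit in a script."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for category, pattern in RISK_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, category))
    return findings
```

Keeping the rules in a table rather than in branching code means a new category is a one-line change, and the findings list can feed a report or a CI gate.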
03 · Multi-Agent Decomposition & Delegation
One agent doing everything is the slowest, most expensive way to ship. Decompose by capability and cost profile.
Example
22 active Claude Code projects with 42 documented execution plans. Multi-model pipelines: Claude Haiku for high-throughput classification, Claude Sonnet for deep analysis and generation, with the model selected by task complexity and cost profile. Planner-agent workflows with parallel sub-agent dispatching for independent workstreams.
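A minimal sketch of routing by task type and dispatching independent work in parallel. The `Task` shape, the 20,000-character threshold, and the `run_task` stub are all hypothetical, and `"haiku"` / `"sonnet"` are tier labels rather than real API model identifiers.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    kind: str         # hypothetical kinds: "classify", "analyze", "generate"
    input_chars: int  # size of the input, used as a rough complexity signal


def choose_tier(task, long_input_chars=20_000):
    """Send short classification work to the cheap tier, everything else up."""
    if task.kind == "classify" and task.input_chars <= long_input_chars:
        return "haiku"
    return "sonnet"


def run_task(task):
    """Stub sub-agent: a real one would call the model chosen for the task."""
    return (task.kind, choose_tier(task))


def dispatch(tasks, max_workers=4):
    """Run independent tasks in parallel; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_task, tasks))
```

The routing rule stays a pure function, so it can be unit-tested and tuned without touching the dispatcher.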
04 · Failure Pattern Recognition
The 3am-build-broke instinct applied to LLM pipelines: isolate, retry, escalate, never silently swallow.
Example
Azure DevOps (ADO) API resilience framework with progressive retry-backoff (5 attempts, 2–10s delays), per-item isolation to prevent cascade failures, and ADO pipeline-native error logging. At Microsoft, automated test-failed-retry pipelines to isolate flaky infrastructure from genuine test failures. Context degradation mitigated through session boundary management and specification reinforcement.
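A minimal sketch of retry-backoff with per-item isolation. The 5 attempts and the 2s and 10s bounds come from the description above; the doubling schedule in between is an assumption, and `fn` stands in for whatever API call is being protected.

```python
import time


def backoff_delays(attempts=5, first=2.0, cap=10.0):
    """Delays between attempts: double from `first`, capped at `cap`.

    The doubling step is an assumption; only the bounds are from the source.
    """
    return [min(first * 2 ** i, cap) for i in range(attempts - 1)]


def call_with_retry(fn, item, attempts=5, sleep=time.sleep):
    """Call fn(item), retrying on any exception until attempts run out."""
    delays = backoff_delays(attempts)
    for attempt in range(attempts):
        try:
            return fn(item)
        except Exception:
            if attempt == attempts - 1:
                raise  # escalate after the final attempt
            sleep(delays[attempt])


def process_all(items, fn, sleep=time.sleep):
    """Process items independently so one failure cannot sink the batch."""
    results, failures = {}, {}
    for item in items:
        try:
            results[item] = call_with_retry(fn, item, sleep=sleep)
        except Exception as exc:
            failures[item] = repr(exc)  # recorded, never silently swallowed
    return results, failures
```

Passing `sleep` as a parameter keeps the schedule testable without real waiting; the caller decides what to do with `failures`, for example logging them to the pipeline.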
05 · Trust & Security Design
Agents touch infrastructure now. Permissions need to be scoped by blast radius, not by convenience.
Example
Blast-radius-aware permission systems with 150+ scoped tool permissions governing AI agent access to infrastructure. Human-in-the-loop approval gates for deployment operations. Azure Key Vault integrated across 8+ automation modules. Security scanning for PowerShell scripts covering exfiltration vectors, backdoor patterns, and credential leaks.
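A minimal sketch of scoping permissions by blast radius with an approval gate. The four radius levels and the tool names are hypothetical placeholders, not the actual permission set.

```python
from enum import IntEnum


class Blast(IntEnum):
    """Hypothetical blast-radius levels, ordered from least to most impact."""
    READ_ONLY = 0
    LOCAL_WRITE = 1
    SHARED_WRITE = 2
    DEPLOY = 3


# Hypothetical tool names mapped to the widest damage each one can do.
TOOL_SCOPES = {
    "repo.read": Blast.READ_ONLY,
    "workspace.write": Blast.LOCAL_WRITE,
    "workitem.update": Blast.SHARED_WRITE,
    "pipeline.deploy": Blast.DEPLOY,
}


def authorize(tool, auto_limit=Blast.LOCAL_WRITE, approved_by=None):
    """Return "allow", "needs_approval", or "deny" for a tool request."""
    scope = TOOL_SCOPES.get(tool)
    if scope is None:
        return "deny"  # unknown tools are denied by default
    if scope <= auto_limit:
        return "allow"  # small blast radius: no human needed
    return "allow" if approved_by else "needs_approval"
```

For example, `authorize("pipeline.deploy")` returns `"needs_approval"` until a reviewer is supplied through `approved_by`.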
06 · Context Architecture
Context is the new compute budget. Design for retrieval, not for stuffing.
Example
233-module composable PowerShell library serving as a reusable context layer for AI-assisted development, with a "search-before-create" protocol enforced via global CLAUDE.md directives. Per-project memory persistence files for cross-session state tracking. Dual-file documentation pattern (raw transcript + structured summary) linking meeting insights to Azure DevOps work items for continuous context flow.
07 · Cost & Token Economics
Production AI has a P&L. Route work to the cheapest sufficient model; budget tokens like CPU time.
Example
Dynamic token allocation modules that adjust API limits based on content length and task complexity. Blended-cost model strategies routing high-volume classification to Haiku (~$0.25/M tokens) and complex analysis to Sonnet to optimize spend across production pipelines. Token burn forecasts for batch operations across hundreds of transcripts and reports.
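A minimal sketch of the two calculations involved: sizing an output budget from input length, and forecasting spend for a batch. The ratio, floor, and ceiling are hypothetical defaults; the only price used is the approximate Haiku figure quoted above.

```python
def output_budget(input_tokens, ratio=0.25, floor=256, ceiling=4096):
    """Scale the output token limit with input size, within fixed bounds.

    ratio, floor, and ceiling are illustrative assumptions, not tuned values.
    """
    return max(floor, min(ceiling, int(input_tokens * ratio)))


def forecast_usd(token_counts, usd_per_million):
    """Estimate batch cost from per-document token counts and a unit price."""
    return sum(token_counts) * usd_per_million / 1_000_000


# Example: two transcripts of 2M tokens each at roughly $0.25 per million.
batch_cost = forecast_usd([2_000_000, 2_000_000], usd_per_million=0.25)
```

Running the forecast before a batch starts turns the routing decision into a number that can be compared against a budget.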