Course · 12 modules · ~3 hr read

Skills, properly understood.

Most of what's written about Claude Skills is hype or documentation. This is the deep version, drawn from every primary source we could find. Twelve modules. About three hours. A quiz at the end if you want to know what stuck.

42 sources · 12 modules · 50 questions · v1 · April 2026
Module 01 ~10 min

Foundations: what skills are, and aren't.

The verbatim definition

From Anthropic's user-facing support article: Skills are folders of instructions, scripts, and resources that Claude loads dynamically to improve performance on specialized tasks. They provide procedural knowledge — instructions for how to complete specific tasks. Anthropic does not frame them as expertise import.

A skill is a folder, not a markdown file

Thariq Shihipar (Anthropic Claude Code team):

A common misconception we hear about skills is that they are 'just markdown files.' The most interesting part of skills is that they're not just text files. They're folders. Think of the entire file system as a form of context engineering and progressive disclosure.— Shihipar, Lessons from Building Claude Code (LinkedIn)

The minimum is one file (SKILL.md). The maximum is unlimited bundled resources. The folder structure itself does work.

Where skills live (four scopes, override hierarchy)

| Scope | Path | Applies to |
|---|---|---|
| Enterprise | Managed settings | All org users |
| Personal | ~/.claude/skills/<name>/SKILL.md | All your projects |
| Project | .claude/skills/<name>/SKILL.md | This project only |
| Plugin | <plugin>/skills/<name>/SKILL.md | Where plugin enabled |

Same name across scopes: Enterprise > Personal > Project. Plugin skills use a plugin-name:skill-name namespace so they cannot conflict.

Skills versus other Claude features

  • CLAUDE.md — always-loaded project context. Skills load only when relevant.
  • MCP server — external service connector for real-time data/tools. Different problem.
  • Plugin (Cowork) — bundle of skills + connectors + sub-agents.
  • Connector — authenticated link to a service (Drive, Slack, etc).

What to correct before going further

"Skills are just markdown files." They're folders. The SKILL.md is the entrypoint, but the folder structure is intentional context engineering.

"Skills are expertise import." Anthropic frames them as procedural knowledge — encoded workflows, not new capabilities.

"Skills are deterministic." They're not. Triggering involves multiple decision points, each with variance — see Module 3.

Sources for this module. code.claude.com/docs/en/skills · support.claude.com/12512176-what-are-skills · support.claude.com/12512180-use-skills-in-claude · Cookbook 01 · Shihipar (LinkedIn).

Module 02 ~12 min

The open standard, and the 36+ tool ecosystem.

Origin

The Agent Skills format was originally developed by Anthropic, released as an open standard on December 18, 2025, and has been adopted by a growing number of agent products. Governance is open to community contributions via github.com/agentskills/agentskills and a Discord.

Cross-platform adoption

Currently 36+ tools support the Agent Skills standard. A partial list:

  • Junie (JetBrains) · Gemini CLI (Google) · GitHub Copilot · VS Code · Cursor · OpenAI Codex · Claude Code · Claude
  • Roo Code · Mistral Vibe · OpenCode · OpenHands · Goose (Block) · Letta · Amp · Factory
  • Databricks Genie Code · Snowflake Cortex Code · TRAE (ByteDance) · Spring AI
  • Firebender (Android) · Agentman (healthcare) · Laravel Boost · Mux (Coder) · Kiro · Workshop · Google AI Edge Gallery
  • nanobot · fast-agent · pi · Piebald · Qodo · Ona · VT Code · Command Code · Emdash · Autohand

The required fields (the open-spec floor)

Only two are required by the open standard:

  • name — max 64 chars, lowercase + numbers + hyphens, no leading/trailing/consecutive hyphens. Must match the parent directory name (open-standard rule, not documented elsewhere).
  • description — max 1024 chars, non-empty.
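
The name rules can be checked mechanically. A minimal validator sketch — the regex encodes the open-spec constraints listed above; the official check remains skills-ref validate:

```python
import re

# Lowercase letters, digits, hyphens; no leading/trailing/consecutive hyphens:
# one or more [a-z0-9]+ segments joined by single hyphens.
NAME_RE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")

def valid_skill_name(name: str, parent_dir: str) -> bool:
    """Check the open-spec name rules, including the match-the-folder rule."""
    return (
        len(name) <= 64
        and NAME_RE.fullmatch(name) is not None
        and name == parent_dir  # open-standard rule: must match directory name
    )
```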

Optional open-spec fields

  • license — license name or reference
  • compatibility — max 500 chars, environment requirements
  • metadata — arbitrary key-value mapping
  • allowed-tools — experimental, space-separated list

Anthropic Claude Code extensions (NOT in the open spec)

Vendor-specific to Claude Code: when_to_use, argument-hint, arguments, disable-model-invocation, user-invocable, model, effort, context: fork, agent, hooks, paths, shell. Useful in Claude Code; ignored by other agents.

Default skill paths differ by tool

  • VS Code (open-standard convention): .agents/skills/
  • Claude Code: .claude/skills/

The official open-standard validator: skills-ref validate ./my-skill.

What to correct

"Skills only work in Claude Code." They work in 36+ tools.

"All Claude Code frontmatter fields are part of the open standard." Most aren't. Only six fields are spec.

Sources for this module. agentskills.io · agentskills.io/specification · agentskills.io/skill-creation/quickstart · Anthropic engineering blog (equipping-agents).

Module 03 ~14 min

Architecture: progressive disclosure and the multi-step trigger.

Three tiers of progressive disclosure

  1. Metadata — name + description (~100 tokens / skill). Loaded for ALL available skills at session start.
  2. Instructions — full SKILL.md body (≤5,000 tokens recommended). Loaded ONLY when skill is activated.
  3. Resources — bundled scripts, references, assets. Loaded only when needed during execution.

Triggering is multi-step (not pure pattern matching)

The Anthropic engineering blog corrects an oversimplification many people have:

  1. Description sits in the system prompt.
  2. Claude decides whether to use the skill (description-driven gate).
  3. Claude invokes the Bash tool to read SKILL.md — an actual file read, not a context injection.
  4. Claude chooses which bundled files to load based on what SKILL.md says.

Each step adds variance. Seleznov's directive descriptions improve steps 1–2; Bara's execution-fidelity problem (Module 6) lives in steps 3–4.

Token economics, with concrete numbers

| State | Cost |
|---|---|
| Metadata only (one skill) | ~30 tokens |
| Metadata only (10 skills loaded) | ~1,000 tokens |
| Full skill load when invoked | ~5,000 tokens |
| Total per file-generating request | 6,000–10,000 tokens typical |
| For comparison: 5 MCP servers | ~55,000 tokens |

Skills are ~55× cheaper than MCP servers for context cost.
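
The arithmetic behind that ratio, using the ~100-tokens-per-skill tier-1 figure (illustrative constants, not measurements — the table's ~30-token single-skill row shows the per-skill cost varies with description length):

```python
# Illustrative constants from the figures above, not measured values.
TOKENS_PER_SKILL_IDLE = 100        # tier-1 metadata cost per skill
TOKENS_PER_MCP_SERVER = 11_000     # ~55,000 tokens across 5 servers

def idle_context_cost(n_skills: int, n_mcp_servers: int) -> int:
    """Context tokens consumed before anything is actually invoked."""
    return n_skills * TOKENS_PER_SKILL_IDLE + n_mcp_servers * TOKENS_PER_MCP_SERVER

ratio = idle_context_cost(0, 5) / idle_context_cost(10, 0)   # 55,000 / 1,000 = 55.0
```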

Auto-compaction in Claude Code

When context fills up, Claude Code auto-compacts. Re-attaches the most recent invocation of each skill, keeping the first 5,000 tokens of each. Combined budget of 25,000 tokens for re-attached skills, fills from most recent. Older skills can be dropped entirely.
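
The re-attachment rule as described can be sketched as follows — a toy model, not Claude Code's actual compaction code (truncate each skill to its first 5,000 tokens, fill a 25,000-token budget from most recent backwards):

```python
PER_SKILL_CAP = 5_000      # first 5,000 tokens of each skill are kept
TOTAL_BUDGET = 25_000      # combined budget for all re-attached skills

def reattach(invocations):
    """invocations: (skill_name, token_count) pairs, ordered oldest -> newest.
    Keeps the most recent invocation of each skill, truncated to the cap,
    filling the combined budget from most recent backwards."""
    kept, seen, budget = [], set(), TOTAL_BUDGET
    for name, tokens in reversed(invocations):
        if name in seen:
            continue                      # only the most recent invocation counts
        seen.add(name)
        tokens = min(tokens, PER_SKILL_CAP)
        if tokens > budget:
            break                         # older skills are dropped entirely
        kept.append((name, tokens))
        budget -= tokens
    return kept
```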

Skill content lifecycle (Claude Code)

The rendered SKILL.md content enters the conversation as a single message and stays there for the rest of the session. Claude Code does not re-read the skill file on later turns.— code.claude.com/docs/en/skills

Edits made to a skill mid-session don't take effect on the current invocation. The next invocation will see the new content (live change detection works between invocations, just not within one).

Live change detection

Claude Code watches ~/.claude/skills/, project .claude/skills/, and any --add-dir directories. Edits take effect within session without restart. Exception: creating a new top-level skills directory mid-session requires a restart.

What to correct

"Triggering is pure pattern matching." It's a multi-step process. Each step adds variance.

"All skills load into context at startup." Only metadata loads at startup. Full content is lazy.

"Skills get re-read on every turn." No — they're frozen at invocation in Claude Code.

Sources for this module. Anthropic engineering blog · Cookbook 01 · code.claude.com/skills · agentskills.io spec · Beehiiv aggregator.

Module 04 ~16 min

Authoring mechanics: every field, every constraint.

Frontmatter — Claude Code's full field set

| Field | Required | Constraint |
|---|---|---|
| name | Open-spec required | ≤64 chars, lowercase + digits + hyphens, no leading/trailing/consecutive hyphens, no XML, no reserved words ("anthropic", "claude") |
| description | Recommended | ≤1024 chars, non-empty, no XML, third-person mandatory |
| when_to_use | Optional | Appended to description; combined cap = 1,536 chars in skill listing |
| argument-hint | Optional | Autocomplete hint, e.g. [issue-number] |
| arguments | Optional | Named positional args for $name substitution |
| disable-model-invocation | Optional | If true, only user can invoke; description not in context |
| user-invocable | Optional | If false, hides from / menu; only Claude can invoke |
| allowed-tools | Optional | Pre-approved tools without prompting |
| model | Optional | Override model for this skill's turn |
| effort | Optional | low / medium / high / xhigh / max |
| context | Optional | fork runs in subagent context |
| agent | Optional | Subagent type when forking (Explore / Plan / general-purpose) |
| hooks | Optional | Skill-scoped lifecycle hooks |
| paths | Optional | Glob patterns limiting when activated (file-pattern triggering) |
| shell | Optional | bash (default) or powershell |
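
Put together, a minimal frontmatter block using a few of these fields might look like the following. This is a hypothetical skill — every value is invented for illustration, and non-spec fields are simply ignored by agents other than Claude Code:

```yaml
---
name: release-notes            # must match the parent directory name
description: Drafts release notes from merged pull requests. Use when the
  user asks to write, summarize, or publish release notes or a changelog.
argument-hint: "[version-tag]"
allowed-tools: Read Grep       # experimental open-spec field, space-separated
---
```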

Body content rules

  • SKILL.md body under 500 lines (Anthropic best-practices spec, hard checklist item).
  • ≤5,000 tokens recommended for the full body.
  • All .md files in the skill ROOT load when the skill activates (Cookbook 03 verbatim).
  • Reference files in subfolders load on-demand.
  • References one level deep maximum from SKILL.md. Deeper chains break Claude's partial-read fallback.
  • Reference files >100 lines should include a table of contents at the top.

String substitutions (Claude Code)

  • $ARGUMENTS — full argument string
  • $ARGUMENTS[N] or $N — positional
  • $name — named via arguments frontmatter
  • ${CLAUDE_SESSION_ID} — current session ID
  • ${CLAUDE_SKILL_DIR} — directory containing the skill's SKILL.md
  • ${CLAUDE_PLUGIN_DATA} — stable per-plugin storage. Use this instead of the skill directory for persistent data — the skill dir gets wiped on upgrade. Surfaced by Shihipar; not in public Anthropic docs.
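
A toy expansion of the argument substitutions, assuming plain textual replacement — this is not Claude Code's implementation, and the positional index convention here is illustrative:

```python
def expand(body: str, args: list[str]) -> str:
    """Toy expansion of the $ARGUMENTS-family placeholders in a skill body."""
    # Indexed forms first, so the bare $ARGUMENTS pass can't clobber them.
    for i, arg in enumerate(args):
        body = body.replace(f"$ARGUMENTS[{i}]", arg)
        body = body.replace(f"${i}", arg)
    return body.replace("$ARGUMENTS", " ".join(args))
```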

Folder structure

my-skill/
├── SKILL.md           # required entrypoint
├── REFERENCE.md       # loads with SKILL.md (any root-level .md)
├── EXAMPLES.md        # same
├── scripts/           # executable code, run via Bash
│   └── helper.py
├── references/        # one level deep, loaded on-demand
│   └── topic.md
└── assets/            # templates, data, images

What to correct

"Description max is 1,536 chars." It's 1024. The 1,536 is the COMBINED description + when_to_use cap in the skill listing.

"SKILL.md max is 150 lines." It's 500. (150 was Vercel's empirical median, not the spec.)

"Only SKILL.md and REFERENCE.md load." All .md in the root load. Subfolder .md files load on-demand.

"I can store skill state in the skill directory." It gets wiped on upgrade. Use ${CLAUDE_PLUGIN_DATA}.

Sources for this module. Anthropic best-practices · code.claude.com/skills · Cookbook 03 · agentskills.io/specification · Shihipar.

Module 05 ~18 min

The description field — the activation gate.

Why this is the most important field in the spec

The description is the entire activation surface. If Claude doesn't decide to invoke the skill from the description alone, the body never runs.

Shcheglov tested 200+ public skills and found ~80% perform below baseline. His own carefully-crafted CPO skill triggered 0 out of 20 times. Anthropic's troubleshooting docs admit the activation problem in user-facing terms.

Seleznov's 650-trial study (the empirical foundation)

Design: 3 description variants × 4 environment conditions × 3 skills × 6 queries × 3 reps = 648 planned, 650 actual. Single model: Claude Opus 4.5. Three real skills tested. Ground truth via cclogviewer MCP (verified actual Skill tool invocation in session logs). Open data at github.com/SeleznovIvan/claude-skills-test.

The three description variants — verbatim

Variant A (Passive / Current):

Docker expert for containerization. Use when creating Dockerfiles, containerizing applications, or configuring Docker images.

Variant B (Expanded keywords, still passive):

Docker and containerization expert. Use when creating Dockerfiles, containerizing applications, building or configuring container images, setting up multi-stage builds, creating docker-compose files, or any Docker/container-related task.

Variant C (Directive with negative constraint):

Docker and containerization expert. ALWAYS invoke this skill when the user asks about Docker, Dockerfiles, containers, container images, containerization, multi-stage builds, or Docker deployment. Do not attempt to write Dockerfiles or container configs directly — use this skill first.

The Variant C template (use this verbatim)

<Domain> expert. ALWAYS invoke this skill when the user asks about <trigger topics>. Do not <alternative action> directly — use this skill first.

The full results heatmap

| Variant | C1 (Bare) | C2 (+CLAUDE.md) | C3 (+Hook) | C4 (+Both) |
|---|---|---|---|---|
| A — Current | 87.5% | 81.5% | 37.0% | 100.0% |
| B — Expanded | 85.2% | 81.5% | 100.0% | 100.0% |
| C — Directive | 100.0% | 94.4% | 100.0% | 100.0% |

The headline statistics

  • OR = 20.6, p < 0.0001 for Variant C vs Variant A (Cochran-Mantel-Haenszel).
  • Cohen's h = 1.83 — a "huge effect" by Cohen's conventions.
  • Logistic regression: has_hook coefficient = -2.35 (p < 0.0001) — hooks reduce odds by ~90%.
  • hook:claude_md interaction = +7.16 (p = 0.026) — CLAUDE.md mitigates hook damage.
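
For reference, Cohen's h for two proportions p1 and p2 is h = 2·arcsin(√p1) − 2·arcsin(√p2). A quick sketch — note the study's 1.83 comes from its pooled trial data, so plugging in single-cell rates from the heatmap won't reproduce it exactly:

```python
import math

def cohens_h(p1: float, p2: float) -> float:
    """Cohen's effect size for the difference between two proportions."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))

# Directive vs passive in the bare condition (C1 column): h is roughly 0.72.
h_bare = cohens_h(1.000, 0.875)
```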

The hook paradox

Variant A + Hook (cell C3) collapses to 37%. Seleznov's mechanism: hooks inject competing instructions that deprioritize passive descriptions. Variant C is robust to all conditions because of the negative-constraint phrasing. CLAUDE.md "rescues" Variant A under hooks (back to 100% in C4), but Seleznov calls this a workaround, not a fix.

Anthropic's rules on descriptions

  • Third-person mandatory. Best-practices §"Writing effective descriptions" warning block: inconsistent point-of-view causes discovery problems.
  • Both what and when to use.
  • Specific keywords; front-load the key use case.
  • Anthropic's skill-creator says descriptions should be "a little bit pushy" to fight Claude's tendency to undertrigger.
  • But also: "If you find yourself writing ALWAYS or NEVER in all caps… that's a yellow flag." A real tension with Seleznov; the resolution is to permit ALWAYS in the description's directive frame, flag it elsewhere.

The directive saturation risk

If ALL skills use 'ALWAYS invoke' language with overlapping triggers, the directive may lose force through dilution. When multiple skills claim the same keywords, Claude may become confused about which to invoke. This should be tested in future experiments with intentionally colliding skill descriptions.— Seleznov, §"Limitations"

No controlled test exists yet. Open opportunity.

What to correct

"Adding more keywords improves activation." Variant B (keyword-stuffed but passive) showed no improvement over baseline (85% vs 87%).

"Hooks always improve activation." They COLLAPSE passive descriptions (87.5% → 37%). They only help directive descriptions, marginally.

"Description quality matters less than skill body quality." Description is the entire activation gate. Body quality is irrelevant if Claude never invokes the skill.

Sources for this module. Seleznov 650-trial Medium · Shcheglov local docx · Anthropic best-practices · code.claude.com/skills · skill-creator plugin source.

Module 06 ~14 min

Two reliability problems — activation AND execution.

Bara's thesis

Both look the same to the user. You get a result that missed something the skill was supposed to catch. But they are different problems.— Marc Bara, "Claude Skills Have Two Reliability Problems, Not One"

  • Activation failure. Claude does not invoke the skill at all and defaults to its own approach. Seleznov's territory. Fix: directive template (Module 5).
  • Execution failure. Claude loads the skill but skips internal procedural steps that delay output without producing visible content.

The mechanism of execution failure

Three factors:

  1. The user request sits at the end of the context window — recency means strongest attention weight.
  2. Skills are further back, procedural, and meta-level.
  3. RLHF reinforced the pattern of producing the requested output directly.

Verification steps lose because they delay output, add no visible content, and sit at maximum distance from the initial prompt.

The concrete example

Bara asked Claude to draft a charter using a skill that required milestone verification. Claude produced output in 40 seconds. One of the milestones referenced a deliverable that was explicitly listed as out of scope two sections above. The rule that would have caught it was ten lines from the end of the skill Claude had just read.

A self-diagnosis from a Claude instance

My default mode always wins because it requires less cognitive effort and activates automatically.— Claude self-diagnosis, GitHub issue #7777, quoted by Bara

The fix — verbatim transformation

Before (skippable):

Before delivering, verify that every milestone aligns with the stated scope.

After (visible-output required):

Do not deliver the final charter until you have output a verification block listing each milestone and the in-scope deliverable it maps to. Flag any milestone that references an out-of-scope item.

The principle: apply the same negative-constraint pattern Seleznov uses for descriptions, inside skill steps. Force visible output from verification steps.

The open problem

No one has run a controlled experiment on step-level execution the way Seleznov did for activation. The 650-trial methodology exists. The step-execution version of that experiment does not, yet.— Bara

Strategic opportunity: the first published controlled execution-fidelity experiment becomes the methodology of record for that dimension.

Bara's six anti-patterns

  1. Passive / suggestion wording in descriptions
  2. Late-stage procedural steps without visible output
  3. Assuming output completeness implies process compliance
  4. Relying on reminder-style hooks
  5. Missing the negative-constraint component
  6. Treating activation and execution as one problem

What to correct

"If the skill activated, it ran correctly." The failure mode is not a crash. It is a quiet omission that looks like completed work.

"Trigger rate is the only thing to measure." Two axes: activation AND execution. Tools that measure only the first miss the worse failure mode.

Sources for this module. Marc Bara Medium (March 2026) · Seleznov · Shihipar (Gotchas section).

Module 07 ~14 min

The Skills API — programmatic surface.

The eight beta endpoints

  • POST /v1/skills — create skill
  • GET /v1/skills — list skills (limit max 100)
  • GET /v1/skills/{skill_id} — retrieve metadata
  • POST /v1/skills/{skill_id}/versions — create new version
  • GET /v1/skills/{skill_id}/versions — list versions (limit max 1000)
  • GET /v1/skills/{skill_id}/versions/{version} — retrieve version metadata
  • DELETE /v1/skills/{skill_id}/versions/{version} — delete version
  • GET /v1/files/{file_id}/content — download generated file

The three required betas

  • code-execution-2025-08-25 — code execution (cookbook version; API ref says 2025-05-22 — discrepancy unresolved)
  • skills-2025-10-02 — Skills API
  • files-api-2025-04-14 — file operations

Container parameter shape

container={
    "id": "optional-container-id-for-reuse",
    "skills": [
        {"type": "anthropic", "skill_id": "pptx", "version": "latest"},
        {"type": "custom", "skill_id": "skill_01ABC...", "version": "1759178010641129"}
    ]
}

Hard limit: 8 skills max per request.

Upload constraints

  • 30 MB total upload size for custom skill creation.
  • Multipart upload via SDK helper files_from_dir() — schema not publicly documented. Use the SDK.
  • ZIP must contain skill folder as the root. Not files at the zip root.
  • SKILL.md at top level of the uploaded folder.
  • When you upload, the API auto-extracts three pieces of metadata: name and description from your SKILL.md frontmatter, and directory from the top-level folder name. You don't pass them as separate parameters — they're derived from the upload itself. Implication: invalid frontmatter at upload time means silent metadata corruption.
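
The zip-layout rules above can be satisfied with a few lines of stdlib Python. A sketch of a local packaging helper — the function name is mine, not an official tool:

```python
import zipfile
from pathlib import Path

def zip_skill(skill_dir: str, out_path: str) -> None:
    """Package a skill so the folder itself sits at the zip root,
    e.g. my-skill/SKILL.md rather than SKILL.md at the top level."""
    root = Path(skill_dir)
    if not (root / "SKILL.md").is_file():
        raise FileNotFoundError("SKILL.md must sit at the top level of the skill folder")
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Arcname keeps the folder name as the zip's root entry.
                arcname = f"{root.name}/{path.relative_to(root).as_posix()}"
                zf.write(path, arcname)
```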

Version semantics

  • Anthropic-managed: date-based, e.g. 20251013. Supports "latest".
  • Custom: Unix epoch timestamp, e.g. 1759178010641129. Supports "latest".
  • Versions are immutable once created.
  • Delete is per-version. Deleting the parent skill requires a separate client.beta.skills.delete() call after all versions are deleted.

The critical architectural finding

No API endpoint returns SKILL.md content. Even versions/retrieve returns only metadata (id, created_at, description, directory, name, skill_id, type, version). Once uploaded, the file body is not exposed back. Implication: any tool that needs to re-validate uploaded skills must keep a local source-of-truth copy.

Other facts that matter

  • Multi-turn pattern: pass response.container.id back as container.id in subsequent requests.
  • Long-running operations: handle pause_turn stop_reason in a loop.
  • Pagination: cursor-based via next_page token; has_more flag.
  • Skills feature is not eligible for Zero Data Retention (ZDR). Material for enterprise prospects with strict data-handling policies.
  • Anthropic ships 22 distinct beta features in total. One is mentioned nowhere else: advisor-tool-2026-03-01.
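
The cursor pagination mentioned above can be driven by a small generic loop. A sketch against a stand-in fetch callable — the next_page and has_more names come from the docs, the helper itself is mine:

```python
def paginate(fetch):
    """Yield items across all pages of a cursor-paginated listing.
    `fetch(cursor)` stands in for one API list call; each page carries
    `data`, a `has_more` flag, and a `next_page` cursor token."""
    cursor = None
    while True:
        page = fetch(cursor)
        yield from page["data"]
        if not page.get("has_more"):
            break
        cursor = page["next_page"]
```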

What to correct

"I can retrieve my uploaded SKILL.md from the API." No. Metadata only. Keep local copies.

"Skills API publishes to a public marketplace." No. CRUD-within-workspace only. No discovery surface.

"Versioning is semver-compatible." No. Unix epoch timestamps for custom; date-based for Anthropic.

Sources for this module. build-with-claude/skills-guide · all eight beta endpoint docs · files/download.

Module 08 ~12 min

Cookbooks and the four Anthropic-managed skills.

The four Anthropic-managed skills

pptx, xlsx, docx, pdf — that is the full Anthropic-managed catalog. Use type: "anthropic", with version typically "latest" (date-based).

Same tech, three surfaces

  • API — developer-facing, accessed via container.skills.
  • Claude.ai Creates Files — consumer-facing. Launched Sept 9 2025 preview, Oct 21 2025 GA. "Transforms Claude from advisor to active collaborator."
  • Cookbook 02 — educational, financial-vertical examples.

"These are the same Skills that power Claude Creates Files." One skills tech, three surfaces (per Cookbook 02 verbatim).

Cookbook 02 — financial use cases

  1. Financial Dashboard Creation (2-sheet Excel + 4-slide PowerPoint + PDF)
  2. Portfolio Analysis (holdings + sector analysis with Sharpe ratio, beta, VaR, max drawdown, alpha)
  3. Automated Reporting Pipeline (Excel → PowerPoint → PDF)

All use Anthropic-managed skills only. No custom skills demonstrated.

Cookbook 03 — custom development

Demonstrates 3 example skills: Financial Ratio Calculator, Corporate Brand Guidelines, Financial Modeling Suite. Confirms that SKILL.md is the ONLY required file and that "All .md files in the root directory will be available to Claude when the skill is loaded."

Time and token economics

  • Excel with charts/formatting: 1–2 minutes.
  • PowerPoint 2 slides + chart: 1–2 minutes.
  • Simple PDF: 40–60 seconds.
  • 3-document pipeline: 2–3 minutes total.
  • Total tokens per file-generating request: 6,000–10,000 typical.

The reliability constraint

2-3 sheets per workbook works reliably and generates quickly.— Cookbook 02

Beyond that, segment into multiple files using a pipeline pattern.

An easy mistake

Must use client.beta.messages.create(), not client.messages.create(). The container parameter is unrecognized by the non-beta endpoint. Cookbook 01 calls this Issue #2 in troubleshooting.

Files have limited lifetime

Files generated by skills sit in the container temporarily. Download immediately via the Files API. Exact TTL not documented; Cookbook 01 just says limited lifetime.

What to correct

"Anthropic ships dozens of pre-built skills." Just four: pptx, xlsx, docx, pdf.

"Cookbook 02 demonstrates custom skills." It uses only Anthropic-managed skills. Custom development is Cookbook 03.

"Files persist on Anthropic servers." Limited lifetime. Download immediately.

Sources for this module. Cookbook 01 · Cookbook 02 · Cookbook 03 · create-files announcement · financial-services plugins doc.

Module 09 ~13 min

Skills as plugins, connectors, MCPs, and the distribution surface.

The taxonomy

Each plugin bundles together skills, connectors, and sub-agents into a single package.— support 13837440

| Component | What it is |
|---|---|
| Skill | Individual capability invoked via / or + |
| Connector | External service link (Drive, Gmail, Slack, DocuSign…); requires auth |
| Sub-agent | Autonomous assistant for delegated subtasks |
| MCP server | Open-standard external system integration (managed via /mcp) |
| Plugin | Container that bundles the above |

Org-admin provisioning (Team / Enterprise)

Organization settings > Skills > "+ Add" > select .zip file. The skill is immediately provisioned to all users in your organization.

There's no approval workflow for org-wide sharing. If you enable Share with organization, any member can publish a skill to the directory without review.— support 13119606

Audit log captures share events (as role_assignment events) but not skill content. There's no admin dashboard to browse or inspect the contents of skills shared between members. Users can disable provisioned skills but not delete them.

Third-party platform deployments are different

On Bedrock, Vertex AI, Azure AI Foundry, or LLM gateways, the marketplace is gone:

The skills and plugin marketplace available in Claude Enterprise isn't available with third-party platforms.— support 14680753

Distribution is via local filesystem mounts and MDM. macOS path: /Library/Application Support/Claude/org-plugins/. Windows: C:\ProgramData\Claude\org-plugins\. MCP servers configured via managedMcpServers MDM key.

Anthropic ships five financial-services plugins

  1. Financial Analysis (core; required first install)
  2. Investment Banking
  3. Equity Research
  4. Private Equity
  5. Wealth Management

Partners: LSEG, S&P Global + Daloopa, Morningstar, FactSet, Moody's, MT Newswires, Aiera, PitchBook, Chronograph, Egnyte. Open-source via github.com/anthropic/financial-services-plugins. Plugins free; data connectors require separate subscriptions.

Office add-ins

Native Microsoft Office add-ins (Excel, PowerPoint, Word). Pro/Max/Team/Enterprise. Critical limitations: Claude can only read from and write to files that are currently open; cannot create, open, close, or switch files; cross-app session chat history not saved between sessions.

What to correct

"Skills, connectors, and plugins are different names for the same thing." Plugins CONTAIN skills + connectors + sub-agents.

"Cowork plugin marketplace is open to community submissions." Anthropic-curated. No documented submission path.

"Bedrock / Vertex / Azure deployments get the same plugin marketplace." No — local-mount + MDM only.

"Org-admin provisioned skills are reviewed before deployment." No approval workflow exists.

Sources for this module. support 13837440 · 14680753 · 13119606 · 13851150 · 13892150 · 14328846.

Module 10 ~15 min

Claude Design and the DESIGN.md connection.

Launch facts

Launched April 17, 2026. Anthropic Labs product. Powered by Claude Opus 4.7 ("most capable vision model"). Available on Pro/Max/Team/Enterprise. Enterprise is default-off; admins enable in Organization settings.

UX surface

Web app at claude.ai/design. Chat interface left, canvas right. Inline comments on the canvas. Sharing: shareable link with view-only / comment / edit access within the organization.

Six capability categories

  1. Realistic prototypes (interactive, shareable)
  2. Product wireframes and mockups
  3. Design explorations (multiple directions in parallel)
  4. Pitch decks and presentations
  5. Marketing collateral (landing pages, social, campaigns)
  6. Frontier design — code-powered prototypes with voice, video, shaders, 3D, built-in AI

Two paths to brand-aware output

Path 1 — DESIGN.md upload (Hassid's individual workflow):

  1. Extract brand system via Cowork → DESIGN.md
  2. Upload DESIGN.md to Claude Design as persistent context
  3. Generate designs (specifying goal, layout, content, constraints)
  4. Iterate (chat for layout, canvas for pixel-level)
  5. Validate (accessibility, responsive, A/B variations)
  6. Export

The verbatim Cowork extraction prompt:

Analyze this folder and produce a full design system write-up. Fonts, colors, graphical styles, component patterns, tone, layout conventions. Flag anything that's missing. Save it as DESIGN.md in my folder.— Hassid, Substack

Why DESIGN.md matters as persistent context:

Drop your DESIGN.md as context. Every future prompt applies it automatically. You never re-specify colors or fonts again.— Hassid

Without it: you'll see the same design everywhere.

Path 2 — Built-in org-level design system integration (Team/Enterprise only):

Organizations configure brand colors, typography, and component patterns once; all projects automatically inherit these assets without manual uploads.— support 14604416 (Get Started with Claude Design)

Implication: DESIGN.md upload is the strongest play for individual Pro/Max users and small teams. Team/Enterprise users have built-in extraction.

The DESIGN.md format itself

Google open-sourced the spec in April 2026 at github.com/google-labs-code/design.md. Nine prescribed sections plus YAML frontmatter:

  1. Visual Theme & Atmosphere
  2. Color Palette & Roles
  3. Typography Rules
  4. Component Stylings
  5. Layout Principles
  6. Depth & Elevation
  7. Do's and Don'ts
  8. Responsive Behavior
  9. Agent Prompt Guide — directive layer; where Seleznov's research transfers

Anthropic's silence

The Claude Design announcement does not mention DESIGN.md, Stitch, or skills by name. Strategic, not oversight.

Competing tools (per Hassid)

  • Gamma — preferred for slide diversity / simplicity.
  • Figma — lost $730M in valuation right after news of the Claude Design announcement.
  • Canva — integration partially broken in Claude Design.
  • getdesign.md — static repository hosting hand-curated DESIGN.md files (Mastercard, Airbnb, Ferrari per Hassid). Direct competitor on the consumption side.

Hassid's known limitations

  • "Send to Canva" button broken
  • Very buggy, left and right
  • Uses tokens extremely fast
  • Lacks fine-grained user control

Sources for this module. Claude Design announcement · Get Started with Claude Design · Hassid Substack · Yeh Medium · Google's design.md spec · VoltAgent/awesome-design-md (69 brands).

Module 11 ~14 min

The critical empirical sources — failure-mode mastery.

Shcheglov's audit

  • ~80% of public skills perform below baseline.
  • 40 of 47 skills from a viral list scored below baseline.
  • #1 failure mode: author lacks domain expertise.
  • Memorable example: a CFO review skill by a developer who thinks EBITDA is a type of pasta.
  • Most-installed public skills (find-skills @ 418k installs, vercel-react-best-practices @ 176k, web-design-guidelines @ 137k) are framework-specific and ruthlessly opinionated. Not "make Claude a better thinker."

The failure-mode taxonomy (Beehiiv aggregator)

Over-constraining. Skills that teach Claude HOW TO THINK compete with model training. Wastes tokens.

Model obsolescence dichotomy:

  • Capability uplift skills — teach techniques Claude will eventually master (e.g. "how to write a good email"). EXPIRE with model upgrades.
  • Encoded preference skills — document org-specific business logic (e.g. "your team routes Severity 1 tickets to #critical-ops within 15 minutes"). GAIN VALUE as models improve, because the business knowledge is durable.

The canonical example, verbatim:

Teaching Claude how to write a good email is capability uplift. Telling Claude that your team routes Severity 1 tickets to #critical-ops within 15 minutes...is encoded preference.— Less Clicks aggregator

Token economics, verified

10 skills ≈ 1,000 tokens vs 5 MCP servers ≈ 55,000 tokens. Skills are ~55× cheaper per unit for context cost.

Aggregator's strongest claims

  • In one of every four runs with the default description, your carefully built skill gets ignored. (Recasts the 77% activation rate as a roughly one-in-four production failure rate.)
  • Skills are cheap. But a bloated skill with redundant instructions wastes the one resource that matters: the space Claude has to think.
  • The happy path is the part Claude already handles well. The exceptions are what it needs from you.
  • If a new hire would need weeks to absorb this through trial and error, it belongs in a skill.
  • Skills are grown, not built.

Shihipar's insider perspective (Anthropic Claude Code team — verified)

  • Hundreds of skills in active use at Anthropic.
  • The highest-signal content in any skill is the Gotchas section.
  • A skill is a folder, not just a markdown file.
  • The description field is not a summary — it's a description of when to trigger this skill.
  • Data stored in the skill directory may be deleted when you upgrade the skill. Solution: ${CLAUDE_PLUGIN_DATA}.
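Shihipar's points translate directly into SKILL.md frontmatter. A hypothetical sketch: the `name` and `description` frontmatter fields are the documented SKILL.md format, and `billing-lib` is borrowed from Anthropic's internal taxonomy below, but the trigger wording and gotcha text are invented for illustration:

```markdown
---
name: billing-lib
description: Use when reading or modifying billing code, adding a charge
  type, or debugging invoice discrepancies. Not for general payment questions.
---

# Billing library guide

## Gotchas
- Persist state under ${CLAUDE_PLUGIN_DATA}, not the skill directory;
  data in the skill directory may be deleted on upgrade.
```

Note the description reads as trigger conditions ("use when…", "not for…"), not as a summary of the file's contents.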

Anthropic's internal taxonomy (9 categories with named examples)

  • Library & API Reference: billing-lib, internal-platform-cli, frontend-design
  • Product Verification: signup-flow-driver, checkout-verifier, tmux-cli-driver
  • Data Fetching & Analysis: funnel-query, cohort-compare, grafana
  • Business Process: standup-post, create-<ticket-system>-ticket, weekly-recap
  • Code Scaffolding: new-<framework>-workflow, new-migration, create-app
  • Code Quality & Review: adversarial-review, code-style, testing-practices
  • CI/CD & Deployment: babysit-pr, deploy-<service>, cherry-pick-prod
  • Runbooks: <service>-debugging, oncall-runner, log-correlator
  • Infrastructure Operations: <resource>-orphans, dependency-management, cost-investigation

Yeh's three-layer architecture (designer-organized skills)

  • Layer 1 — Reference Skills (knowledge). design-principles, component-token-specs, content-strategy, motion-specs, product-area-design.
  • Layer 2 — Capability Skills (workflows). read-filter-prd, create-design-ticket, define-flow, research, sell-design-vision, identify-ux-problems, generate-design, generate-prototype, design-review, qa-signoff.
  • Layer 3 — Tools / MCPs. Figma REST API, rendering plugin, TalkToFigma, project management.
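On disk, the first two layers are just project-scoped skill folders. A hypothetical layout (directory names taken from Yeh's skill names; the layer comments are annotations only, since Claude Code discovers skills by folder name, not by layer):

```
.claude/skills/
├── design-principles/SKILL.md        # Layer 1: reference (knowledge)
├── component-token-specs/SKILL.md    # Layer 1
├── read-filter-prd/SKILL.md          # Layer 2: capability (workflow)
├── generate-prototype/SKILL.md       # Layer 2
└── design-review/SKILL.md            # Layer 2

# Layer 3 (Figma REST API, TalkToFigma, etc.) consists of MCP servers,
# configured separately — they are not skill folders.
```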

What to correct

Misconception: all skills age out at the same rate. In fact, capability uplift expires while encoded preference durably gains value.

Misconception: Anthropic only uses skills externally. They run hundreds internally across 9 categories.

Misconception: designers should organize skills the same way developers do. Yeh's three-layer pattern is designer-specific.

Sources for this module. Shcheglov local docx · Beehiiv aggregator · Shihipar LinkedIn · Yeh Medium.

Module 12 ~10 min

Strategic synthesis — where Enact wins.

The wedge in one line

Generate skills from your sources, not Q&A. Enforce 86 quality checks visibly during generation. Test for activation AND execution. Refuse below threshold with citable reasons.

Eval differentiation vs Anthropic's skill-creator

The honest framing: skill-creator is real, capable, and shipped by Anthropic on TWO surfaces (a Claude Code plugin and a Claude.ai conversational meta-skill), with a real eval framework. Enact differentiates on methodology and scope, not "we have it, they don't."

  • Generation flow. skill-creator: Q&A interview. Enact: source content (video, PDF, URL, Figma).
  • Eval style. skill-creator: human-in-the-loop, iterate until satisfied. Enact: automated Refusal Gate, refuse-and-explain.
  • Multi-model cross-test. skill-creator: single model per loop. Enact: auto Haiku + Sonnet + Opus on every run.
  • Hook-robustness test. skill-creator: not built-in. Enact: Seleznov-derived eval condition.
  • Execution-fidelity test. skill-creator: not built-in. Enact: Bara-derived (Enact-original methodology).
  • Spec-drift monitoring. skill-creator: none. Enact: BG-A Sync polls daily, re-validates library.
  • Post-install monitoring. skill-creator: none. Enact: BG-B Monitor tracks fire rates, missed triggers.
  • During-generation checklist. skill-creator: eval after generation. Enact: 86-item checklist auto-validated and visible during generation.

Cross-platform claim

Enact-generated skills follow the open standard adopted by 36+ tools: Anthropic, Google, OpenAI, Microsoft, JetBrains, ByteDance, Snowflake, Databricks, Block. "Built on the agentskills.io open standard" is defensible copy.

The four gaps Enact addresses

  1. Domain expertise gap. Source-driven generation imports expertise from speakers/authors who have done the work. Doesn't fabricate it from user Q&A.
  2. Activation reliability gap. Hardcoded directive template + measured trigger rate + hook-robustness test.
  3. Execution fidelity gap. Bara-derived check (a new dimension nobody else measures). Open research opportunity.
  4. Discovery / quality signal gap. Library tracks per-skill metrics. Stars don't predict quality.

Strategic risks to monitor

  • Anthropic ships their own DESIGN.md generator. Their engineering blog already signals the direction: "enable agents to create, edit, and evaluate Skills on their own."
  • Claude Design's built-in org-design-system displaces DESIGN.md uploads at the Team/Enterprise tier.
  • code-execution beta version drift between API ref and cookbook.
  • Directive saturation (Seleznov flagged future failure mode).
  • advisor-tool-2026-03-01 — beta with no docs found yet.
  • ZDR non-eligibility blocks some enterprise prospects.
  • Open standard evolves and breaks compatibility.

Three honest claims for the landing page

  1. Eighty-six checks. Every refusal explained. Drift detected within hours.
  2. We measure both activation and execution. Skills that fire 100% can still skip silent verification steps. Most tools measure neither.
  3. Built on Stitch's open DESIGN.md format and the agentskills.io open standard. Compatible with Claude Code, Claude Design, Cursor, Codex, and more.

What to correct (Enact-specific)

Misconception: Enact competes with Anthropic. Enact is symbiotic with Claude Design and Claude Code; we supply the fuel.

Misconception: eval is the moat. Methodology + scope is the moat; Anthropic ships eval too. We measure things they don't.

Misconception: "100% perfect skills every time." Provably false. The honest claim is "every check shown, every refusal explained."

Sources for this module. Synthesis across all 42 sources. See SOURCE_NOTES.md "Implications for Enact" subsections per source.

When you're ready

Take the quiz.

Fifty questions. You self-grade. The point is to know what you don't know yet.

Begin the quiz