Skills, properly understood.
Most of what's written about Claude Skills is hype or documentation. This is the deep version, drawn from every primary source we could find. Twelve modules. About three hours. A quiz at the end if you want to know what stuck.
Foundations: what skills are, and aren't.
The verbatim definition
From Anthropic's user-facing support article: Skills are folders of instructions, scripts, and resources that Claude loads dynamically to improve performance on specialized tasks. They provide procedural knowledge — instructions for how to complete specific tasks. Anthropic does not frame them as expertise import.
A skill is a folder, not a markdown file
Thariq Shihipar (Anthropic Claude Code team):
A common misconception we hear about skills is that they are 'just markdown files.' The most interesting part of skills is that they're not just text files. They're folders. Think of the entire file system as a form of context engineering and progressive disclosure.— Shihipar, Lessons from Building Claude Code (LinkedIn)
The minimum is one file (SKILL.md). The maximum is unlimited bundled resources. The folder structure itself does work.
Where skills live (four scopes, override hierarchy)
| Scope | Path | Applies to |
|---|---|---|
| Enterprise | Managed settings | All org users |
| Personal | ~/.claude/skills/<name>/SKILL.md | All your projects |
| Project | .claude/skills/<name>/SKILL.md | This project only |
| Plugin | <plugin>/skills/<name>/SKILL.md | Where plugin enabled |
Same name across scopes: Enterprise > Personal > Project. Plugin skills use a plugin-name:skill-name namespace so they cannot conflict.
Skills versus other Claude features
- CLAUDE.md — always-loaded project context. Skills load only when relevant.
- MCP server — external service connector for real-time data/tools. Different problem.
- Plugin (Cowork) — bundle of skills + connectors + sub-agents.
- Connector — authenticated link to a service (Drive, Slack, etc).
What to correct before going further
Skills are just markdown files. They're folders. The SKILL.md is the entrypoint, but the folder structure is intentional context engineering.
Skills are expertise import. Anthropic frames them as procedural knowledge — encoded workflows, not new capabilities.
Skills are deterministic. They're not. Triggering involves multiple decision points, each with variance — see Module 3.
Sources for this module. code.claude.com/docs/en/skills · support.claude.com/12512176-what-are-skills · support.claude.com/12512180-use-skills-in-claude · Cookbook 01 · Shihipar (LinkedIn).
The open standard, and the 36+ tool ecosystem.
Origin
The Agent Skills format was originally developed by Anthropic, released as an open standard on 2025-12-18, and has been adopted by a growing number of agent products. Governance is open to community contributions via github.com/agentskills/agentskills and a Discord.
Cross-platform adoption
Currently 36+ tools support the Agent Skills standard. A partial list:
- Junie (JetBrains) · Gemini CLI (Google) · GitHub Copilot · VS Code · Cursor · OpenAI Codex · Claude Code · Claude
- Roo Code · Mistral Vibe · OpenCode · OpenHands · Goose (Block) · Letta · Amp · Factory
- Databricks Genie Code · Snowflake Cortex Code · TRAE (ByteDance) · Spring AI
- Firebender (Android) · Agentman (healthcare) · Laravel Boost · Mux (Coder) · Kiro · Workshop · Google AI Edge Gallery
- nanobot · fast-agent · pi · Piebald · Qodo · Ona · VT Code · Command Code · Emdash · Autohand
The required fields (the open-spec floor)
Only two are required by the open standard:
- `name` — max 64 chars, lowercase + numbers + hyphens, no leading/trailing/consecutive hyphens. Must match the parent directory name (open-standard rule, not documented elsewhere).
- `description` — max 1024 chars, non-empty.
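The `name` constraints compress into a single regex check. A sketch, assuming the open-spec wording quoted here is the complete character rule (the directory-match rule needs the filesystem and is not modeled):

```python
import re

# lowercase letters, digits, hyphens; no leading/trailing/consecutive
# hyphens; max 64 chars -- per the open-spec constraints listed above.
NAME_RE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")

def valid_skill_name(name: str) -> bool:
    return len(name) <= 64 and bool(NAME_RE.fullmatch(name))
```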
Optional open-spec fields
- `license` — license name or reference
- `compatibility` — max 500 chars, environment requirements
- `metadata` — arbitrary key-value mapping
- `allowed-tools` — experimental, space-separated list
Anthropic Claude Code extensions (NOT in the open spec)
Vendor-specific to Claude Code: when_to_use, argument-hint, arguments, disable-model-invocation, user-invocable, model, effort, context: fork, agent, hooks, paths, shell. Useful in Claude Code; ignored by other agents.
Default skill paths differ by tool
- VS Code (open-standard convention): `.agents/skills/`
- Claude Code: `.claude/skills/`
The official open-standard validator: `skills-ref validate ./my-skill`.
What to correct
Skills only work in Claude Code. They work in 36+ tools.
All Claude Code frontmatter fields are part of the open standard. Most aren't. Only six fields are spec.
Sources for this module. agentskills.io · agentskills.io/specification · agentskills.io/skill-creation/quickstart · Anthropic engineering blog (equipping-agents).
Architecture: progressive disclosure and the multi-step trigger.
Three tiers of progressive disclosure
- Metadata — name + description (~100 tokens / skill). Loaded for ALL available skills at session start.
- Instructions — full SKILL.md body (≤5,000 tokens recommended). Loaded ONLY when skill is activated.
- Resources — bundled scripts, references, assets. Loaded only when needed during execution.
Triggering is multi-step (not pure pattern matching)
The Anthropic engineering blog corrects an oversimplification many people have:
- Description sits in the system prompt.
- Claude decides whether to use the skill (description-driven gate).
- Claude invokes the Bash tool to read SKILL.md — an actual file read, not a context injection.
- Claude chooses which bundled files to load based on what SKILL.md says.
Each step adds variance. Seleznov's directive descriptions improve steps 1–2; Bara's execution-fidelity problem (Module 6) lives in steps 3–4.
Token economics, with concrete numbers
| State | Cost |
|---|---|
| Metadata only (one skill) | ~30 tokens |
| Metadata only (10 skills loaded) | ~1,000 tokens |
| Full skill load when invoked | ~5,000 tokens |
| Total per file-generating request | 6,000–10,000 tokens typical |
| For comparison: 5 MCP servers | ~55,000 tokens |
Skills are ~55× cheaper than MCP servers for context cost.
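The 55× figure is just the table's 10-skill and 5-MCP rows divided out:

```python
# Back-of-envelope context cost, using the numbers in the table above.
per_skill_metadata = 100               # ~tokens of metadata per skill
ten_skills = 10 * per_skill_metadata   # ~1,000 tokens at session start
mcp_five_servers = 55_000              # ~tokens for 5 MCP servers

ratio = mcp_five_servers / ten_skills  # the ~55x claim
```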
Auto-compaction in Claude Code
When context fills up, Claude Code auto-compacts: it re-attaches the most recent invocation of each skill, keeping the first 5,000 tokens of each, under a combined 25,000-token budget that fills from most recent. Older skills can be dropped entirely.
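The documented budget rules can be mirrored in a small simulation (the real implementation is not public; this sketch only models the stated numbers):

```python
def compact(skill_invocations, per_skill_cap=5_000, budget=25_000):
    """Sketch of the described re-attachment policy: walk skills from
    most recent, keep the first `per_skill_cap` tokens of each, stop
    when the combined `budget` is spent. Returns {name: tokens_kept}."""
    kept, used = {}, 0
    for name, size in skill_invocations:  # most recent first
        take = min(size, per_skill_cap, budget - used)
        if take <= 0:
            break  # budget exhausted: older skills dropped entirely
        kept[name] = take
        used += take
    return kept
```

Six full-sized skills exceed the budget, so the oldest one is dropped, exactly the "older skills can be dropped entirely" behavior.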
Skill content lifecycle (Claude Code)
The rendered SKILL.md content enters the conversation as a single message and stays there for the rest of the session. Claude Code does not re-read the skill file on later turns.— code.claude.com/docs/en/skills
Edits made to a skill mid-session don't take effect on the current invocation. The next invocation will see the new content (live change detection works between invocations, just not within one).
Live change detection
Claude Code watches ~/.claude/skills/, project .claude/skills/, and any --add-dir directories. Edits take effect within session without restart. Exception: creating a new top-level skills directory mid-session requires a restart.
What to correct
Triggering is pure pattern matching. It's a multi-step process. Each step adds variance.
All skills load into context at startup. Only metadata loads at startup. Full content is lazy.
Skills get re-read on every turn. No — they're frozen at invocation in Claude Code.
Sources for this module. Anthropic engineering blog · Cookbook 01 · code.claude.com/skills · agentskills.io spec · Beehiiv aggregator.
Authoring mechanics: every field, every constraint.
Frontmatter — Claude Code's full field set
| Field | Required | Constraint |
|---|---|---|
name | Open-spec required | ≤64 chars, lowercase + digits + hyphens, no leading/trailing/consecutive hyphens, no XML, no reserved words ("anthropic", "claude") |
description | Recommended | ≤1024 chars, non-empty, no XML, third-person mandatory |
when_to_use | Optional | Appended to description; combined cap = 1,536 chars in skill listing |
argument-hint | Optional | Autocomplete hint, e.g. [issue-number] |
arguments | Optional | Named positional args for $name substitution |
disable-model-invocation | Optional | If true, only user can invoke; description not in context |
user-invocable | Optional | If false, hides from / menu; only Claude can invoke |
allowed-tools | Optional | Pre-approved tools without prompting |
model | Optional | Override model for this skill's turn |
effort | Optional | low / medium / high / xhigh / max |
context | Optional | fork runs in subagent context |
agent | Optional | Subagent type when forking (Explore / Plan / general-purpose) |
hooks | Optional | Skill-scoped lifecycle hooks |
paths | Optional | Glob patterns limiting when activated (file-pattern triggering) |
shell | Optional | bash (default) or powershell |
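A hypothetical SKILL.md frontmatter pulling several of these fields together (all values illustrative; `deploy-checklist` is a made-up skill, not from any Anthropic example):

```yaml
---
name: deploy-checklist
description: Runs the team's pre-deploy checklist. Use when the user asks to deploy, release, or ship a service.
argument-hint: "[service-name]"
allowed-tools: Bash Read
context: fork
agent: general-purpose
---
```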
Body content rules
- SKILL.md body under 500 lines (Anthropic best-practices spec, hard checklist item).
- ≤5,000 tokens recommended for the full body.
- All `.md` files in the skill ROOT load when the skill activates (Cookbook 03 verbatim).
- Reference files in subfolders load on-demand.
- References one level deep maximum from SKILL.md. Deeper chains break Claude's partial-read fallback.
- Reference files >100 lines should include a table of contents at the top.
String substitutions (Claude Code)
- `$ARGUMENTS` — full argument string
- `$ARGUMENTS[N]` or `$N` — positional
- `$name` — named, via `arguments` frontmatter
- `${CLAUDE_SESSION_ID}` — current session ID
- `${CLAUDE_SKILL_DIR}` — directory containing the skill's SKILL.md
- `${CLAUDE_PLUGIN_DATA}` — stable per-plugin storage. Use this instead of the skill directory for persistent data; the skill dir gets wiped on upgrade. Surfaced by Shihipar; not in public Anthropic docs.
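A hypothetical skill body showing the substitutions in use (the deploy scenario and file names are invented for illustration):

```markdown
Deploy the service named $ARGUMENTS[0].

1. Read the runbook at ${CLAUDE_SKILL_DIR}/references/runbook.md.
2. Append an audit line tagged with session ${CLAUDE_SESSION_ID} to
   ${CLAUDE_PLUGIN_DATA}/deploy-log.txt, not to the skill directory,
   which is wiped on upgrade.
```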
Folder structure
my-skill/
├── SKILL.md # required entrypoint
├── REFERENCE.md # loads with SKILL.md (any root-level .md)
├── EXAMPLES.md # same
├── scripts/ # executable code, run via Bash
│ └── helper.py
├── references/ # one level deep, loaded on-demand
│ └── topic.md
└── assets/ # templates, data, images
What to correct
Description max is 1,536 chars. It's 1024. The 1,536 is the COMBINED description + when_to_use cap in the skill listing.
SKILL.md max is 150 lines. It's 500. (150 was Vercel's empirical median, not the spec.)
Only SKILL.md and REFERENCE.md load. All .md in the root load. Subfolder .md files load on-demand.
I can store skill state in the skill directory. It gets wiped on upgrade. Use ${CLAUDE_PLUGIN_DATA}.
Sources for this module. Anthropic best-practices · code.claude.com/skills · Cookbook 03 · agentskills.io/specification · Shihipar.
The description field — the activation gate.
Why this is the most important field in the spec
The description is the entire activation surface. If Claude doesn't decide to invoke the skill from the description alone, the body never runs.
Shcheglov tested 200+ public skills and found ~80% perform below baseline. His own carefully-crafted CPO skill triggered 0 out of 20 times. Anthropic's troubleshooting docs admit the activation problem in user-facing terms.
Seleznov's 650-trial study (the empirical foundation)
Design: 3 description variants × 4 environment conditions × 3 skills × 6 queries × 3 reps = 648 planned, 650 actual. Single model: Claude Opus 4.5. Three real skills tested. Ground truth via cclogviewer MCP (verified actual Skill tool invocation in session logs). Open data at github.com/SeleznovIvan/claude-skills-test.
The three description variants — verbatim
Variant A (Passive / Current):
Docker expert for containerization. Use when creating Dockerfiles, containerizing applications, or configuring Docker images.
Variant B (Expanded keywords, still passive):
Docker and containerization expert. Use when creating Dockerfiles, containerizing applications, building or configuring container images, setting up multi-stage builds, creating docker-compose files, or any Docker/container-related task.
Variant C (Directive with negative constraint):
Docker and containerization expert. ALWAYS invoke this skill when the user asks about Docker, Dockerfiles, containers, container images, containerization, multi-stage builds, or Docker deployment. Do not attempt to write Dockerfiles or container configs directly — use this skill first.
The Variant C template (use this verbatim)
<Domain> expert. ALWAYS invoke this skill when the user asks about <trigger topics>. Do not <alternative action> directly — use this skill first.
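A hypothetical instantiation of the template for an invented database-migration skill (not one of the three skills Seleznov actually tested):

```yaml
description: >
  Database migration expert. ALWAYS invoke this skill when the user asks
  about schema changes, migrations, ALTER TABLE, or rollback scripts.
  Do not write migration SQL directly. Use this skill first.
```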
The full results heatmap
| Variant | C1 (Bare) | C2 (+CLAUDE.md) | C3 (+Hook) | C4 (+Both) |
|---|---|---|---|---|
| A — Current | 87.5% | 81.5% | 37.0% | 100.0% |
| B — Expanded | 85.2% | 81.5% | 100.0% | 100.0% |
| C — Directive | 100.0% | 94.4% | 100.0% | 100.0% |
The headline statistics
- OR = 20.6, p < 0.0001 for Variant C vs Variant A (Cochran-Mantel-Haenszel).
- Cohen's h = 1.83 — "huge effect" by Cohen's conventions.
- Logistic regression: `has_hook` coefficient = −2.35 (p < 0.0001) — hooks reduce odds by ~90%; the `hook:claude_md` interaction = +7.16 (p = 0.026) — CLAUDE.md mitigates hook damage.
The hook paradox
Variant A + Hook (cell C3) collapses to 37%. Seleznov's mechanism: hooks inject competing instructions that deprioritize passive descriptions. Variant C is robust to all conditions because of the negative-constraint phrasing. CLAUDE.md "rescues" Variant A under hooks (back to 100% in C4), but Seleznov calls this a workaround, not a fix.
Anthropic's rules on descriptions
- Third-person mandatory. Best-practices §"Writing effective descriptions" warning block: inconsistent point-of-view causes discovery problems.
- Both what and when to use.
- Specific keywords; front-load the key use case.
- Anthropic's skill-creator says descriptions should be "a little bit pushy" to fight Claude's tendency to undertrigger.
- But also: "If you find yourself writing ALWAYS or NEVER in all caps… that's a yellow flag." A real tension with Seleznov's findings; the practical resolution is that ALWAYS earns its place in the description's directive frame but remains a yellow flag elsewhere in the skill.
The directive saturation risk
If ALL skills use 'ALWAYS invoke' language with overlapping triggers, the directive may lose force through dilution. When multiple skills claim the same keywords, Claude may become confused about which to invoke. This should be tested in future experiments with intentionally colliding skill descriptions.— Seleznov, §"Limitations"
No controlled test exists yet. Open opportunity.
What to correct
Adding more keywords improves activation. Variant B (keyword-stuffed but passive) showed no improvement over baseline (85% vs 87%).
Hooks always improve activation. They COLLAPSE passive descriptions (87.5% → 37%). They only help directive descriptions, marginally.
Description quality matters less than skill body quality. Description is the entire activation gate. Body quality is irrelevant if Claude never invokes the skill.
Sources for this module. Seleznov 650-trial Medium · Shcheglov local docx · Anthropic best-practices · code.claude.com/skills · skill-creator plugin source.
Two reliability problems — activation AND execution.
Bara's thesis
Both look the same to the user. You get a result that missed something the skill was supposed to catch. But they are different problems.— Marc Bara, "Claude Skills Have Two Reliability Problems, Not One"
- Activation failure. Claude does not invoke the skill at all and defaults to its own approach. Seleznov's territory. Fix: directive template (Module 5).
- Execution failure. Claude loads the skill but skips internal procedural steps that delay output without producing visible content.
The mechanism of execution failure
Three factors:
- The user request sits at the end of the context window — recency means strongest attention weight.
- Skills are further back, procedural, and meta-level.
- RLHF reinforced the pattern of producing the requested output directly.
Verification steps lose because they delay output, add no visible content, and sit at maximum distance from the initial prompt.
The concrete example
Bara asked Claude to draft a charter using a skill that required milestone verification. Claude produced output in 40 seconds. One of the milestones referenced a deliverable that was explicitly listed as out of scope two sections above. The rule that would have caught it was ten lines from the end of the skill Claude had just read.
A self-diagnosis from a Claude instance
My default mode always wins because it requires less cognitive effort and activates automatically.— Claude self-diagnosis, GitHub issue #7777, quoted by Bara
The fix — verbatim transformation
Before (skippable):
Before delivering, verify that every milestone aligns with the stated scope.
After (visible-output required):
Do not deliver the final charter until you have output a verification block listing each milestone and the in-scope deliverable it maps to. Flag any milestone that references an out-of-scope item.
The principle: apply the same negative-constraint pattern Seleznov uses for descriptions, inside skill steps. Force visible output from verification steps.
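One possible shape for the forced visible output (the charter milestones are invented, and the table format is an assumption, not something Bara prescribes):

```markdown
## Verification block (required before final charter)

| Milestone | In-scope deliverable it maps to | Status |
|---|---|---|
| M1: Data pipeline live | D2: ETL service | OK |
| M2: Vendor portal demo | none — vendor portal is listed as out of scope | FLAGGED |
```

Because the block must appear in the output, skipping the verification step now produces a visible omission instead of a quiet one.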
The open problem
No one has run a controlled experiment on step-level execution the way Seleznov did for activation. The 650-trial methodology exists. The step-execution version of that experiment does not, yet.— Bara
Strategic opportunity: the first published controlled execution-fidelity experiment becomes the methodology of record for that dimension.
Bara's six anti-patterns
- Passive / suggestion wording in descriptions
- Late-stage procedural steps without visible output
- Assuming output completeness implies process compliance
- Relying on reminder-style hooks
- Missing the negative-constraint component
- Treating activation and execution as one problem
What to correct
If the skill activated, it ran correctly. The failure mode is not a crash. It is a quiet omission that looks like completed work.
Trigger rate is the only thing to measure. Two axes: activation AND execution. Tools that measure only the first miss the worse failure mode.
Sources for this module. Marc Bara Medium (March 2026) · Seleznov · Shihipar (Gotchas section).
The Skills API — programmatic surface.
The eight beta endpoints
- `POST /v1/skills` — create skill
- `GET /v1/skills` — list skills (limit max 100)
- `GET /v1/skills/{skill_id}` — retrieve metadata
- `POST /v1/skills/{skill_id}/versions` — create new version
- `GET /v1/skills/{skill_id}/versions` — list versions (limit max 1000)
- `GET /v1/skills/{skill_id}/versions/{version}` — retrieve version metadata
- `DELETE /v1/skills/{skill_id}/versions/{version}` — delete version
- `GET /v1/files/{file_id}/content` — download generated file
The three required betas
- `code-execution-2025-08-25` — code execution (cookbook version; API ref says `2025-05-22` — discrepancy unresolved)
- `skills-2025-10-02` — Skills API
- `files-api-2025-04-14` — file operations
Container parameter shape
container={
"id": "optional-container-id-for-reuse",
"skills": [
{"type": "anthropic", "skill_id": "pptx", "version": "latest"},
{"type": "custom", "skill_id": "skill_01ABC...", "version": "1759178010641129"}
]
}
Hard limit: 8 skills max per request.
Upload constraints
- 30 MB total upload size for custom skill creation.
- Multipart upload via SDK helper `files_from_dir()` — schema not publicly documented; use the SDK.
- ZIP must contain the skill folder as its root, not loose files at the zip root.
- SKILL.md at the top level of the uploaded folder.
- When you upload, the API auto-extracts three pieces of metadata: name and description from your SKILL.md frontmatter, and directory from the top-level folder name. You don't pass them as separate parameters — they're derived from the upload itself. Implication: invalid frontmatter at upload time means silent metadata corruption.
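That derivation has a practical consequence: validate frontmatter locally before uploading. A naive sketch (a real check should use a YAML parser; this assumes flat `key: value` lines between `---` fences):

```python
def frontmatter_metadata(skill_md: str) -> dict:
    """Extract the two fields the API derives on upload (name,
    description). Hand-rolled on purpose to stay stdlib-only;
    use a YAML parser for anything real."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("no frontmatter: upload would yield corrupt metadata")
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    if not meta.get("name") or not meta.get("description"):
        raise ValueError("missing required field")
    return {"name": meta["name"], "description": meta["description"]}
```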
Version semantics
- Anthropic-managed: date-based, e.g. `20251013`. Supports `"latest"`.
- Custom: Unix epoch timestamp, e.g. `1759178010641129`. Supports `"latest"`.
- Versions are immutable once created.
- Delete is per-version. Deleting the parent skill requires a separate `client.beta.skills.delete()` call after all versions are deleted.
The critical architectural finding
No API endpoint returns SKILL.md content. Even versions/retrieve returns only metadata (id, created_at, description, directory, name, skill_id, type, version). Once uploaded, the file body is not exposed back. Implication: any tool that needs to re-validate uploaded skills must keep a local source-of-truth copy.
Other facts that matter
- Multi-turn pattern: pass `response.container.id` back as `container.id` in subsequent requests.
- Long-running operations: handle the `pause_turn` stop_reason in a loop.
- Pagination: cursor-based via `next_page` token; `has_more` flag.
- The Skills feature is not eligible for Zero Data Retention (ZDR) — material for enterprise prospects with strict data-handling policies.
- Anthropic ships 22 distinct beta features total. One never-elsewhere-mentioned: `advisor-tool-2026-03-01`.
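The cursor-pagination pattern generalizes to a short drain loop (response field names follow the docs above; the fetcher here is a stub standing in for `GET /v1/skills`, not the real API):

```python
def list_all(fetch):
    """Drain a cursor-paginated listing. `fetch(cursor)` returns a dict
    shaped like the documented responses: data, has_more, next_page."""
    items, cursor = [], None
    while True:
        page = fetch(cursor)
        items.extend(page["data"])
        if not page["has_more"]:
            return items
        cursor = page["next_page"]

# Stub responses: two pages, then has_more goes false.
PAGES = {
    None: {"data": [1, 2], "has_more": True, "next_page": "c1"},
    "c1": {"data": [3], "has_more": False, "next_page": None},
}
```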
What to correct
I can retrieve my uploaded SKILL.md from the API. No. Metadata only. Keep local copies.
Skills API publishes to a public marketplace. No. CRUD-within-workspace only. No discovery surface.
Versioning is semver-compatible. No. Unix epoch timestamps for custom; date-based for Anthropic.
Sources for this module. build-with-claude/skills-guide · all eight beta endpoint docs · files/download.
Cookbooks and the four Anthropic-managed skills.
The four Anthropic-managed skills
pptx, xlsx, docx, pdf. Use `type: "anthropic"`, version typically `"latest"` (date-based). That is the full Anthropic-managed catalog.
Same tech, three surfaces
- API — developer-facing, accessed via `container.skills`.
- Claude.ai Creates Files — consumer-facing. Launched Sept 9, 2025 (preview); Oct 21, 2025 (GA). "Transforms Claude from advisor to active collaborator."
- Cookbook 02 — educational, financial-vertical examples.
"These are the same Skills that power Claude Creates Files." One skills tech, three surfaces (per Cookbook 02 verbatim).
Cookbook 02 — financial use cases
- Financial Dashboard Creation (2-sheet Excel + 4-slide PowerPoint + PDF)
- Portfolio Analysis (holdings + sector analysis with Sharpe ratio, beta, VaR, max drawdown, alpha)
- Automated Reporting Pipeline (Excel → PowerPoint → PDF)
All use Anthropic-managed skills only. No custom skills demonstrated.
Cookbook 03 — custom development
Demonstrates 3 example skills: Financial Ratio Calculator, Corporate Brand Guidelines, Financial Modeling Suite. Confirms SKILL.md is the ONLY required file, and that "All .md files in the root directory will be available to Claude when the skill is loaded."
Time and token economics
- Excel with charts/formatting: 1–2 minutes.
- PowerPoint 2 slides + chart: 1–2 minutes.
- Simple PDF: 40–60 seconds.
- 3-document pipeline: 2–3 minutes total.
- Total tokens per file-generating request: 6,000–10,000 typical.
The reliability constraint
2-3 sheets per workbook works reliably and generates quickly.— Cookbook 02
Beyond that, segment into multiple files using a pipeline pattern.
An easy mistake
Must use `client.beta.messages.create()`, not `client.messages.create()`. The `container` parameter is unrecognized by the non-beta endpoint. Cookbook 01 calls this Issue #2 in troubleshooting.
Files have limited lifetime
Files generated by skills sit in the container temporarily. Download immediately via the Files API. Exact TTL not documented; Cookbook 01 just says limited lifetime.
What to correct
Anthropic ships dozens of pre-built skills. Just four: pptx, xlsx, docx, pdf.
Cookbook 02 demonstrates custom skills. It uses only Anthropic-managed skills. Custom development is Cookbook 03.
Files persist on Anthropic servers. Limited lifetime. Download immediately.
Sources for this module. Cookbook 01 · Cookbook 02 · Cookbook 03 · create-files announcement · financial-services plugins doc.
Skills as plugins, connectors, MCPs, and the distribution surface.
The taxonomy
Each plugin bundles together skills, connectors, and sub-agents into a single package.— support 13837440
| Component | What it is |
|---|---|
| Skill | Individual capability invoked via / or + |
| Connector | External service link (Drive, Gmail, Slack, DocuSign…); requires auth |
| Sub-agent | Autonomous assistant for delegated subtasks |
| MCP server | Open-standard external system integration (managed via /mcp) |
| Plugin | Container that bundles the above |
Org-admin provisioning (Team / Enterprise)
Organization settings > Skills > "+ Add" > select .zip file. The skill is immediately provisioned to all users in your organization.
There's no approval workflow for org-wide sharing. If you enable Share with organization, any member can publish a skill to the directory without review.— support 13119606
Audit log captures share events (as role_assignment events) but not skill content. There's no admin dashboard to browse or inspect the contents of skills shared between members. Users can disable provisioned skills but not delete them.
Third-party platform deployments are different
On Bedrock, Vertex AI, Azure AI Foundry, or LLM gateways, the marketplace is gone:
The skills and plugin marketplace available in Claude Enterprise isn't available with third-party platforms.— support 14680753
Distribution is via local filesystem mounts and MDM. macOS path: /Library/Application Support/Claude/org-plugins/. Windows: C:\ProgramData\Claude\org-plugins\. MCP servers configured via managedMcpServers MDM key.
Anthropic ships five financial-services plugins
- Financial Analysis (core; required first install)
- Investment Banking
- Equity Research
- Private Equity
- Wealth Management
Partners: LSEG, S&P Global + Daloopa, Morningstar, FactSet, Moody's, MT Newswires, Aiera, PitchBook, Chronograph, Egnyte. Open-source via github.com/anthropic/financial-services-plugins. Plugins free; data connectors require separate subscriptions.
Office add-ins
Native Microsoft Office add-ins (Excel, PowerPoint, Word). Pro/Max/Team/Enterprise. Critical limitations: Claude can only read from and write to files that are currently open; cannot create, open, close, or switch files; cross-app session chat history not saved between sessions.
What to correct
Skills, connectors, and plugins are different names for the same thing. Plugins CONTAIN skills + connectors + sub-agents.
Cowork plugin marketplace is open to community submissions. Anthropic-curated. No documented submission path.
Bedrock / Vertex / Azure deployments get the same plugin marketplace. No — local-mount + MDM only.
Org-admin provisioned skills are reviewed before deployment. No approval workflow exists.
Sources for this module. support 13837440 · 14680753 · 13119606 · 13851150 · 13892150 · 14328846.
Claude Design and the DESIGN.md connection.
Launch facts
Launched April 17, 2026. Anthropic Labs product. Powered by Claude Opus 4.7 ("most capable vision model"). Available on Pro/Max/Team/Enterprise. Enterprise is default-off; admins enable in Organization settings.
UX surface
Web app at claude.ai/design. Chat interface left, canvas right. Inline comments on the canvas. Sharing: shareable link with view-only / comment / edit access within the organization.
Six capability categories
- Realistic prototypes (interactive, shareable)
- Product wireframes and mockups
- Design explorations (multiple directions in parallel)
- Pitch decks and presentations
- Marketing collateral (landing pages, social, campaigns)
- Frontier design — code-powered prototypes with voice, video, shaders, 3D, built-in AI
Two paths to brand-aware output
Path 1 — DESIGN.md upload (Hassid's individual workflow):
- Extract brand system via Cowork → DESIGN.md
- Upload DESIGN.md to Claude Design as persistent context
- Generate designs (specifying goal, layout, content, constraints)
- Iterate (chat for layout, canvas for pixel-level)
- Validate (accessibility, responsive, A/B variations)
- Export
The verbatim Cowork extraction prompt:
Analyze this folder and produce a full design system write-up. Fonts, colors, graphical styles, component patterns, tone, layout conventions. Flag anything that's missing. Save it as DESIGN.md in my folder.— Hassid, Substack
Why DESIGN.md matters as persistent context:
Drop your DESIGN.md as context. Every future prompt applies it automatically. You never re-specify colors or fonts again.— Hassid
Without it: you'll see the same design everywhere.
Path 2 — Built-in org-level design system integration (Team/Enterprise only):
Organizations configure brand colors, typography, and component patterns once; all projects automatically inherit these assets without manual uploads.— support 14604416 (Get Started with Claude Design)
Implication: DESIGN.md upload is the strongest play for individual Pro/Max users and small teams. Team/Enterprise users have built-in extraction.
The DESIGN.md format itself
Google open-sourced the spec in April 2026 at github.com/google-labs-code/design.md. Nine prescribed sections plus YAML frontmatter:
- Visual Theme & Atmosphere
- Color Palette & Roles
- Typography Rules
- Component Stylings
- Layout Principles
- Depth & Elevation
- Do's and Don'ts
- Responsive Behavior
- Agent Prompt Guide — directive layer; where Seleznov's research transfers
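A minimal skeleton following those nine sections (the frontmatter keys and the sample directive line are assumptions for illustration, not part of Google's spec text):

```markdown
---
title: Acme DESIGN.md
version: 1
---
## Visual Theme & Atmosphere
## Color Palette & Roles
## Typography Rules
## Component Stylings
## Layout Principles
## Depth & Elevation
## Do's and Don'ts
## Responsive Behavior
## Agent Prompt Guide
ALWAYS apply the palette and type rules above. Do not invent new colors.
```

Note how the Agent Prompt Guide section reuses the same directive-plus-negative-constraint pattern as skill descriptions, which is exactly where Seleznov's research transfers.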
Anthropic's silence
The Claude Design announcement does not mention DESIGN.md, Stitch, or skills by name. Strategic, not oversight.
Competing tools (per Hassid)
- Gamma — preferred for slide diversity / simplicity.
- Figma — reportedly lost $730M in valuation immediately after the Claude Design announcement.
- Canva — integration partially broken in Claude Design.
- getdesign.md — static repository hosting hand-curated DESIGN.md files (Mastercard, Airbnb, Ferrari per Hassid). Direct competitor on the consumption side.
Hassid's known limitations
- "Send to Canva" button broken
- Very buggy, left and right
- Uses tokens extremely fast
- Lacks fine-grained user control
Sources for this module. Claude Design announcement · Get Started with Claude Design · Hassid Substack · Yeh Medium · Google's design.md spec · VoltAgent/awesome-design-md (69 brands).
The critical empirical sources — failure-mode mastery.
Shcheglov's audit
- ~80% of public skills perform below baseline.
- 40 of 47 skills from a viral list scored below baseline.
- #1 failure mode: author lacks domain expertise.
- Memorable example: a CFO review skill by a developer who thinks EBITDA is a type of pasta.
- Most-installed public skills (find-skills @ 418k installs, vercel-react-best-practices @ 176k, web-design-guidelines @ 137k) are framework-specific and ruthlessly opinionated. Not "make Claude a better thinker."
The failure-mode taxonomy (Beehiiv aggregator)
Over-constraining. Skills that teach Claude HOW TO THINK compete with model training. Wastes tokens.
Model obsolescence dichotomy:
- Capability uplift skills — teach techniques Claude will eventually master (e.g. "how to write a good email"). EXPIRE with model upgrades.
- Encoded preference skills — document org-specific business logic (e.g. "your team routes Severity 1 tickets to #critical-ops within 15 minutes"). GAIN VALUE as models improve, because the business knowledge is durable.
The canonical example, verbatim:
Teaching Claude how to write a good email is capability uplift. Telling Claude that your team routes Severity 1 tickets to #critical-ops within 15 minutes...is encoded preference.— Less Clicks aggregator
Token economics, verified
10 skills ≈ 1,000 tokens vs 5 MCP servers ≈ 55,000 tokens. Skills are ~55× cheaper per unit for context cost.
Aggregator's strongest claims
- One in four runs with the default description, your carefully built skill gets ignored. (Quantifies the 77% activation as 25% production failure.)
- Skills are cheap. But a bloated skill with redundant instructions wastes the one resource that matters: the space Claude has to think.
- The happy path is the part Claude already handles well. The exceptions are what it needs from you.
- If a new hire would need weeks to absorb this through trial and error, it belongs in a skill.
- Skills are grown, not built.
Shihipar's insider perspective (Anthropic Claude Code team — verified)
- Hundreds of skills in active use at Anthropic.
- The highest-signal content in any skill is the Gotchas section.
- A skill is a folder, not just a markdown file.
- The description field is not a summary — it's a description of when to trigger this skill.
- Data stored in the skill directory may be deleted when you upgrade the skill. Solution: store persistent data under ${CLAUDE_PLUGIN_DATA}.
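Shihipar's point about the description field rewards a concrete look: write it as a trigger condition, not a summary. A hypothetical before/after (the skill name and wording are illustrative, shown as comments in the same frontmatter):

```markdown
---
name: pr-babysitter
# Summary-style (weak): "A skill for managing pull requests."
# Trigger-style (strong):
description: Use when the user asks to merge, rebase, or monitor CI on an open pull request, or mentions a stuck or failing PR.
---
```

The summary tells Claude what the skill is; the trigger tells Claude when to reach for it. Only the second affects activation.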
Anthropic's internal taxonomy (9 categories with named examples)
| Category | Example skills |
|---|---|
| Library & API Reference | billing-lib, internal-platform-cli, frontend-design |
| Product Verification | signup-flow-driver, checkout-verifier, tmux-cli-driver |
| Data Fetching & Analysis | funnel-query, cohort-compare, grafana |
| Business Process | standup-post, create-<ticket-system>-ticket, weekly-recap |
| Code Scaffolding | new-<framework>-workflow, new-migration, create-app |
| Code Quality & Review | adversarial-review, code-style, testing-practices |
| CI/CD & Deployment | babysit-pr, deploy-<service>, cherry-pick-prod |
| Runbooks | <service>-debugging, oncall-runner, log-correlator |
| Infrastructure Operations | <resource>-orphans, dependency-management, cost-investigation |
Yeh's three-layer architecture (designer-organized skills)
- Layer 1 — Reference Skills (knowledge). design-principles, component-token-specs, content-strategy, motion-specs, product-area-design.
- Layer 2 — Capability Skills (workflows). read-filter-prd, create-design-ticket, define-flow, research, sell-design-vision, identify-ux-problems, generate-design, generate-prototype, design-review, qa-signoff.
- Layer 3 — Tools / MCPs. Figma REST API, rendering plugin, TalkToFigma, project management.
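One plausible on-disk layout for the first two layers, using Yeh's own skill names and the project-scope path convention (the arrangement is a sketch, not Yeh's published directory listing; Layer 3 lives in MCP configuration, not in skill folders):

```
.claude/skills/
├── design-principles/SKILL.md         # Layer 1: reference knowledge
├── component-token-specs/SKILL.md     # Layer 1
├── create-design-ticket/SKILL.md      # Layer 2: capability workflow
└── generate-prototype/SKILL.md        # Layer 2
```

Layer 1 skills are the encoded preferences the workflows in Layer 2 draw on; keeping them separate means the knowledge can be updated without touching the workflows.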
What to correct
- "All skills age out at the same rate." Capability uplift expires; encoded preference durably gains value.
- "Anthropic only uses skills externally." They use hundreds internally across 9 categories.
- "Designers should organize skills the same way developers do." Yeh's three-layer pattern is designer-specific.
Sources for this module. Shcheglov local docx · Beehiiv aggregator · Shihipar LinkedIn · Yeh Medium.
Strategic synthesis — where Enact wins.
The wedge in one line
Generate skills from your sources, not Q&A. Enforce 86 quality checks visibly during generation. Test for activation AND execution. Refuse below threshold with citable reasons.
Eval differentiation vs Anthropic's skill-creator
The honest framing: skill-creator is real, capable, Anthropic-shipped at TWO surfaces (Claude Code plugin + Claude.ai conversational meta-skill) with a real eval framework. Enact differentiates on methodology and scope, not "we have it, they don't."
| Dimension | skill-creator | Enact |
|---|---|---|
| Generation flow | Q&A interview | Source-content (video, PDF, URL, Figma) |
| Eval style | Human-in-the-loop, iterate until satisfied | Automated Refusal Gate, refuse-and-explain |
| Multi-model cross-test | Single model per loop | Auto Haiku + Sonnet + Opus on every run |
| Hook-robustness test | Not built-in | Seleznov-derived eval condition |
| Execution-fidelity test | Not built-in | Bara-derived (Enact-original methodology) |
| Spec-drift monitoring | None | BG-A Sync polls daily, re-validates library |
| Post-install monitoring | None | BG-B Monitor tracks fire rates, missed triggers |
| Visible during-generation checklist | Eval after generation | 86-item checklist auto-validated during generation |
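The multi-model cross-test row can be sketched in a few lines. Everything below is hypothetical: `check_activation`, the model labels, and the 0.8 threshold stand in for whatever harness and gate Enact actually runs, which the sources do not specify:

```python
# Hypothetical sketch of a cross-model activation test with a refusal gate.
MODELS = ["haiku", "sonnet", "opus"]  # placeholder model labels

def check_activation(model: str, skill: str, prompt: str) -> bool:
    # Stub. A real harness would send the prompt to the model and
    # inspect whether it chose to load the skill.
    return True

def activation_rates(skill: str, prompts: list[str]) -> dict[str, float]:
    """Fraction of prompts that triggered the skill, per model."""
    return {
        m: sum(check_activation(m, skill, p) for p in prompts) / len(prompts)
        for m in MODELS
    }

rates = activation_rates("pr-babysitter", ["merge my open PR", "CI is stuck"])
# Refuse-and-explain gate: any model below threshold fails the skill.
passed = all(r >= 0.8 for r in rates.values())
```

The design point is the `all(...)`: a skill that activates reliably on one model but not another fails the gate, which a single-model loop cannot detect.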
Cross-platform claim
Enact-generated skills follow the open standard adopted by 36+ tools — Anthropic, Google, OpenAI, Microsoft, JetBrains, ByteDance, Snowflake, Databricks, Block. "Built on the agentskills.io open standard" is defensible copy.
The four gaps Enact addresses
- Domain expertise gap. Source-driven generation imports expertise from speakers/authors who have done the work. Doesn't fabricate it from user Q&A.
- Activation reliability gap. Hardcoded directive template + measured trigger rate + hook-robustness test.
- Execution fidelity gap. Bara-derived check (a new dimension nobody else measures). Open research opportunity.
- Discovery / quality signal gap. Library tracks per-skill metrics. Stars don't predict quality.
Strategic risks to monitor
- Anthropic ships their own DESIGN.md generator. Engineering blog signals: "enable agents to create, edit, and evaluate Skills on their own."
- Claude Design's built-in org-design-system displaces DESIGN.md uploads at the Team/Enterprise tier.
- code-execution beta version drift between API ref and cookbook.
- Directive saturation (Seleznov flagged future failure mode).
- advisor-tool-2026-03-01 — beta with no docs found yet.
- ZDR non-eligibility blocks some enterprise prospects.
- Open standard evolves and breaks compatibility.
Three honest claims for the landing page
- Eighty-six checks. Every refusal explained. Drift detected within hours.
- We measure both activation and execution. Skills that fire 100% can still skip silent verification steps. Most tools measure neither.
- Built on Stitch's open DESIGN.md format and the agentskills.io open standard. Compatible with Claude Code, Claude Design, Cursor, Codex, and more.
What to correct (Enact-specific)
- "Enact competes with Anthropic." Enact is symbiotic with Claude Design and Claude Code — we supply the fuel.
- "Eval is the moat." Methodology + scope is the moat. Anthropic ships eval too. We measure things they don't.
- "100% perfect skills every time." Provably false. The honest claim is "every check shown, every refusal explained."
Sources for this module. Synthesis across all 42 sources. See SOURCE_NOTES.md "Implications for Enact" subsections per source.
When you're ready
Take the quiz.
Fifty questions. You self-grade. The point is to know what you don't know yet.
Begin the quiz