The Complete Guide to Claude Code Skills in 2026

What Claude Code skills are, how they differ from slash commands and agents, and how ClaudeKit's 5 kits (82,197 tokens measured) use them to ship real work.

A Claude Code skill is a folder containing a SKILL.md file that teaches Claude how to do one specific job, loaded into context only when the model decides the job is relevant. You install skills once; the model reaches for the right one automatically based on a description you write. No invocation required. ClaudeKit ships 19 skills across 5 kits, totaling 82,197 measured tokens — and every one of them follows the discipline described in this guide.

This is the canonical walkthrough: anatomy, frontmatter, the description trigger, progressive disclosure, the v2 architecture that replaced reviewer gates, and the mistakes we see most often.

What exactly is a Claude Code skill?

Mechanically, a skill is a directory. At minimum it holds one file:

my-skill/
└── SKILL.md

SKILL.md is Markdown with a YAML frontmatter block at the top. The frontmatter tells Claude Code the skill's name and — critically — when to use it. The Markdown body is the instructions, procedure, examples, and constraints the model follows once the skill is active. Skills live under ~/.claude globally or inside a project, and Claude Code discovers them at startup.
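To make "a skill is a directory" concrete, here is a minimal sketch of what discovery could look like: walk a root folder and keep every directory that directly contains a SKILL.md. This is an illustration only, not how Claude Code implements discovery; the function name `discover_skills` and the demo folders are hypothetical.

```python
from pathlib import Path
import tempfile


def discover_skills(root: Path) -> list[Path]:
    """Return every directory under `root` that directly contains a SKILL.md."""
    return sorted(p.parent for p in root.rglob("SKILL.md") if p.is_file())


# Demo against a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "changelog-writer").mkdir()
    (root / "changelog-writer" / "SKILL.md").write_text(
        "---\nname: changelog-writer\n---\n", encoding="utf-8"
    )
    (root / "notes").mkdir()  # no SKILL.md inside, so it is not a skill
    found = [p.name for p in discover_skills(root)]
    print(found)  # ['changelog-writer']
```

The point of the sketch is the rule, not the code: a folder counts as a skill only if SKILL.md is present.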

The key difference from a slash command or an agent: you do not call a skill directly. The model autonomously decides a skill is relevant by matching your request against the skill's description, then loads the body. We unpack the three-way distinction in our comparison guide; for now, hold onto "skills are model-triggered capabilities."

The Agent Skills open standard, finalized December 18, 2025, codified this pattern and was adopted by 32+ tools in its first 20 days. Today roughly 90,000 skills exist on skills.sh. The ecosystem moved fast, which is why skill quality discipline — specifically the description field — matters more now than it did a year ago.

What is inside a SKILL.md?

Here is a minimal but complete SKILL.md:

---
name: changelog-writer
description: >
  Write a user-facing changelog entry from a list of merged PRs or commits.
  Use when the user asks to draft release notes, a changelog, or "what changed
  this release". Not for internal commit messages.
---
 
# Changelog Writer
 
When invoked, produce a changelog entry that groups changes into
Added / Changed / Fixed / Removed, written for end users, not developers.
 
## Procedure
1. Read the provided PRs or commits.
2. Drop purely internal changes (CI, refactors with no user impact).
3. Group the rest under the four headings, newest first.
4. Lead each line with a verb; keep each under 120 characters.
 
## Output
A Markdown section starting with `## <version> — <date>`.

Two parts: the frontmatter (the contract with the model) and the body (the instructions). Everything below the closing --- is only read when the skill fires. This split is the reason skills stay cheap.
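The split can be sketched in a few lines of Python. This assumes the file opens with a `---` line and that the next `---` line closes the frontmatter; `split_skill` is a hypothetical helper written for this guide, not part of Claude Code or ClaudeKit.

```python
def split_skill(text: str) -> tuple[str, str]:
    """Split SKILL.md text into (frontmatter, body).

    Assumes the first line is '---' and the next '---' line closes the
    frontmatter. Raises ValueError if either delimiter is missing.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' line")
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            frontmatter = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:]).strip()
            return frontmatter, body
    raise ValueError("no closing '---' found")


example = """---
name: changelog-writer
description: Write a user-facing changelog entry.
---

# Changelog Writer

## Output
A Markdown section.
"""

frontmatter, body = split_skill(example)
print(frontmatter)  # the part that acts as the contract with the model
print(body)         # the part that is read only when the skill fires
```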

What does each frontmatter field do?

The frontmatter is small but load-bearing. Fields you will use:

  • name — a short, stable identifier. Lowercase, hyphenated. This is how the skill is referenced internally and how its tokens appear in a cost ledger.
  • description — the trigger. This is the field the model reads to decide whether to load the skill. It gets its own section below because it is that important.
  • allowed-tools (optional) — restrict which tools the skill may use. Read-only auditor skills should never write files; locking this down makes behavior predictable.

Keep the frontmatter tight. Every byte here is always loaded — the name and description sit in context for every installed skill so the model knows the skill exists. A bloated description is a tax you pay on every session.
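A rough way to see that tax is to check each name against the lowercase-hyphenated convention and add up the always-loaded text. The sketch below assumes a crude estimate of about 4 characters per token, which is a rule of thumb and not a real tokenizer; `is_valid_name`, `rough_tokens`, and `always_loaded_cost` are hypothetical helpers.

```python
import re

# Lowercase letters and digits, separated by single hyphens.
NAME_PATTERN = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*")


def is_valid_name(name: str) -> bool:
    """True if `name` is lowercase and hyphenated, e.g. 'changelog-writer'."""
    return NAME_PATTERN.fullmatch(name) is not None


def rough_tokens(text: str) -> int:
    """Crude estimate (assumption): roughly 4 characters per token."""
    return max(1, len(text) // 4)


def always_loaded_cost(skills: list[dict[str, str]]) -> int:
    """Estimate the tokens every session pays for names and descriptions."""
    return sum(rough_tokens(s["name"] + " " + s["description"]) for s in skills)


installed = [
    {"name": "changelog-writer",
     "description": "Write a user-facing changelog entry from merged PRs. "
                    "Use when the user asks for release notes. "
                    "Not for internal commit messages."},
    {"name": "content-brief",
     "description": "Write an answer-first content brief. "
                    "Use when the user asks for a content brief. "
                    "Not for editing already-published copy."},
]

for skill in installed:
    print(skill["name"], is_valid_name(skill["name"]))
print("estimated always-loaded tokens:", always_loaded_cost(installed))
```

The estimate grows linearly with every skill you install, which is why trimming one description matters less than keeping all of them short.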

Why is the description the most important line you will write?

If you take one thing from this guide: the description field decides whether your skill ever runs. Claude Code matches the user's request against every installed skill's description. A vague description means the skill never fires when it should, or fires when it should not.

Write descriptions that are explicit about when to use and, ideally, when not to:

# Weak — too abstract, will not trigger reliably
description: Helps with SEO content.
 
# Strong — names the trigger conditions and the boundary
description: >
  Write an answer-first content brief with target intent, entity coverage,
  SERP gaps, and internal-link slots. Use when the user asks for a content
  brief, an outline to rank, or "what should this page say". Not for editing
  already-published copy — use the on-page optimizer for that.

The strong version names concrete trigger phrases ("content brief", "what should this page say") and draws a boundary. That boundary is what stops two overlapping skills from fighting over the same request. In ClaudeKit's SEOKit, skills like seo-write and seo-audit deliberately carve non-overlapping trigger language so the model routes cleanly.
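The weak/strong contrast can be turned into a small lint. The sketch below is one possible heuristic, not a rule Claude Code applies: it assumes a description should contain an explicit "use when" trigger and a "not for" boundary, and the 12-word minimum is an arbitrary threshold chosen for illustration.

```python
def lint_description(description: str) -> list[str]:
    """Return a list of problems found in a skill description (empty if none)."""
    # Collapse line breaks and repeated spaces so phrase checks are reliable.
    text = " ".join(description.lower().split())
    problems = []
    if len(text.split()) < 12:  # arbitrary threshold, an assumption
        problems.append("too short to name concrete trigger conditions")
    if "use when" not in text:
        problems.append("no explicit 'use when' trigger")
    if "not for" not in text:
        problems.append("no 'not for' boundary")
    return problems


WEAK = "Helps with SEO content."

STRONG = """
Write an answer-first content brief with target intent, entity coverage,
SERP gaps, and internal-link slots. Use when the user asks for a content
brief, an outline to rank, or "what should this page say". Not for editing
already-published copy.
"""

print(lint_description(WEAK))    # three problems
print(lint_description(STRONG))  # []
```

A lint like this cannot tell you whether the model will route correctly; it only catches descriptions that skip the trigger or the boundary entirely.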

How does progressive disclosure keep skills cheap?

The body of a SKILL.md loads in full when the skill triggers. The rule is: put the minimum viable procedure in SKILL.md, and push long reference material into separate files the skill reads only if it needs them.

content-brief/
├── SKILL.md          # ~1,200 tokens: procedure + when to read references
├── references/
│   ├── entity-taxonomy.md   # loaded only when entity coverage is in play
│   └── aeo-block-formats.md # loaded only when writing answer blocks
└── templates/
    └── brief-template.md

The SKILL.md says in effect: "for the entity step, read references/entity-taxonomy.md." If a given run never reaches that step, those tokens are never spent. We measured typical skill bodies across all five ClaudeKit kits and they stay in the 600-1,500 token band precisely because heavy material lives in references. The full methodology is in our token-cost dataset.

Progressive disclosure is not just a cost trick. A 4,000-token wall of instructions buries the one paragraph that matters for the current step. A 900-token core plus targeted references keeps attention where it belongs.
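The arithmetic behind this is simple enough to sketch. The 1,200-token core comes from the layout above; the reference sizes below are hypothetical numbers picked for illustration, not measurements.

```python
CORE_TOKENS = 1_200  # SKILL.md size from the layout above

# Hypothetical reference sizes, for illustration only.
REFERENCE_TOKENS = {
    "references/entity-taxonomy.md": 2_500,
    "references/aeo-block-formats.md": 1_800,
}


def run_cost(core: int, references: dict[str, int], files_read: list[str]) -> int:
    """Tokens spent by one run: the core plus only the references it read."""
    return core + sum(references[name] for name in files_read)


def monolithic_cost(core: int, references: dict[str, int]) -> int:
    """Tokens spent if everything were inlined into SKILL.md."""
    return core + sum(references.values())


no_refs = run_cost(CORE_TOKENS, REFERENCE_TOKENS, [])
entity_only = run_cost(CORE_TOKENS, REFERENCE_TOKENS,
                       ["references/entity-taxonomy.md"])
everything = monolithic_cost(CORE_TOKENS, REFERENCE_TOKENS)

print(no_refs, entity_only, everything)  # 1200 3700 5500
```

Under these assumed sizes, a run that never reaches the entity step pays 1,200 tokens instead of 5,500.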

How do ClaudeKit's 5 kits use skills?

ClaudeKit ships five production kits as of June 2026, each with a measured token footprint. Here is the full breakdown:

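| Kit | Commands | Skills | Agents | Tokens |
| --- | --- | --- | --- | --- |
| EngineerKit (/eng) | 25 | 4 | 4 | 20,413 |
| MarketingKit (/mkt) | 20 | 3 | 2 | 16,714 |
| SEOKit (/seo) | 19 | 4 | 2 | 16,004 |
| EcomKit (/ecom) | 20 | 3 | 2 | 16,464 |
| VideoKit (/video) | 17 | 5 | 3 | 12,602 |
| Totals | 101 | 19 | 13 | 82,197 |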
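As a quick sanity check, the per-kit figures above do add up to the stated totals. The snippet below only re-adds the numbers from the table; it does not measure anything.

```python
# (commands, skills, agents, tokens) per kit, copied from the table above.
kits = {
    "EngineerKit": (25, 4, 4, 20_413),
    "MarketingKit": (20, 3, 2, 16_714),
    "SEOKit": (19, 4, 2, 16_004),
    "EcomKit": (20, 3, 2, 16_464),
    "VideoKit": (17, 5, 3, 12_602),
}

# zip(*rows) turns the per-kit rows into per-column tuples.
totals = tuple(sum(column) for column in zip(*kits.values()))
print(totals)  # (101, 19, 13, 82197)
```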

Every kit follows the same architecture: commands (slash workflows you invoke) plus skills (knowledge the model auto-loads) plus read-only specialist agents (reviewer, auditor, researcher). No orchestrator agents. No blocking reviewer gates. No runnable Python tools.

EngineerKit: the daily-8 workflow

EngineerKit's 4 skills support a daily-8 command sequence: catchup, plan, tdd, debug, verify, review, commit, handoff. The flagship is /eng debug — it runs root-cause-first diagnosis, not symptom triage. The skills auto-load context about code conventions, debugging heuristics, and commit standards when relevant. At 20,413 tokens it is the heaviest kit; that weight buys you a complete engineering workflow with no context-setup overhead.

MarketingKit: voice and humanize as flagships

MarketingKit's two flagships are /mkt voice (builds a voice file from your real posts) and /mkt humanize (strips 14 measurable AI tells from copy). The 3 skills auto-load brand context, tone calibration, and format patterns. /mkt repurpose turns one piece of content into 5 formats; /mkt calendar plans a multi-week schedule from a product brief. At 16,714 tokens it is the second largest kit.

SEOKit: citations and quick-wins

SEOKit's two flagships are /seo quick-wins (surfaces positions 8-20 and low-CTR pages from GSC data) and /seo citations (runs N AI-citation measurements with confidence intervals). With AI Overviews now on 48% of Google queries (March 2026, up from 34.5% December 2025) and 47% of AIO citations coming from below position 5, the citations command fills a real gap in standard tooling. The 4 SEO skills auto-load algorithm context, entity coverage rules, and extractable-format patterns.

EcomKit: store triage and margin defense

EcomKit's flagship is /ecom no-sales: it runs a store triage against AOV-band benchmarks so you know whether low revenue is a traffic problem, a conversion problem, or a margin problem before touching anything. Supporting commands include /ecom cart-recovery, /ecom amazon, /ecom margin, and /ecom bfcm. The 3 skills auto-load pricing psychology, platform-specific constraints, and benchmark data.

VideoKit: clone a reference style

VideoKit's flagship is /video clone — it recreates a reference video's style in Remotion and verifies the match. The 5 skills auto-load Remotion component patterns, caption formatting rules, and platform dimension specs. Supporting commands: /video make, /video demo, /video caption, /video social, /video data. At 12,602 tokens it is the lightest kit.

What changed from v1 to v2 architecture?

The biggest architectural change in v2 was removing the reviewer-gate pattern. In v1, commands ended by handing off to a reviewer or quality-gate agent that blocked output until it approved. This created latency, unpredictable behavior, and a false sense of quality control.

In v2, every command ends with EVIDENCE (a report, a diff, or a verified file), not a reviewer gate. The read-only specialist agents (auditor, researcher) exist to gather information, not to block output. You see the result; you decide if it passes. We describe this shift in detail in our post on why we killed the reviewer gate.

The practical impact:

  1. Commands complete in a single pass, not two or three.
  2. Output is more predictable: every run ends in the same kind of evidence artifact instead of a variable number of review rounds.
  3. You retain control instead of delegating approval to an agent with opaque criteria.
  4. Skills stay read-only by default; the allowed-tools field enforces it.

How do you install and manage ClaudeKit skills?

Installation takes two commands:

ck auth <your-license-key>
ck install engineer   # or marketing, seo, ecom, video

Installing globally adds the kit to ~/.claude. Pass --local to install into the current project instead. A token ledger prints on every install so you see exactly what you are adding to context. You can also install via the plugin marketplace:

/plugin marketplace add Madni-Aghadi/claudekit-engineer

Three management commands worth knowing:

  • ck tokens <kit> — recount tokens for a kit (run after any edit)
  • ck doctor — diagnose install issues (missing files, path conflicts)
  • ck list — show your current entitlements

The CLI is published on npm as claudekits (v0.1.3). Each license covers 3 devices.

What folder layout should you use for your own skills?

A maintainable skill folder beyond the minimum:

my-skill/
├── SKILL.md          # required
├── references/       # long docs read on demand
├── templates/        # output scaffolds the skill fills in
└── scripts/          # optional helper scripts the skill may run

Three conventions worth adopting:

  1. One job per skill. If your description needs the word "and" twice, it is probably two skills.
  2. Name the skill after the job, not the tool. traffic-drop-diagnose, not gsc-helper.
  3. Reference files are nouns; the skill is a verb. The skill does something; references are the data it consults.
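If you create skills often, the layout above is easy to scaffold. This is a minimal sketch under stated assumptions: `scaffold_skill` and the starter template are hypothetical, and the demo writes into a temporary directory, not a real skills folder.

```python
from pathlib import Path
import tempfile

# Starter SKILL.md; the TODO text is a placeholder you are expected to replace.
SKILL_TEMPLATE = """---
name: {name}
description: >
  TODO: say what the skill does, when to use it, and what it is not for.
---

# {name}

## Procedure

## Output
"""


def scaffold_skill(parent: Path, name: str) -> Path:
    """Create <parent>/<name>/ with SKILL.md, references/, templates/, scripts/."""
    skill_dir = parent / name
    for sub in ("references", "templates", "scripts"):
        (skill_dir / sub).mkdir(parents=True, exist_ok=True)
    skill_file = skill_dir / "SKILL.md"
    if not skill_file.exists():  # never overwrite an existing skill
        skill_file.write_text(SKILL_TEMPLATE.format(name=name), encoding="utf-8")
    return skill_dir


with tempfile.TemporaryDirectory() as tmp:
    created = scaffold_skill(Path(tmp), "traffic-drop-diagnose")
    layout = sorted(p.relative_to(created).as_posix() for p in created.iterdir())
    print(layout)  # ['SKILL.md', 'references', 'scripts', 'templates']
```

Starting from a template that already has an Output heading makes it harder to forget the output contract later.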

What are the most common skill-building mistakes?

From reviewing our own skills and hundreds of community submissions:

  1. Vague descriptions. The number-one reason a skill "does not work" is that it never triggers. Fix the description before you touch the body.
  2. Everything in SKILL.md. A 3,500-token body that should be a 900-token core plus references. You pay the full body on every trigger.
  3. Overlapping triggers. Two skills whose descriptions both claim the same request. The model picks unpredictably. Draw explicit boundaries with "not for..." language.
  4. No output contract. The body explains the procedure but never says what the skill should produce. Always specify the output shape — format, structure, what "done" looks like.
  5. Skill that should be a slash command. If the job is a multi-step workflow you always invoke explicitly, it belongs as a command, not a skill. See the agents vs skills vs commands guide.
  6. Ignoring the token count. Building a skill without measuring its tokens. Run ck tokens and treat anything over 1,800 tokens as a candidate for progressive disclosure.
  7. Reviewer gate in the body. Writing "have a reviewer agent check this" at the end of a skill body. V2 ends with evidence artifacts, not approval gates.
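Three of these mistakes (2, 4, and 7) can be caught mechanically. The sketch below assumes a crude 4-characters-per-token estimate, an "## Output" heading as the marker of an output contract (as in the example SKILL.md earlier), and a simple keyword match for reviewer gates. `audit_body` is a hypothetical helper, not a ClaudeKit command.

```python
TOKEN_BUDGET = 1_800  # threshold suggested in the list above


def rough_tokens(text: str) -> int:
    """Crude estimate (assumption): roughly 4 characters per token."""
    return len(text) // 4


def audit_body(body: str) -> list[str]:
    """Return findings for a SKILL.md body (empty list means nothing flagged)."""
    lowered = body.lower()
    findings = []
    if rough_tokens(body) > TOKEN_BUDGET:
        findings.append("body over budget: move heavy material into references/")
    if "## output" not in lowered:
        findings.append("no output contract: add an Output section")
    if "reviewer" in lowered:
        findings.append("possible reviewer gate: end with evidence instead")
    return findings


lean = "# Changelog Writer\n## Procedure\n1. Read the PRs.\n## Output\nA Markdown section."
bloated = "word " * 2000  # about 2,500 estimated tokens, no Output section
gated = lean + "\nFinally, have a reviewer agent check this."

print(audit_body(lean))     # []
print(audit_body(bloated))  # two findings
print(audit_body(gated))    # one finding
```

The keyword check will also flag bodies that mention reviewers for legitimate reasons, so treat its output as a prompt to look, not a verdict.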

How do skills interact with agents in v2?

In ClaudeKit v2, agents are read-only specialists: they audit, research, and report. They do not run other commands, they do not block output, and they do not write files unless explicitly instructed by a command. Skills are the knowledge layer; agents are the research layer; commands are the workflow layer.

When you run /seo audit, the command orchestrates the seo-audit skill (auto-loaded knowledge) alongside a read-only auditor agent (gathers data, writes a report). The output is a markdown audit report — evidence you can act on. No approval queue. No second pass. The three layers stay separate because mixing them is where v1 went wrong.

"Claude Code specialist" job demand grew 938% between January and June 2026. The teams hiring for that role are building exactly this kind of layered architecture at scale. Getting the skill/command/agent distinction right from the start saves significant refactoring.

Putting it together

A good skill is small, single-purpose, triggered by a precise description, and structured so its heavy material loads only when needed. That combination is what lets five kits ship 101 commands, 19 skills, and 13 read-only agents (82,197 tokens in total) without drowning any session in context.

If you are building your own skills, start with the description, keep the body lean, measure before you ship, and end every command with evidence. If you want a production-ready baseline, the ClaudeKit pricing page shows all five kits from $14.99/month per kit (14-day refund, 3 devices per license). Pick the kit closest to your current bottleneck and read its token ledger before you install — you will understand exactly what you are adding to context before you commit.


FAQ

What is a Claude Code skill?

A Claude Code skill is a folder containing a SKILL.md file that teaches Claude how to do one specific job. Claude loads the skill's instructions into context on demand, only when the model decides the job is relevant based on the skill's description field. You do not invoke skills by name — the model routes to them automatically.

What is the difference between a skill and a slash command?

A slash command is something you invoke explicitly by typing /command; a skill is something the model invokes automatically when its description matches your request. Skills are model-triggered knowledge; slash commands are user-triggered workflows that often orchestrate skills and read-only agents underneath them.

Why is my skill not triggering?

Almost always the description is too vague. Claude matches your request against the description to decide whether to load the skill. Rewrite it to name concrete trigger phrases and conditions ("use when the user asks to draft release notes"), and add a boundary for when not to use it. Specificity is the fix in 90% of cases.

How long should a SKILL.md body be?

Keep the body in the 600-1,500 token range for most skills. Push long procedures and reference material into separate files the skill reads only when needed — this is progressive disclosure. Anything over 1,800 tokens is usually a sign you should split heavy material out into a references/ folder.

What happened to FounderKit and SalesKit?

Both were shelved in the v2 rebuild. The v2 lineup is five kits: EngineerKit (/eng), MarketingKit (/mkt), SEOKit (/seo), EcomKit (/ecom), and VideoKit (/video). The old /founder and /sales namespaces are gone. If you were using either kit, MarketingKit covers most of the marketing and positioning workflows; EcomKit covers the commercial operations side.

How do I count tokens for my installed skills?

Run ck tokens <kit> to recount tokens for any installed kit. The token ledger also prints on every fresh install. For custom skills you build yourself, note that cl100k_base is an OpenAI encoding, not Claude's tokenizer, so a cl100k_base count is only an approximation. For exact figures, use a counter built on Anthropic's own tokenizer, such as the token-counting endpoint in the Anthropic API. Treat the token figure as part of the skill's public interface: if the body grows, the cost grows.
