Claude Code Agents vs Skills vs Slash Commands: When to Use Which
Skills are model-triggered capabilities, agents are read-only specialists with their own context, slash commands are user-triggered workflows. Exactly when to use each.

The short answer: a skill is a capability the model triggers automatically, an agent is a specialist running in its own isolated context (read-only in ClaudeKit v2), and a slash command is a workflow you invoke by name. The decision rule is one question: who decides when it runs? If you decide, it is a slash command. If the model decides, it is a skill. If a parent agent decides and the work needs isolation, it is a subagent. Most real systems compose all three. Everything below expands on why and when.
This is the definitive 2026 answer to the question Claude Code practitioners keep asking. We define each primitive precisely, give a decision table, walk through the real examples inside ClaudeKit's five kits, and cover the composition patterns that make all three work as a system.
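The one-question rule can be written down as a tiny lookup. This is only an illustrative sketch: the `Primitive` enum, the `choose_primitive` function, and the `"user"` / `"model"` / `"parent"` labels are hypothetical names made up here, not part of Claude Code or ClaudeKit. The fallback for a parent that does not need isolation is one reading of the rule, not something the tooling enforces.

```python
from enum import Enum


class Primitive(Enum):
    SLASH_COMMAND = "slash command"
    SKILL = "skill"
    SUBAGENT = "subagent"


def choose_primitive(decider: str, needs_isolation: bool = False) -> Primitive:
    """Map "who decides when it runs?" to a primitive.

    decider is one of "user", "model", or "parent" (hypothetical labels).
    """
    if decider == "user":
        return Primitive.SLASH_COMMAND
    if decider == "model":
        return Primitive.SKILL
    if decider == "parent":
        # Assumption: delegated work that needs no isolation can stay
        # inline as a skill instead of spawning a subagent.
        return Primitive.SUBAGENT if needs_isolation else Primitive.SKILL
    raise ValueError(f"unknown decider: {decider!r}")
```

For example, `choose_primitive("parent", needs_isolation=True)` returns `Primitive.SUBAGENT`, while `choose_primitive("user")` returns `Primitive.SLASH_COMMAND`.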
What exactly is a Claude Code skill?
A skill is a SKILL.md file Claude loads on demand, automatically, when your request matches the skill's description field. You never type its name. The model routes to it the same way a human reaches for a reference — silently, when relevant.
Skills share the main conversation's context window. When triggered, the SKILL.md body (typically 400-900 tokens) loads inline. There is no isolation, no separate process. That makes skills cheap and frictionless but also means the description field has to be precise — a vague description causes the skill to fire at the wrong time or not fire at all.
The right unit of work for a skill is a single, well-bounded capability: "write a hook-first YouTube intro," "diagnose a layout shift," "compute margin at a given AOV." If you find yourself wishing Claude "just knew how" to handle a recurring task without you explaining it each time, that is a skill.
ClaudeKit ships 19 skills across the five kits. MarketingKit's mkt-humanize skill, for example, carries a specific list of 14 AI writing tells and rewrites against them without you asking — it triggers whenever the context is a piece of content that needs human voice. SEOKit's seo-extractable skill loads when content needs to be structured for AI citation. You can see the full inventory in the skills guide.
What exactly is a Claude Code agent?
An agent (subagent) is a separate Claude instance with its own system prompt, its own context window, and a narrow role. A parent orchestrator delegates a task, the agent works in isolation, and it returns a result. The parent's context never sees the agent's intermediate reasoning — only the output.
In ClaudeKit v2, all 13 agents are read-only specialists: reviewers, auditors, and researchers. They never write the primary deliverable. They audit it, score it, or surface data for it. This is a deliberate architectural choice — keeping agents as judges or data-gatherers rather than producers prevents the compounding errors that come from one agent's output becoming another's unreviewed input.
The defining trait of an agent is context isolation. An agent can read 40 files, run a crawl, and reason at length — and the parent only pays for the task prompt in and the structured result out. This is exactly why agents are the right primitive for work that is large, parallel, or benefits from a distinct adversarial perspective.
EngineerKit's review agent, for example, reads a diff against a set of correctness criteria and returns a structured findings list. It has never touched the code it is reviewing. That separation between producer and judge is the whole point. See the SEO architecture post for a worked example of agents fanning out in parallel.
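To make the producer/judge separation concrete, here is a minimal sketch of what "read a diff, return structured findings" could look like. It is not EngineerKit's implementation: the `Finding` shape, the severity labels, and the two checks (bare `except:` and leftover `TODO`) are hypothetical stand-ins for real review criteria. The point it illustrates is that the reviewer only reads its input and reports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    line_no: int   # 1-based line number within the diff text
    severity: str  # "warn" or "error" (hypothetical scale)
    message: str


def review_diff(diff: str) -> list[Finding]:
    """Read a unified diff and report findings; never modifies the input."""
    findings = []
    for line_no, line in enumerate(diff.splitlines(), start=1):
        # Only judge added lines, skipping the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        added = line[1:].strip()
        if added.startswith("except:"):
            findings.append(Finding(line_no, "error", "bare except hides failures"))
        elif "TODO" in added:
            findings.append(Finding(line_no, "warn", "unresolved TODO in new code"))
    return findings
```

The function returns data and nothing else, so whoever called it decides what to do with the findings.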
What exactly is a Claude Code slash command?
A slash command is something you type: /eng debug, /seo quick-wins, /video clone. It is a user-triggered, repeatable workflow. You invoke it; it runs a defined sequence.
Underneath, a well-built slash command is an orchestrator. It sequences skills, delegates to agents, enforces structure, and emits a concrete artifact at the end. The command is the front door; skills and agents are what get called inside.
ClaudeKit ships 101 commands across five kits. Each ends with evidence — a report, a diff, a verified file — not a vague summary. /eng debug produces a root-cause analysis with the fix already applied. /seo quick-wins produces a ranked, prioritized table of positions 8-20 and low-CTR pages. /ecom no-sales produces a triage report benchmarked against AOV-band data. The artifact is the point.
Slash commands are also where you express workflow preferences that do not belong in a skill. The /mkt voice command generates a voice reference file from your real posts. It is a one-time setup step rather than a recurring capability, which makes it a command, not a skill.
Decision table: agents vs skills vs slash commands
| Question | Skill | Agent | Slash command |
|---|---|---|---|
| Who triggers it? | The model, automatically | The parent orchestrator | You, by typing /... |
| Own context window? | No — shares main context | Yes — fully isolated | No — runs in main, may spawn agents |
| Role in ClaudeKit v2 | Reusable capability | Read-only reviewer / auditor / researcher | Multi-step workflow with defined output |
| Token footprint | 400-900 tokens per trigger | Isolated — parent only pays for input + result | Variable; depends on workflow length |
| You invoke by name? | Never | Rarely | Always |
| Example | mkt-humanize (strips AI tells) | SEO audit reviewer | /seo quick-wins |
| Output type | Inline transformation | Structured findings / scored deliverable | Report, diff, verified file |
When should you reach for a skill vs an agent?
The cleanest separator is the context question: does this work need its own context window?
If no — it is a single capability, it runs inline, it adds one specific thing to what the model knows how to do — that is a skill. Keep it under 1,000 tokens, write a precise description, and let the model route to it.
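A quick way to sanity-check the 1,000-token guideline while drafting a skill body is a rough size estimate. The sketch below assumes about 4 characters per token, which is a common rule of thumb and not a real tokenizer, so treat the result as approximate. Both function names are hypothetical.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; NOT a real tokenizer (assumes ~4 chars/token)."""
    return round(len(text) / chars_per_token)


def within_skill_budget(body: str, budget: int = 1000) -> bool:
    """True when the estimated size stays under the 1,000-token guideline."""
    return estimate_tokens(body) < budget
```

A 2,000-character body estimates to about 500 tokens and passes; a 4,000-character body estimates to about 1,000 and does not.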
If yes — the sub-task is large, will flood the main context with intermediate junk, needs to run in parallel with other agents, or demands an adversarial perspective on output it did not write — that is an agent. The clearest signal for an agent is the producer/judge separation: if you want something reviewed by a different persona than the one that wrote it, you need an agent. A skill cannot do that; it shares the context of the producer.
Four cases where agents win over skills every time:
- The sub-task involves reading 20+ files and the intermediate reasoning would fill half your context budget.
- You want parallel workstreams — five agents auditing five content pillars simultaneously rather than sequentially.
- You need an adversarial reviewer that has never touched the deliverable it is judging.
- The sub-task needs a different domain persona (e.g., a security auditor with a different risk tolerance than the feature developer).
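The parallel case in the list above can be sketched with `asyncio`. This is a toy model, not how Claude Code spawns subagents: `run_agent` is a hypothetical stand-in that sleeps instead of calling a model, and the result dictionary shape is invented for illustration. What it shows is the fan-out pattern itself: each task gets only its own input, and the caller collects structured results.

```python
import asyncio


async def run_agent(pillar: str) -> dict:
    """Stand-in for one isolated audit agent (hypothetical; no real API call)."""
    await asyncio.sleep(0.01)  # simulate the agent working in its own context
    return {"pillar": pillar, "findings": [f"placeholder finding for {pillar}"]}


async def fan_out(pillars: list[str]) -> list[dict]:
    # gather() runs the agents concurrently and returns results in input order.
    return await asyncio.gather(*(run_agent(p) for p in pillars))
```

Calling `asyncio.run(fan_out(["pricing", "onboarding", "security"]))` returns one result per pillar, in the same order as the input list.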
Four cases where skills win over agents:
- The capability is a single, well-scoped procedure that the model should apply inline.
- You want zero overhead — no delegation, no context switch, no structured handoff.
- The task is triggered by relevance, not by explicit invocation (the model should just "know" to do it).
- The output feeds directly back into the main conversation, not into a separate artifact.
How do slash commands relate to the other two?
Slash commands compose the other two. The standard ClaudeKit pattern:
/seo quick-wins
1. Load seo-data skill (auto-triggered by context)
2. Pull positions 8-20 from Search Console
3. Pull low-CTR pages (impressions > 500, CTR < 2%)
4. Rank by estimated traffic delta
5. Run seo-check skill on top candidates
6. Emit ranked table with priority, effort, and next action
No agent needed here — it is a sequential skill-plus-tool workflow. Compare that to /seo audit, which is heavier and does fan out to specialist agents for technical, content, and AEO dimensions in parallel.
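Steps 2 through 4 of the /seo quick-wins outline are plain filtering and ranking, which a short sketch can illustrate. Everything here is assumed for illustration: the `Page` shape, the `quick_wins` function, the `target_ctr` default, and especially the traffic-delta formula, since the post does not say how SEOKit estimates it. The thresholds (positions 8-20, impressions > 500, CTR < 2%) are the ones stated in the outline.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    position: float   # average ranking position
    impressions: int
    ctr: float        # click-through rate as a fraction, e.g. 0.015 = 1.5%


def quick_wins(pages: list[Page], target_ctr: float = 0.05) -> list[tuple[str, float]]:
    """Return (url, estimated extra clicks) for candidates, best first."""
    candidates = [
        p for p in pages
        # Steps 2-3: positions 8-20, or low CTR (impressions > 500, CTR < 2%).
        if 8 <= p.position <= 20 or (p.impressions > 500 and p.ctr < 0.02)
    ]
    # Step 4: ASSUMED delta formula: clicks gained if CTR rose to target_ctr.
    scored = [(p.url, p.impressions * max(target_ctr - p.ctr, 0.0)) for p in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

In a real run the rows would come from Search Console rather than hand-built `Page` objects, and the ranking step would use whatever delta model the command actually implements.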
The right question is not "should I use an agent or a slash command" — they are not alternatives. A slash command decides what to run; an agent is one of the things it might run. Commands compose agents; agents use skills; skills do one thing.
How does context cost work across all three?
This is where most practitioners get tripped up, and it drives real token spend.
Skills add their body to your main context every time they are triggered. If you have 8 skills and all eight fire in a session, you have loaded roughly 3,200-7,200 extra tokens (8 × 400-900) into the main window. That is fine until you are deep in a long conversation, where it compounds. ClaudeKit publishes a token ledger for every kit. EngineerKit's ledger totals 20,413 tokens (25 commands + 4 skills + 4 agent system prompts); you pay for each piece only when it is triggered, not all at once.
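Using the typical 400-900 tokens per skill body quoted in the skills section, the arithmetic for a session where several skills fire can be checked in a couple of lines. The function name is hypothetical and the model is deliberately simple: it assumes each skill triggers exactly once.

```python
def skill_load_range(n_skills: int, low: int = 400, high: int = 900) -> tuple[int, int]:
    """Tokens added to the main context if n_skills each trigger once."""
    return n_skills * low, n_skills * high


print(skill_load_range(8))  # (3200, 7200)
```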
Agents run in their own context. The parent pays only for the task description and the structured result. If an audit agent reads 50 files and reasons for 8,000 tokens, your main window sees maybe 300 tokens of result. That is the cost leverage of agents for heavy sub-tasks.
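The leverage is easier to see as arithmetic. The sketch below compares what the main window holds when the audit is delegated versus run inline. The 8,000 reasoning tokens and 300 result tokens come from the example above; the 200-token task prompt and the 500 tokens per file are assumptions picked only to make the comparison concrete.

```python
def parent_cost(task_prompt_tokens: int, result_tokens: int) -> int:
    """Tokens the parent context pays when work is delegated to an agent."""
    return task_prompt_tokens + result_tokens


def inline_cost(file_tokens: int, reasoning_tokens: int, result_tokens: int) -> int:
    """Tokens the main context would hold if the same work ran inline."""
    return file_tokens + reasoning_tokens + result_tokens


# Assumed: 200-token task prompt, 50 files at ~500 tokens each.
delegated = parent_cost(task_prompt_tokens=200, result_tokens=300)
inline = inline_cost(file_tokens=50 * 500, reasoning_tokens=8_000, result_tokens=300)
print(delegated, inline)  # 500 33300
```

The agent still spends those tokens in its own context; the saving is in what the main window has to carry for the rest of the session.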
Slash commands orchestrate in the main context but can delegate messy sub-work to agents, keeping the main window clean. A well-built command is the difference between a 40,000-token session and a 12,000-token one. We measure this in the token cost post.
If you are fighting context exhaustion, the fix is usually "move heavy sub-tasks into agents," not "delete skills." Use ck tokens <kit> to recount your current footprint, and ck doctor to diagnose if something looks off.
What does this look like in practice across the five kits?
Here are the primitives in use across real ClaudeKit workflows:
EngineerKit (/eng) — 25 commands, 4 skills, 4 agents, 20,413 tokens
The daily eight: /eng catchup, /eng plan, /eng tdd, /eng debug, /eng verify, /eng review, /eng commit, /eng handoff. The flagship is /eng debug — root-cause-first, applies the fix, emits a verified diff. The 4 read-only agents are reviewers: they audit diffs and PRs against criteria they never wrote. Skills auto-load context (test patterns, commit formats) without you asking.
MarketingKit (/mkt) — 20 commands, 3 skills, 2 agents, 16,714 tokens
The mkt-voice skill is generated once by /mkt voice (which reads your real posts to build a voice reference), then auto-loads in every later session. mkt-humanize carries 14 specific AI tells and strips them; it triggers automatically when content is in context. /mkt repurpose takes one piece of content and emits 5 formats. The 2 agents run brand-voice audits on content they did not write.
VideoKit (/video) — 17 commands, 5 skills, 3 agents, 12,602 tokens
Flagship: /video clone — analyze a reference video's structure and style, recreate it in Remotion, verify the match. /video demo generates a product demo. /video caption adds captions. The 5 skills carry Remotion patterns, caption formatting, and hook structures. The 3 agents audit finished videos against brief criteria.
SEOKit (/seo) — 19 commands, 4 skills, 2 agents, 16,004 tokens
Flagship: /seo quick-wins (positions 8-20 + low-CTR table) and /seo citations (AI-citation measurement with confidence intervals across N runs). /seo pseo handles programmatic SEO at scale. Skills auto-load when content needs AEO optimization or technical audit context.
EcomKit (/ecom) — 20 commands, 3 skills, 2 agents, 16,464 tokens
Flagship: /ecom no-sales — triage a store against AOV-band benchmarks to find the actual conversion blocker. Also: /ecom cart-recovery, /ecom amazon, /ecom margin, /ecom ads, /ecom bfcm. Skills carry benchmark data and email sequence logic.
What changed from v1 to v2 architecture?
The v1 architecture used orchestrator agents and blocking reviewer/quality-gate agents that had to score a deliverable above a threshold before the workflow could proceed. We shelved that pattern entirely. The problems:
- Blocking reviewers added latency and token cost without reliable quality signal
- Orchestrator agents added a layer of delegation that was mostly indirection
- Commands ended with a reviewer's verdict, which felt like process theater more than evidence
The v2 rule: commands end with evidence, not verdicts. A finished /eng debug run produces a root-cause analysis and an applied diff you can read. A finished /seo quick-wins run produces a ranked table with CTR and impression data. You judge the output. The agents that exist in v2 are reviewers in the sense of "read-only auditors that surface findings" — they never block, they report. You decide whether to act.
This also means FounderKit and SalesKit did not make the cut for v2. Both leaned heavily on the orchestrator pattern we moved away from. The five shipped kits (/eng /mkt /video /seo /ecom) cover 101 commands and 82,197 measured tokens with the cleaner architecture.
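The headline totals can be cross-checked against the per-kit figures listed in the kit breakdown. The snippet below only sums numbers already stated in this post; the `KITS` dictionary is just a convenient way to hold them.

```python
KITS = {
    # kit: (commands, skills, agents, tokens), figures as stated in this post
    "EngineerKit": (25, 4, 4, 20_413),
    "MarketingKit": (20, 3, 2, 16_714),
    "VideoKit": (17, 5, 3, 12_602),
    "SEOKit": (19, 4, 2, 16_004),
    "EcomKit": (20, 3, 2, 16_464),
}

# Transpose the rows into columns and total each column.
commands, skills, agents, tokens = (sum(col) for col in zip(*KITS.values()))
print(commands, skills, agents, tokens)  # 101 19 13 82197
```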
FAQ
What is the difference between a Claude Code agent and a skill?
An agent runs in its own isolated context window with its own system prompt; the parent sees only the output, not the reasoning. A skill loads into the main conversation's context and runs inline. Use an agent when the sub-task is large, needs to run in parallel with other work, or needs an adversarial review of output the agent did not write. Use a skill for a single reusable capability the model should apply automatically.
Can a slash command call both skills and agents?
Yes, and the best ones do. A command orchestrates the full workflow: it triggers skills (which load inline) and delegates to agents (which run in isolation) at the appropriate points. The command itself runs in the main context but keeps it clean by pushing heavy sub-work into agents. Skills and agents are not alternatives to commands — they are what commands call.
Do skills and agents share context with each other?
Skills share the main conversation's context — all skills load into the same window. Agents do not share context with each other or with the main conversation except through explicit handoffs (task in, result out). Two agents running in parallel have no visibility into each other's reasoning, which is exactly what makes parallel fan-out reliable.
Why are ClaudeKit's agents read-only in v2?
The v1 architecture used agents to produce and review deliverables in sequence. We found blocking reviewer gates added latency and token cost without reliable quality improvement. The v2 principle is that commands end with evidence — a report, a diff, a verified file — not an agent's verdict. Read-only agents that audit output they never wrote still provide useful adversarial signal, but they report findings rather than blocking progress.
How do I know how many tokens my skills and agents are consuming?
Run ck tokens <kit> to recount the current footprint of any installed kit. Each kit publishes a token ledger: EngineerKit is 20,413 tokens total, MarketingKit 16,714, SEOKit 16,004, EcomKit 16,464, VideoKit 12,602. These count commands, skills, and agent system prompts combined. Skills only load when triggered — you do not pay all 20,413 tokens in a session where only two skills fire. See the token cost measurement post for per-trigger breakdowns.
What happened to FounderKit and SalesKit?
Both are shelved. They were v1 kits built around the orchestrator/reviewer-gate architecture ClaudeKit v2 moved away from. The five shipped kits (EngineerKit, MarketingKit, VideoKit, SEOKit, EcomKit) replace them with the cleaner evidence-based pattern. If you were using FounderKit or SalesKit, MarketingKit's /mkt launch and /mkt calendar commands cover the most-used workflows.
If you are building with Claude Code and want all three primitives wired up correctly from day one, the fastest path is installing one of the five kits. EngineerKit covers the full dev workflow with 25 commands and read-only review agents. MarketingKit gives you voice-matched content generation with auto-loading skills. SEOKit, VideoKit, and EcomKit cover their respective domains. Install takes two commands: ck auth <key> then ck install <kit>. The token ledger prints on install so you know exactly what you are paying for before the first run.

