metaclaw-evolving-agent

Deploy and configure MetaClaw — an agent that meta-learns and evolves from live conversations using skills injection, RL training, and smart scheduling.

Skill file

Preview skill file↓↑

---
name: metaclaw-evolving-agent
description: Deploy and configure MetaClaw — an agent that meta-learns and evolves from live conversations using skills injection, RL training, and smart scheduling.
triggers:
  - set up metaclaw agent
  - configure evolving agent
  - metaclaw skills mode
  - metaclaw rl training
  - metaclaw madmax scheduler
  - agent meta-learning setup
  - tinker rl backend configuration
  - metaclaw proxy deployment
---

# MetaClaw Evolving Agent

> Skill by [ara.so](https://ara.so) — Daily 2026 Skills collection

MetaClaw is an OpenAI-compatible proxy agent that intercepts conversations, injects learned skills, and continuously improves itself through real-world interactions. It supports three modes: lightweight skills injection, immediate RL training, and a smart "madmax" scheduler that defers weight updates to idle/sleep windows.

---

## Installation

```bash
# Minimal — skills injection only, no GPU required
pip install -e .

# Full RL training support (torch, transformers, tinker)
pip install -e ".[rl]"

# Skill evolution via LLM summarization
pip install -e ".[evolve]"

# Google Calendar scheduler for madmax mode
pip install -e ".[scheduler]"

# Recommended: everything
pip install -e ".[rl,evolve,scheduler]"
```

---

## Quick Start

```bash
# One-time interactive config wizard
metaclaw setup

# Start in default madmax mode (skills + RL + smart scheduler)
metaclaw start

# Skills only — no GPU, no Tinker needed
metaclaw start --mode skills_only

# RL mode — trains immediately when batch is full
metaclaw start --mode rl

# RL without scheduler (same as above, explicit)
metaclaw start --mode rl
```

After `metaclaw start`, a local OpenAI-compatible proxy is running. Point your client (OpenClaw or any OpenAI SDK consumer) at `http://localhost:<port>` instead of the upstream LLM endpoint.

---

## Configuration

`metaclaw setup` writes a config file (default: `~/.metaclaw/config.yaml`). You can also edit it directly:

```yaml
# ~/.metaclaw/config.yaml

proxy:
  host: 0.0.0.0
  port: 8080

llm:
  provider: kimi          # kimi | qwen | claude | minimax | openai | gemini
  base_url: https://api.moonshot.cn/v1
  model: moonshot-v1-8k
  # api_key loaded from env: METACLAW_LLM_API_KEY

skills:
  enabled: true
  max_injected: 5         # max skills injected per turn
  summarize_after_session: true

rl:
  enabled: true
  backend: auto           # auto | tinker | mint
  batch_size: 32
  algorithm: grpo
  opd_teacher: false      # optional teacher distillation

scheduler:                # madmax mode only
  enabled: true
  sleep_hours: [22, 7]    # local 22:00–07:00
  idle_timeout_minutes: 15
  google_calendar: false  # set true + configure OAuth for meeting detection

logging:
  level: info
  log_dir: ~/.metaclaw/logs
```

### Environment Variables

```bash
export METACLAW_LLM_API_KEY="your-llm-api-key"
export METACLAW_TINKER_API_KEY="your-tinker-api-key"   # rl mode
export METACLAW_MINT_API_KEY="your-mint-api-key"        # if backend=mint
export GOOGLE_CALENDAR_CREDENTIALS_PATH="path/to/creds.json"  # scheduler
```

---

## Operating Modes

| Mode | Command | GPU Required | Description |
|------|---------|--------------|-------------|
| `skills_only` | `metaclaw start --mode skills_only` | No | Proxy + skills injection + auto-summarization |
| `rl` | `metaclaw start --mode rl` | Via API | Skills + GRPO training when batch fills |
| `madmax` | `metaclaw start` | Via API | Skills + RL + scheduler (trains only during idle/sleep/meetings) |

---

## Python API

### Programmatic startup

```python
import asyncio
from metaclaw import MetaClawAgent, AgentConfig, Mode

async def main():
    config = AgentConfig.from_yaml("~/.metaclaw/config.yaml")
    agent = MetaClawAgent(config, mode=Mode.MADMAX)
    await agent.start()

asyncio.run(main())
```

### Manual skill injection

```python
from metaclaw.skills import SkillStore, SkillInjector

store = SkillStore(path="~/.metaclaw/skills")

# Add a skill manually
store.add(
    name="code-review-checklist",
    content="Always check for: 1) error handling, 2) type hints, 3) docstrings.",
    tags=["code", "review"]
)

# Retrieve top-k relevant skills for a query
injector = SkillInjector(store)
relevant = injector.retrieve(query="review my Python function", top_k=3)
for skill in relevant:
    print(skill.name, skill.score)
```

### Intercepting and recording conversations

```python
from metaclaw.proxy import ConversationInterceptor
from metaclaw.memory import ExperienceBuffer

buffer = ExperienceBuffer(max_size=1000)

interceptor = ConversationInterceptor(
    upstream_url="https://api.moonshot.cn/v1",
    on_complete=buffer.record   # called after each turn with (messages, response)
)

# buffer.record signature:
async def on_complete(messages: list[dict], response: dict) -> None:
    ...
```

### Triggering RL training manually

```python
from metaclaw.training import RLTrainer, TrainingConfig

trainer = RLTrainer(
    config=TrainingConfig(
        backend="tinker",       # or "mint"
        algorithm="grpo",
        batch_size=32,
        lora_rank=16,
    )
)

# Collect a batch from the experience buffer and train
async def run_training(buffer):
    batch = buffer.sample(n=32, split="support")   # support/query separation
    result = await trainer.train(batch)
    print(f"Training complete. Loss: {result.loss:.4f}, Steps: {result.steps}")
```

### Reward modeling

```python
from metaclaw.rewards import RewardModel

reward_model = RewardModel(provider="llm")  # uses configured LLM for scoring

async def score_turn(prompt: str, response: str) -> float:
    score = await reward_model.score(prompt=prompt, response=response)
    return score  # float in [-1.0, 1.0]
```

---

## Skills Lifecycle

```
Conversation turn
       │
       ▼
 SkillInjector.retrieve()   ← vector search over SkillStore
       │  injects top-k skills into system prompt
       ▼
 LLM responds
       │
       ▼
 ExperienceBuffer.record()  ← stores (context, response, metadata)
       │
       ▼ (end of session)
 SkillSummarizer.run()      ← LLM extracts reusable patterns
       │
       ▼
 SkillStore.upsert()        ← new/updated skills persisted to disk
```

---

## Integration: OpenAI SDK as Client

Point any OpenAI SDK client at the MetaClaw proxy:

```python
from openai import OpenAI

# MetaClaw proxy is running on localhost:8080
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="not-used-but-required-by-sdk"
)

response = client.chat.completions.create(
    model="moonshot-v1-8k",   # passed through to upstream
    messages=[
        {"role": "user", "content": "Review my pull request strategy."}
    ]
)
print(response.choices[0].message.content)
```

Skills are injected transparently — the client code does not change.

---

## Scheduler (MadMax Mode)

The scheduler ensures RL weight updates never interrupt active use:

```python
from metaclaw.scheduler import MadMaxScheduler, SchedulerConfig

scheduler = MadMaxScheduler(
    config=SchedulerConfig(
        sleep_hours=(22, 7),          # train between 22:00–07:00 local time
        idle_timeout_minutes=15,      # train after 15 min of no conversations
        google_calendar=True,         # also train during calendar meetings
        credentials_path="creds.json"
    )
)

# Check if it's safe to train right now
if await scheduler.is_training_window():
    await trainer.train(batch)
```

### Google Calendar Setup

```bash
# 1. Enable Google Calendar API in Google Cloud Console
# 2. Download OAuth2 credentials as creds.json
# 3. Set path in config or env
export GOOGLE_CALENDAR_CREDENTIALS_PATH="/path/to/creds.json"

# 4. First run will open browser for OAuth consent
metaclaw start
```

---

## Support/Query Set Separation

MetaClaw separates experience into support and query sets to prevent stale rewards from polluting updates:

```python
from metaclaw.memory import ExperienceBuffer

buffer = ExperienceBuffer(
    max_size=2000,
    support_ratio=0.5   # 50% support, 50% query
)

# During training:
support_batch = buffer.sample(n=16, split="support")  # used to compute reward signal
query_batch   = buffer.sample(n=16, split="query")    # used for gradient update

await trainer.train_meta(support=support_batch, query=query_batch)
```

---

## RL Backends

### Tinker (default)

```yaml
rl:
  backend: tinker
  tinker_project: my-metaclaw-project
  lora_rank: 16
  learning_rate: 1e-4
```

### MinT

```bash
# Install MinT compatibility layer separately
pip install metaclaw-mint
```

```yaml
rl:
  backend: mint
  mint_endpoint: https://your-mint-endpoint
```

### Auto-detection

```yaml
rl:
  backend: auto   # tries tinker first, falls back to mint, errors if neither available
```

---

## Troubleshooting

**Proxy not reachable after `metaclaw start`**
- Check port conflicts: `lsof -i :8080`
- Change `proxy.port` in config and restart

**`rl` mode: "No training backend available"**
- Ensure `pip install -e ".[rl]"` completed successfully
- Verify `METACLAW_TINKER_API_KEY` or `METACLAW_MINT_API_KEY` is set
- Try `rl.backend: tinker` explicitly instead of `auto`

**Skills not persisting between sessions**
- Confirm `skills.summarize_after_session: true` in config
- Check write permissions on `~/.metaclaw/skills/`
- Run `metaclaw skills list` to inspect stored skills

**Madmax mode never trains**
- Verify `scheduler.sleep_hours` covers your timezone's night
- Lower `scheduler.idle_timeout_minutes` for testing (e.g., `1`)
- Check scheduler logs: `~/.metaclaw/logs/scheduler.log`

**Google Calendar integration fails**
- Re-run OAuth flow: delete `~/.metaclaw/token.json` and restart
- Ensure Calendar API is enabled in your Google Cloud project

**OPD teacher distillation errors**
- Only supported with `rl.backend: tinker`
- Requires a separate teacher model endpoint in config:
  ```yaml
  rl:
    opd_teacher: true
    teacher_base_url: https://api.openai.com/v1
    teacher_model: gpt-4o
  ```

---

## CLI Reference

```bash
metaclaw setup                   # interactive config wizard
metaclaw start                   # start in madmax mode
metaclaw start --mode skills_only
metaclaw start --mode rl
metaclaw start --config path/to/config.yaml

metaclaw skills list             # show all stored skills
metaclaw skills delete <name>    # remove a skill
metaclaw skills export skills.json

metaclaw status                  # show proxy, scheduler, training status
metaclaw logs                    # tail all logs
metaclaw logs --component scheduler
```

Source

Creator's repository · aradotso/trending-skills

View on GitHub ↗

Security

Security checks in progress

Results will appear here once audits complete

What this skill can do

Reads your filesConnects to the internetRuns code on your machine

Checked by 3 independent security firms

Does it try to trick the AI?Not yet checkedPending · Gen Agent Trust Hub

Does it sneak in hidden code?Not yet checkedPending · Socket

Does it have known bugs?Not yet checkedPending · Snyk