Clean up messy web pages into readable text

Strips nav, ads, footers, and cruft from any URL and returns just the article or doc as clean markdown — faster to parse and cheaper to analyze than raw HTML.

Best for: Reading docs, articles, and blog posts without paying for the junk.

Engineering / pipelines-dataatomicfor-engineerslight-setupfrom-url

Topics

claudeclawdbotclicodexdefuddleobsidianopenclawopencodeskills

Source

Creator's repository · kepano/obsidian-skills

View on GitHub

License: MIT

Skill file

Preview skill file
---
name: defuddle
description: Extract clean markdown content from web pages using Defuddle CLI, removing clutter and navigation to save tokens. Use instead of WebFetch when the user provides a URL to read or analyze, for online documentation, articles, blog posts, or any standard web page. Do NOT use for URLs ending in .md — those are already markdown, use WebFetch directly.
---

# Defuddle

Use Defuddle CLI to extract clean readable content from web pages. Prefer over WebFetch for standard web pages — it removes navigation, ads, and clutter, reducing token usage.

If not installed: `npm install -g defuddle`

## Usage

Always use `--md` for markdown output:

```bash
defuddle parse <url> --md
```

Save to file:

```bash
defuddle parse <url> --md -o content.md
```

Extract specific metadata:

```bash
defuddle parse <url> -p title
defuddle parse <url> -p description
defuddle parse <url> -p domain
```

## Output formats

| Flag | Format |
|------|--------|
| `--md` | Markdown (default choice) |
| `--json` | JSON with both HTML and markdown |
| (none) | HTML |
| `-p <name>` | Specific metadata property |