llm-sentry-geo-marketing

GEO (Generative Engine Optimization) monitoring platform for tracking brand mentions in AI search engines like DeepSeek and Doubao
Skill file

Preview skill file↓↑
---
name: llm-sentry-geo-marketing
description: GEO (Generative Engine Optimization) monitoring platform for tracking brand mentions in AI search engines like DeepSeek and Doubao
triggers:
  - monitor my brand in AI search results
  - track GEO performance across AI engines
  - analyze citations in generative search
  - set up AI search monitoring with LLM Sentry
  - measure brand share of voice in AI responses
  - automate GEO tracking and analysis
  - extract reference links from AI answers
  - compare brand visibility in DeepSeek and Doubao
---

# LLM Sentry GEO Marketing Skill

> Skill by [ara.so](https://ara.so) — Marketing Skills collection.

## What This Project Does

LLM Sentry is an automated monitoring and analysis platform for **GEO (Generative Engine Optimization)**. It tracks how AI search engines (DeepSeek, Doubao, Bocha) mention brands and analyze citation sources in their responses. The platform uses browser automation (Playwright) to simulate real user queries, extract AI responses, parse reference links, and calculate Share of Voice (SoV) metrics.

**Core capabilities:**
- Real-time brand exposure monitoring in AI search results
- Citation source extraction and domain analysis
- Multi-round query execution for stability testing
- Desktop client (Wails + Go + React) with local SQLite storage
- RESTful API for task automation
- PostgreSQL-based data persistence

## Project Structure

This is a monorepo with three main components:

```
GEO/
├── geo_db/                  # PostgreSQL database service
├── llm_sentry_monitor/      # Python monitoring service (Playwright-based)
└── geo_client2/             # Desktop client (Wails + Go + React)
```

## Installation

### Prerequisites

- Python 3.11+ with `uv` package manager
- Docker and Docker Compose
- PostgreSQL 15+ (via Docker)
- For desktop client: Go 1.21+, Node.js 18+, Wails v2

### Quick Setup with Makefile

```bash
# Clone the repository
git clone https://github.com/daijinma/geo_marketing.git
cd geo_marketing

# Initialize environment and install dependencies
make setup

# Start PostgreSQL database
make db

# Run monitoring service
make run
```

### Manual Setup

```bash
# Create virtual environment with uv
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install Python dependencies
cd llm_sentry_monitor
uv pip install -e ".[dev]"

# Install Playwright browsers
playwright install chromium

# Start database
cd ../geo_db
docker-compose up -d
```

### Desktop Client Setup

```bash
cd geo_client2

# Install frontend dependencies
cd frontend
pnpm install

# Install Go dependencies
cd ..
go mod tidy

# Run in development mode
wails dev

# Build for production
wails build
```

## Database Schema

The system uses two main tables:

**Records table** (monitoring results):
```sql
CREATE TABLE records (
    id SERIAL PRIMARY KEY,
    query VARCHAR(500) NOT NULL,
    provider VARCHAR(50) NOT NULL,
    response_text TEXT,
    raw_citations JSONB,
    created_at TIMESTAMP DEFAULT NOW()
);
```

**Citations table** (extracted references):
```sql
CREATE TABLE citations (
    id SERIAL PRIMARY KEY,
    record_id INTEGER REFERENCES records(id),
    url TEXT NOT NULL,
    domain VARCHAR(255),
    title VARCHAR(500),
    sequence INTEGER,
    created_at TIMESTAMP DEFAULT NOW()
);
```

## API Usage

### Start API Server

```python
# llm_sentry_monitor/main.py
from fastapi import FastAPI
from api.routes import router

app = FastAPI(title="LLM Sentry API")
app.include_router(router)

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
```

Start server:
```bash
cd llm_sentry_monitor
python main.py
```

### Create Monitoring Task

**POST** `/mock`

```bash
curl -X POST http://localhost:8000/mock \
  -H "Content-Type: application/json" \
  -d '{
    "keywords": ["AI search engines", "generative search optimization"],
    "platforms": ["deepseek", "doubao"],
    "query_count": 3,
    "settings": {
      "headless": false,
      "timeout": 60000,
      "delay_between_tasks": 5
    }
  }'
```

Python example:

```python
import requests

payload = {
    "keywords": ["AI搜索引擎对比", "DeepSeek vs ChatGPT"],
    "platforms": ["deepseek", "doubao"],
    "query_count": 2,
    "settings": {
        "headless": True,
        "timeout": 60000,
        "delay_between_tasks": 3
    }
}

response = requests.post("http://localhost:8000/mock", json=payload)
results = response.json()

for result in results:
    print(f"Query: {result['query']}")
    print(f"Platform: {result['provider']}")
    print(f"Citations: {len(result['citations'])}")
```

## Provider Implementation Pattern

Each AI platform has a dedicated provider class:

```python
# llm_sentry_monitor/providers/base.py
from abc import ABC, abstractmethod
from playwright.async_api import Page

class BaseProvider(ABC):
    def __init__(self, page: Page):
        self.page = page
    
    @abstractmethod
    async def search(self, query: str) -> dict:
        """Execute search and return response + citations"""
        pass
    
    @abstractmethod
    async def login(self) -> bool:
        """Handle authentication if needed"""
        pass
```

Example DeepSeek provider:

```python
# llm_sentry_monitor/providers/deepseek.py
from .base import BaseProvider
import asyncio

class DeepSeekProvider(BaseProvider):
    BASE_URL = "https://chat.deepseek.com"
    
    async def search(self, query: str) -> dict:
        await self.page.goto(self.BASE_URL)
        
        # Enable search mode
        search_toggle = await self.page.wait_for_selector('[data-testid="search-toggle"]')
        await search_toggle.click()
        
        # Enter query
        input_box = await self.page.wait_for_selector('textarea[placeholder*="Ask"]')
        await input_box.fill(query)
        await input_box.press("Enter")
        
        # Wait for response
        await asyncio.sleep(5)
        response_text = await self.page.text_content('.response-content')
        
        # Extract citations
        citations = await self.page.locator('.citation-link').all()
        citation_data = []
        for idx, citation in enumerate(citations, 1):
            url = await citation.get_attribute('href')
            title = await citation.text_content()
            citation_data.append({
                "sequence": idx,
                "url": url,
                "title": title
            })
        
        return {
            "query": query,
            "provider": "deepseek",
            "response_text": response_text,
            "citations": citation_data
        }
```

## Core Domain Extraction Logic

```python
# llm_sentry_monitor/core/parser.py
import tldextract
from urllib.parse import urlparse

def extract_domain(url: str) -> str:
    """Extract clean domain from URL"""
    parsed = tldextract.extract(url)
    domain = f"{parsed.domain}.{parsed.suffix}"
    return domain

def calculate_sov(citations: list[dict]) -> dict:
    """Calculate Share of Voice by domain"""
    domain_counts = {}
    total = len(citations)
    
    for citation in citations:
        domain = extract_domain(citation['url'])
        domain_counts[domain] = domain_counts.get(domain, 0) + 1
    
    sov = {
        domain: {
            "count": count,
            "percentage": round(count / total * 100, 2)
        }
        for domain, count in domain_counts.items()
    }
    
    return sov
```

## Database Operations

```python
# llm_sentry_monitor/core/database.py
import psycopg2
import os
import json

class Database:
    def __init__(self):
        self.conn = psycopg2.connect(
            host=os.getenv("DB_HOST", "localhost"),
            port=os.getenv("DB_PORT", "5432"),
            user=os.getenv("DB_USER", "geo_user"),
            password=os.getenv("DB_PASSWORD"),
            database=os.getenv("DB_NAME", "geo_db")
        )
    
    def save_record(self, query: str, provider: str, response_text: str, citations: list) -> int:
        """Save monitoring record and return record_id"""
        with self.conn.cursor() as cur:
            cur.execute(
                """
                INSERT INTO records (query, provider, response_text, raw_citations)
                VALUES (%s, %s, %s, %s)
                RETURNING id
                """,
                (query, provider, response_text, json.dumps(citations))
            )
            record_id = cur.fetchone()[0]
            self.conn.commit()
            return record_id
    
    def save_citations(self, record_id: int, citations: list):
        """Save individual citations linked to record"""
        with self.conn.cursor() as cur:
            for citation in citations:
                domain = extract_domain(citation['url'])
                cur.execute(
                    """
                    INSERT INTO citations (record_id, url, domain, title, sequence)
                    VALUES (%s, %s, %s, %s, %s)
                    """,
                    (record_id, citation['url'], domain, citation.get('title'), citation['sequence'])
                )
            self.conn.commit()
```

## Running Monitoring Tasks

### Single Query Example

```python
from playwright.async_api import async_playwright
from providers.deepseek import DeepSeekProvider
from core.database import Database

async def run_monitoring():
    db = Database()
    
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=False)
        page = await browser.new_page()
        
        provider = DeepSeekProvider(page)
        
        result = await provider.search("最好的AI搜索引擎推荐")
        
        # Save to database
        record_id = db.save_record(
            query=result['query'],
            provider=result['provider'],
            response_text=result['response_text'],
            citations=result['citations']
        )
        db.save_citations(record_id, result['citations'])
        
        await browser.close()
        print(f"Saved record #{record_id} with {len(result['citations'])} citations")

# Run
import asyncio
asyncio.run(run_monitoring())
```

### Multi-Round Query Example

```python
async def run_multi_round_monitoring(keywords: list, platforms: list, rounds: int):
    db = Database()
    
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        
        for keyword in keywords:
            for platform in platforms:
                page = await browser.new_page()
                
                # Get provider class
                if platform == "deepseek":
                    provider = DeepSeekProvider(page)
                elif platform == "doubao":
                    provider = DoubaoProvider(page)
                
                for round_num in range(1, rounds + 1):
                    print(f"Round {round_num}/{rounds}: {keyword} on {platform}")
                    
                    result = await provider.search(keyword)
                    record_id = db.save_record(
                        query=result['query'],
                        provider=result['provider'],
                        response_text=result['response_text'],
                        citations=result['citations']
                    )
                    db.save_citations(record_id, result['citations'])
                    
                    # Delay between rounds
                    await asyncio.sleep(5)
                
                await page.close()
        
        await browser.close()
```

## Desktop Client (Wails) Backend

```go
// geo_client2/backend/app.go
package backend

import (
    "context"
    "database/sql"
    _ "github.com/mattn/go-sqlite3"
)

type App struct {
    ctx context.Context
    db  *sql.DB
}

func NewApp() *App {
    return &App{}
}

func (a *App) startup(ctx context.Context) {
    a.ctx = ctx
    
    // Initialize SQLite
    db, err := sql.Open("sqlite3", "./geo_sentry.db")
    if err != nil {
        panic(err)
    }
    a.db = db
    
    // Create tables
    a.initDatabase()
}

func (a *App) CreateTask(keywords []string, platforms []string, queryCount int) (string, error) {
    // Insert task into database
    tx, _ := a.db.Begin()
    result, err := tx.Exec(`
        INSERT INTO tasks (keywords, platforms, query_count, status)
        VALUES (?, ?, ?, 'pending')
    `, strings.Join(keywords, ","), strings.Join(platforms, ","), queryCount)
    
    if err != nil {
        tx.Rollback()
        return "", err
    }
    
    tx.Commit()
    taskID, _ := result.LastInsertId()
    
    // Start monitoring in background
    go a.executeTask(taskID)
    
    return fmt.Sprintf("Task %d created", taskID), nil
}
```

## Configuration

### Environment Variables

Create `.env` file in `llm_sentry_monitor/`:

```bash
# Database
DB_HOST=localhost
DB_PORT=5432
DB_USER=geo_user
DB_PASSWORD=your_secure_password
DB_NAME=geo_db

# Playwright
HEADLESS_MODE=false
BROWSER_TIMEOUT=60000

# API
API_PORT=8000
DELAY_BETWEEN_TASKS=5
```

### Settings File

```python
# llm_sentry_monitor/config/settings.py
from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    db_host: str = "localhost"
    db_port: int = 5432
    db_user: str = "geo_user"
    db_password: str
    db_name: str = "geo_db"
    
    headless_mode: bool = False
    browser_timeout: int = 60000
    delay_between_tasks: int = 5
    
    class Config:
        env_file = ".env"

settings = Settings()
```

## Common Patterns

### Pattern 1: Batch Keyword Monitoring

```python
async def monitor_keyword_matrix(brands: list, question_templates: list):
    """Monitor multiple brands across question patterns"""
    keywords = []
    
    for brand in brands:
        for template in question_templates:
            keywords.append(template.format(brand=brand))
    
    await run_multi_round_monitoring(
        keywords=keywords,
        platforms=["deepseek", "doubao"],
        rounds=3
    )
```

### Pattern 2: Citation Analysis Pipeline

```python
def analyze_citations_by_brand(brand: str, days: int = 7):
    """Analyze citation sources for a brand over time"""
    db = Database()
    
    with db.conn.cursor() as cur:
        cur.execute("""
            SELECT c.domain, COUNT(*) as mentions
            FROM citations c
            JOIN records r ON c.record_id = r.id
            WHERE r.query LIKE %s
            AND r.created_at >= NOW() - INTERVAL '%s days'
            GROUP BY c.domain
            ORDER BY mentions DESC
        """, (f"%{brand}%", days))
        
        results = cur.fetchall()
        
    return [{"domain": row[0], "mentions": row[1]} for row in results]
```

### Pattern 3: Share of Voice Calculation

```python
def calculate_brand_sov(brands: list[str], query: str):
    """Calculate Share of Voice for multiple brands"""
    db = Database()
    
    with db.conn.cursor() as cur:
        cur.execute("""
            SELECT r.response_text
            FROM records r
            WHERE r.query = %s
            ORDER BY r.created_at DESC
            LIMIT 1
        """, (query,))
        
        response = cur.fetchone()[0]
    
    sov = {}
    for brand in brands:
        mentions = response.lower().count(brand.lower())
        sov[brand] = mentions
    
    total = sum(sov.values())
    return {
        brand: {
            "mentions": count,
            "percentage": round(count / total * 100, 2) if total > 0 else 0
        }
        for brand, count in sov.items()
    }
```

## Troubleshooting

### Issue: Playwright Browser Not Found

```bash
# Install browsers explicitly
playwright install chromium

# Or install all browsers
playwright install
```

### Issue: Database Connection Failed

```bash
# Check database is running
docker ps | grep postgres

# Restart database
cd geo_db
docker-compose restart

# Check connection
psql -h localhost -U geo_user -d geo_db
```

### Issue: Login Required During Headless Mode

First run in non-headless mode to complete login:

```python
# In settings or API call
"settings": {
    "headless": false
}
```

After successful login, browser will save cookies. Then enable headless mode.

### Issue: Citations Not Extracted

Check selector patterns in provider code:

```python
# Debug citation extraction
citations = await page.locator('.citation-link').all()
print(f"Found {len(citations)} citations")

# Try alternative selectors
citations = await page.locator('a[data-citation]').all()
```

### Issue: Rate Limiting or Blocking

Add delays between requests:

```python
# In settings
"settings": {
    "delay_between_tasks": 10  # Increase to 10 seconds
}

# Or add random delays
import random
await asyncio.sleep(random.uniform(5, 15))
```

## Makefile Commands

```bash
make setup    # Initialize environment with uv
make db       # Start PostgreSQL database
make run      # Run monitoring service
make test     # Run tests
make clean    # Stop database and clean cache
make help     # Show all available commands
```

## Key Metrics to Track

1. **Brand Mention Rate**: Percentage of queries where brand appears
2. **Citation Count**: Number of reference links per query
3. **Domain Distribution**: Top domains cited by AI
4. **Position Analysis**: Average position of brand mentions
5. **Share of Voice**: Brand percentage vs. competitors
6. **Stability Score**: Consistency across multi-round queries
Source

Creator's repository · aradotso/marketing-skills
View on GitHub ↗
Security

Security checks in progress
Results will appear here once audits complete
What this skill can do
Reads your filesConnects to the internetRuns code on your machine
Checked by 3 independent security firms
Does it try to trick the AI?Not yet checkedPending · Gen Agent Trust Hub
Does it sneak in hidden code?Not yet checkedPending · Socket
Does it have known bugs?Not yet checkedPending · Snyk