---
name: web-research
description: |
  Neural web search and content extraction using x402-protected APIs. Better than WebSearch for deep research and WebFetch for blocked sites.

  USE FOR:
  - Deep web research and investigation
  - Finding similar pages to a reference URL
  - Extracting clean text from web pages
  - Scraping sites that block standard fetchers
  - Getting direct answers to factual questions
  - Research requiring multiple sources
  - Crawling multiple pages from a website

  TRIGGERS:
  - "research", "investigate", "deep dive", "find sources"
  - "similar to", "pages like", "more like this"
  - "scrape", "extract content from", "get the text from"
  - "blocked site", "can't access", "paywall"
  - "what is", "explain", "answer this"
  - "crawl", "crawl site", "scrape entire site"

  Use `npx agentcash@latest fetch` for stableenrich.dev endpoints. Prefer Exa for semantic/neural search, Firecrawl for direct scraping.
metadata:
  version: 2
---

# Web Research with x402 APIs

Access Exa (neural search) and Firecrawl (web scraping) through x402-protected endpoints.

## Setup

See [rules/getting-started.md](rules/getting-started.md) for installation and wallet setup.

## Quick Reference

| Task | Endpoint | Price | Best For |
|------|----------|-------|----------|
| Neural search | `https://stableenrich.dev/api/exa/search` | $0.01 | Semantic web search |
| Find similar | `https://stableenrich.dev/api/exa/find-similar` | $0.01 | Pages similar to a URL |
| Extract text | `https://stableenrich.dev/api/exa/contents` | $0.002 | Clean text from URLs |
| Direct answers | `https://stableenrich.dev/api/exa/answer` | $0.01 | Factual Q&A |
| Scrape page | `https://stableenrich.dev/api/firecrawl/scrape` | $0.0126 | Single page to markdown |
| Web search | `https://stableenrich.dev/api/firecrawl/search` | $0.0252 | Search with scraping |
| Crawl website | `https://stableenrich.dev/api/cloudflare/crawl` | $0.10 | Multi-page site crawl |
| Poll crawl | `GET https://stableenrich.dev/api/cloudflare/jobs?token=...` | Free | Poll crawl results |

## When to Use What

| Scenario | Tool |
|----------|------|
| General web search | WebSearch (free) or Exa ($0.01) |
| Semantic/conceptual search | Exa search |
| Find pages like X | Exa find-similar |
| Get clean text from URL | Exa contents |
| Scrape blocked/JS-heavy site | Firecrawl scrape |
| Search + scrape results | Firecrawl search |
| Quick fact lookup | Exa answer |
| Crawl entire site/section | Cloudflare crawl |

See [rules/when-to-use.md](rules/when-to-use.md) for detailed guidance.

## Exa Neural Search

Semantic search that understands meaning, not just keywords:

```bash
npx agentcash@latest fetch https://stableenrich.dev/api/exa/search -m POST -b '{
  "query": "startups building AI agents for customer support",
  "numResults": 10,
  "type": "neural"
}'
```

**Options:**
- `query` - Search query (required)
- `numResults` - Number of results (default: 10, max: 25)
- `type` - "neural" (semantic) or "keyword" (traditional)
- `includeDomains` - Only search these domains
- `excludeDomains` - Skip these domains
- `startPublishedDate` / `endPublishedDate` - Date range filter
- `category` - Filter by content type: "company", "research paper", "news", "pdf", "github", "tweet", "personal site", "linkedin profile", "financial report"
  - Tip: use `category: "linkedin profile"` for people/profile discovery

**Returns**: List of URLs with titles, snippets, and relevance scores.
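The filter options above can be combined in one request. The following is a minimal sketch, not a confirmed request shape: it assumes dates are ISO 8601 strings and that `includeDomains` takes a JSON array, since the options list names the fields but not their formats. The helper name `build_search_body` is hypothetical.

```bash
# Print a JSON body for a filtered Exa search.
# ASSUMPTIONS: dates are ISO 8601 strings and includeDomains is a JSON array;
# the options list names these fields but does not specify their formats.
# Arguments must not contain double quotes or backslashes (no JSON escaping here).
build_search_body() {
  cat <<EOF
{
  "query": "$1",
  "numResults": 5,
  "type": "neural",
  "category": "research paper",
  "includeDomains": ["arxiv.org"],
  "startPublishedDate": "$2"
}
EOF
}

body="$(build_search_body "evaluating retrieval-augmented generation" "2024-01-01T00:00:00.000Z")"
printf '%s\n' "$body"
```

To send it, pass the body to the same command used above: `npx agentcash@latest fetch https://stableenrich.dev/api/exa/search -m POST -b "$body"`.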

## Find Similar Pages

Find pages semantically similar to a reference URL:

```bash
npx agentcash@latest fetch https://stableenrich.dev/api/exa/find-similar -m POST -b '{
  "url": "https://example.com/article-i-like",
  "numResults": 10
}'
```

Great for:
- Finding competitor products
- Discovering related content
- Expanding research sources

## Extract Text Content

Get clean, structured text from URLs:

```bash
npx agentcash@latest fetch https://stableenrich.dev/api/exa/contents -m POST -b '{
  "urls": [
    "https://example.com/article1",
    "https://example.com/article2"
  ]
}'
```

**Options:**
- `urls` - Array of URLs to extract
- `text` - Include full text (default: true)
- `highlights` - Include key highlights

This is the cheapest option ($0.002) when you already have the URLs and only need their content.

## Direct Answers

Get factual answers to questions:

```bash
npx agentcash@latest fetch https://stableenrich.dev/api/exa/answer -m POST -b '{"query": "What is the population of Tokyo?"}'
```

Returns a direct answer with source citations. Best for:
- Factual questions
- Quick lookups
- Verification of claims

## Firecrawl Scrape

Scrape a single page to clean markdown:

```bash
npx agentcash@latest fetch https://stableenrich.dev/api/firecrawl/scrape -m POST -b '{"url": "https://example.com/page-to-scrape"}'
```

**Options:**
- `url` - Page to scrape (required)
- `formats` - Output formats: ["markdown", "html", "links"]
- `onlyMainContent` - Skip nav/footer/ads (default: true)
- `waitFor` - Milliseconds to wait for JavaScript to render

**Advantages over WebFetch:**
- Handles JavaScript-rendered content
- Bypasses common blocking
- Extracts main content only
- LLM-optimized markdown output

## Firecrawl Search

Web search with automatic scraping of results:

```bash
npx agentcash@latest fetch https://stableenrich.dev/api/firecrawl/search -m POST -b '{
  "query": "best practices for react server components",
  "limit": 5
}'
```

**Options:**
- `query` - Search query (required)
- `limit` - Number of results (default: 5)
- `scrapeOptions` - Options passed to scraper

Returns search results with full scraped content for each.

## Cloudflare Website Crawl

Crawl multiple pages from a website with browser rendering. The crawl is asynchronous and follows a two-step pattern: start the job, then poll for results.

**Step 1: Start the crawl (paid, $0.10)**

```bash
npx agentcash@latest fetch https://stableenrich.dev/api/cloudflare/crawl -m POST -b '{
  "url": "https://example.com",
  "limit": 10,
  "depth": 1,
  "formats": ["markdown"]
}'
```

Returns HTTP 202 with `{"token": "jwt..."}`.

**Step 2: Poll for results (SIWX, free)**

```bash
npx agentcash@latest fetch "https://stableenrich.dev/api/cloudflare/jobs?token=JWT_TOKEN"
```

Poll every 3-5 seconds until the job completes.
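The polling step can be wrapped in a bounded loop so it does not run forever. This is a sketch under a loud assumption: this document does not describe the job response, so the check for `"status": "completed"` is hypothetical and should be adjusted after inspecting a real response. The names `fetch_job`, `poll_crawl`, and `DONE_PATTERN` are illustrative.

```bash
# ASSUMPTION: a finished job response contains "status": "completed".
# The real field name and value are not documented here; inspect a live
# response and change DONE_PATTERN to match.
DONE_PATTERN='"status"[[:space:]]*:[[:space:]]*"completed"'

# Fetch the current job state (the free polling call from Step 2).
fetch_job() {
  npx agentcash@latest fetch "https://stableenrich.dev/api/cloudflare/jobs?token=$1"
}

# Usage: poll_crawl TOKEN [MAX_ATTEMPTS] [INTERVAL_SECONDS]
# Prints the final response and returns 0, or returns 1 if the job
# never matched DONE_PATTERN within MAX_ATTEMPTS polls.
poll_crawl() {
  token="$1"
  max_attempts="${2:-20}"
  interval="${3:-4}"
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    response="$(fetch_job "$token")"
    if printf '%s' "$response" | grep -Eq "$DONE_PATTERN"; then
      printf '%s\n' "$response"
      return 0
    fi
    sleep "$interval"
    attempt=$((attempt + 1))
  done
  echo "crawl not finished after $max_attempts attempts" >&2
  return 1
}
```

Call it with the token from Step 1, for example `poll_crawl "$TOKEN" 20 4`, which polls at most 20 times with a 4-second pause between attempts.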

**Parameters:**
- `url` (required) - starting URL
- `limit` (default 10, max 25) - max pages
- `depth` (default 1, max 3) - max link depth
- `formats` - `["markdown", "html", "json"]`
- `render` (default false) - execute JavaScript
- `options.includePatterns` / `excludePatterns` - URL wildcards

Good for: crawling docs sites, scraping multiple pages, building sitemaps.
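As an illustration of the parameters above, a crawl restricted to one section of a site might use a body like the one below. This is a sketch: it assumes `includePatterns` is a JSON array nested under `options` and that `*` acts as the wildcard, neither of which is spelled out above. The helper name `build_crawl_body` is hypothetical.

```bash
# Print a JSON body for a crawl limited to URLs matching one pattern.
# ASSUMPTIONS: includePatterns is an array nested under "options", and "*"
# is the wildcard character; check both against a real request.
# Arguments must not contain double quotes or backslashes (no JSON escaping here).
build_crawl_body() {
  cat <<EOF
{
  "url": "$1",
  "limit": 25,
  "depth": 2,
  "formats": ["markdown"],
  "render": true,
  "options": {
    "includePatterns": ["$2"]
  }
}
EOF
}

body="$(build_crawl_body "https://example.com/docs" "https://example.com/docs/*")"
printf '%s\n' "$body"
```

Send it with `npx agentcash@latest fetch https://stableenrich.dev/api/cloudflare/crawl -m POST -b "$body"`. The `limit` and `depth` values stay within the documented maximums of 25 and 3.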

## Workflows

### Deep Research

- [ ] (Optional) Check balance: `npx agentcash@latest balance`
- [ ] Search broadly with Exa
- [ ] Find related sources with find-similar
- [ ] Extract content from top sources
- [ ] Synthesize findings

```bash
npx agentcash@latest fetch https://stableenrich.dev/api/exa/search -m POST -b '{"query": "AI agents in healthcare 2024", "numResults": 15}'
```

```bash
npx agentcash@latest fetch https://stableenrich.dev/api/exa/find-similar -m POST -b '{"url": "https://best-article-found.com"}'
```

```bash
npx agentcash@latest fetch https://stableenrich.dev/api/exa/contents -m POST -b '{"urls": ["url1", "url2", "url3"]}'
```

### Blocked Site Scraping

- [ ] Try WebFetch first (free)
- [ ] If blocked/empty, use Firecrawl with `waitFor` for JS-heavy sites

```bash
npx agentcash@latest fetch https://stableenrich.dev/api/firecrawl/scrape -m POST -b '{"url": "https://blocked-site.com/article", "waitFor": 3000}'
```
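Outside an agent, the same free-first fallback can be approximated in a script. The sketch below swaps in `curl` as a stand-in for WebFetch, which is an agent tool rather than a shell command. The status codes and the 500-byte floor used to decide that a page "looks blocked" are illustrative thresholds, not values from this skill, and the function names are hypothetical.

```bash
# Decide whether a free fetch result looks blocked or empty.
# $1 = HTTP status code, $2 = body size in bytes.
# ASSUMPTION: 403/429/503 and the 500-byte floor are illustrative thresholds.
looks_blocked() {
  case "$1" in
    403|429|503) return 0 ;;
  esac
  [ "$2" -lt 500 ]
}

# Try a plain curl fetch first; pay for Firecrawl only if it looks blocked.
# The URL must not contain double quotes (no JSON escaping here).
scrape_with_fallback() {
  url="$1"
  tmp="$(mktemp)"
  status="$(curl -sL -o "$tmp" -w '%{http_code}' "$url")"
  size=$(( $(wc -c < "$tmp") ))
  if looks_blocked "$status" "$size"; then
    npx agentcash@latest fetch https://stableenrich.dev/api/firecrawl/scrape \
      -m POST -b "{\"url\": \"$url\", \"waitFor\": 3000}"
  else
    cat "$tmp"
  fi
  rm -f "$tmp"
}
```

For example, `scrape_with_fallback "https://blocked-site.com/article"` prints the page from `curl` when that succeeds and otherwise falls through to the paid scrape.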

## Cost Optimization

- **Use Exa contents** ($0.002) when you already have URLs
- **Use WebSearch/WebFetch first** (free) and fall back to x402 endpoints
- **Batch URL extraction** - pass multiple URLs to Exa contents
- **Limit results** - request only as many as needed
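To see why batching matters, the cost of the Deep Research workflow can be estimated from the Quick Reference prices. This sketch assumes each listed price is charged per request, which the table does not state explicitly. The helper name `estimate_cost` is hypothetical.

```bash
# Estimate workflow cost in dollars from the Quick Reference prices.
# $1 = Exa search calls, $2 = find-similar calls, $3 = contents calls.
# ASSUMPTION: each price is charged once per request.
estimate_cost() {
  awk -v s="$1" -v f="$2" -v c="$3" \
    'BEGIN { printf "%.3f\n", s * 0.01 + f * 0.01 + c * 0.002 }'
}

estimate_cost 1 1 1   # three URLs batched into one contents call: 0.022
estimate_cost 1 1 3   # the same URLs fetched in three separate calls: 0.026
```

Under that assumption, batching three URLs into one contents request saves $0.004 per run.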

Source

Creator's repository · merit-systems/agentcash-skills
