For Publishers

Stop training someone else's AI for free.

22+ AI crawlers identified out of the box. Block them, throttle them, or charge per crawl. Either way, you control what your content trains.

Content Protection

Block AI training crawlers from scraping your articles and content.

AI Crawler Blocking

GPTBot, ClaudeBot, CCBot, Bytespider: all detected and blockable.

robots.txt Generator

Generate opt-out rules for all major AI providers in one click.
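
As a rough illustration of what generated opt-out rules can look like, the sketch below builds one robots.txt group per crawler token. The token list and the `build_robots_txt` helper are assumptions made for this example, not the product's actual output.

```python
# Hypothetical list of AI crawler user-agent tokens; extend it to match your policy.
AI_CRAWLER_TOKENS = ["GPTBot", "ClaudeBot", "CCBot", "Bytespider"]


def build_robots_txt(tokens, disallow_path="/"):
    """Return robots.txt text that disallows `disallow_path` for each token."""
    groups = []
    for token in tokens:
        # One group per crawler: a User-agent line followed by its rule.
        groups.append(f"User-agent: {token}\nDisallow: {disallow_path}\n")
    # Groups are separated by a blank line, as is conventional in robots.txt.
    return "\n".join(groups)


if __name__ == "__main__":
    print(build_robots_txt(AI_CRAWLER_TOKENS))
```

robots.txt is advisory: it only affects crawlers that choose to read and obey it, which is why detection and blocking sit alongside it.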

Content Watermarking

Invisible markers trace scraped content back to the source.

Compliance Monitor

Track which crawlers respect your robots.txt and which don't.
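
One way to picture compliance monitoring: replay logged requests against your robots.txt and flag those that a compliant crawler would not have made. This is a minimal sketch using Python's standard `urllib.robotparser`; the `find_violations` helper and the sample log are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def find_violations(robots_txt, requests):
    """Return the (crawler_token, path) pairs that robots.txt disallows.

    `requests` is an iterable of (crawler_token, path) tuples, for example
    extracted from access logs. Tokens are product names such as "GPTBot",
    not full user-agent strings.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [(agent, path) for agent, path in requests
            if not parser.can_fetch(agent, path)]


if __name__ == "__main__":
    rules = "User-agent: Bytespider\nDisallow: /\n"
    log = [("Bytespider", "/article/the-future-of-ai"), ("GPTBot", "/about")]
    print(find_violations(rules, log))
```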

Traffic Transparency

See exactly how much of your traffic is AI bots vs real readers.

TDM Headers

EU Text & Data Mining Reservation Protocol compliance.
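
For orientation, the TDM Reservation Protocol (TDMRep) lets a site signal its reservation through HTTP response headers. The sketch below assumes the `tdm-reservation` and `tdm-policy` header names from that protocol; the `tdm_headers` helper and the policy URL are made up for illustration.

```python
def tdm_headers(reserved=True, policy_url=None):
    """Build TDMRep-style response headers.

    "1" signals that text-and-data-mining rights are reserved, "0" that
    they are not. A policy URL is only meaningful when rights are reserved.
    """
    headers = {"tdm-reservation": "1" if reserved else "0"}
    if reserved and policy_url:
        headers["tdm-policy"] = policy_url
    return headers


if __name__ == "__main__":
    # Hypothetical policy location, used only for this example.
    print(tdm_headers(policy_url="https://example.com/tdm-policy.json"))
```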

Cost Calculator

Estimate bandwidth costs from unwanted bot traffic.
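
The arithmetic behind a bandwidth cost estimate is simple enough to sketch. The function name, the sample request volume, the average response size, and the per-gigabyte price below are all assumptions chosen for illustration; substitute figures from your own logs and hosting bill.

```python
def estimate_bot_bandwidth_cost(bot_requests, avg_response_bytes, price_per_gb):
    """Estimate egress cost of bot traffic: requests x bytes, priced per GB."""
    gigabytes = bot_requests * avg_response_bytes / 1e9  # decimal gigabytes
    return gigabytes * price_per_gb


if __name__ == "__main__":
    # Hypothetical month: 2 million bot requests at 150 kB each, $0.08 per GB.
    cost = estimate_bot_bandwidth_cost(2_000_000, 150_000, 0.08)
    print(f"Estimated monthly cost of bot traffic: ${cost:.2f}")
```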

53% · Web traffic is now bots
22+ · AI crawlers identified
0 · Compensation by default
100% · Per-bot visibility

Workflow

Three minutes from script tag to control.

No DNS changes, no proxy install, no platform lock-in. One script. Works on WordPress, Ghost, Substack, custom CMS, anything that renders HTML.

Step 01

See who's reading you

Real-time dashboard breaks down every AI crawler hitting your site — GPTBot, ClaudeBot, PerplexityBot, Bytespider, and the long tail of sub-million-page-a-month scrapers.

Step 02

Choose your policy

Block specific crawlers. Allow only those that respect robots.txt. Throttle aggressive ones. Or charge per crawl with the pay-per-crawl module.

Step 03

Enforce + measure

Per-crawler verdict logs. Track blocks, allows, and revenue if monetizing. Audit-friendly trail for compliance and legal teams.

Threats

Who's reading you without permission.

The visible AI crawlers are just the start. Most scraping for AI training happens via headless Chromium fleets paid for by data brokers.

01

AI training scrapers

OpenAI's GPTBot, Anthropic's ClaudeBot, Google's GoogleOther, ByteDance's Bytespider — all crawl for model training corpora. Identified by user-agent, verified by rDNS, blocked at the edge.

02

Search-and-summarize

PerplexityBot, You.com, Phind, and Bing Copilot grab your content to generate answers, keeping users on their interface and away from your ads. Detected and blockable per crawler.

03

Aggregator scrapers

Headless Chromium fleets harvest your articles for downstream resale, summary feeds, and SEO clone sites. Caught via fingerprint mismatch and behavioral signals, even with rotation.

04

Anti-detect harvesters

Browserbase, ScrapingBee Premium, ZenRows, Hyperbrowser — paid services optimized to evade traditional detection. Caught via canvas double-render and timing-distribution ML.

Worked example

Blocking Bytespider before it touches an article.

ByteDance's crawler arrives. We identify it by user-agent. We verify with reverse DNS. We block per your site policy. We log it for your records.

GET /article/the-future-of-ai · 95.142.121.12 · Bytespider
  • Identity · user-agent matches 'Bytespider' · 0.95
  • Verification · rDNS confirms .bytespider.com origin · verified
  • Policy · site rule: block all known AI crawlers · match
  • Logging · request logged for compliance audit · logged
  • Verdict · Capped at 1.0, blocked · BLOCK

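
The identify-then-verify steps in this example can be sketched as a user-agent check followed by forward-confirmed reverse DNS: resolve the IP to a hostname, check the hostname's domain, then resolve the hostname back and confirm it returns the same IP. This is an illustrative sketch, not the product's implementation. The `.bytespider.com` suffix is taken from the log above, and the function names and injectable lookup parameters are assumptions.

```python
import socket


def hostname_matches(hostname, allowed_suffixes):
    """True if hostname equals, or is a subdomain of, an allowed suffix."""
    host = hostname.rstrip(".").lower()
    for suffix in allowed_suffixes:
        domain = suffix.lstrip(".").lower()
        if host == domain or host.endswith("." + domain):
            return True
    return False


def verify_crawler(ip, user_agent, ua_token, allowed_suffixes,
                   reverse_lookup=None, forward_lookup=None):
    """Check the claimed identity, then confirm it with rDNS plus forward DNS.

    The lookup callables are injectable so the logic can be exercised
    without network access; by default they use the socket module.
    """
    if ua_token.lower() not in user_agent.lower():
        return False
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        hostname = reverse_lookup(ip)
        if not hostname_matches(hostname, allowed_suffixes):
            return False  # PTR record points outside the operator's domain
        return ip in forward_lookup(hostname)  # forward confirmation
    except OSError:
        return False  # lookup failed: treat the claim as unverified
```

The forward lookup matters because whoever controls an IP range also controls its PTR records; a reverse lookup alone can be made to claim any hostname.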
What's in the box

Built for publishers, by people who get the stakes.

Crawler dashboard

Per-crawler view: requests, bytes, top paths, peak hours, and revenue (if monetizing). Filter by AI vendor, last 30 days.

Block + allow policies

One toggle per known crawler. Granular path scoping: block GPTBot from /premium/* but allow it on /free/*.
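
Path-scoped rules of this kind can be modeled as an ordered list where the first match wins. The sketch below uses GPTBot as the example crawler token and shell-style patterns from Python's `fnmatch`; the rule format, the `decide` helper, and the default action are assumptions for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical rule table: (crawler token, path pattern, action).
RULES = [
    ("GPTBot", "/premium/*", "block"),
    ("GPTBot", "/free/*", "allow"),
]


def decide(crawler, path, rules, default="allow"):
    """Return the action of the first rule matching this crawler and path."""
    for rule_crawler, pattern, action in rules:
        if rule_crawler == crawler and fnmatchcase(path, pattern):
            return action
    return default  # no rule matched


if __name__ == "__main__":
    print(decide("GPTBot", "/premium/annual-report", RULES))
    print(decide("GPTBot", "/free/weekly-digest", RULES))
```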

Content opt-out flag

Sends AI opt-out directives (noai, noimageai) in the X-Robots-Tag header on every response. Respectful bots back off automatically.
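
To make the opt-out flag concrete, here is a minimal sketch that adds `noai` and `noimageai` to a response's `X-Robots-Tag` header without dropping directives already present. These directives are a voluntary convention rather than a formal standard, and the `with_ai_opt_out` helper is a name invented for this example.

```python
def with_ai_opt_out(headers, directives=("noai", "noimageai")):
    """Return a copy of `headers` with AI opt-out directives in X-Robots-Tag."""
    merged = dict(headers)
    current = merged.get("X-Robots-Tag", "")
    existing = [d.strip() for d in current.split(",") if d.strip()]
    for directive in directives:
        if directive not in existing:
            existing.append(directive)  # keep prior directives such as noindex
    merged["X-Robots-Tag"] = ", ".join(existing)
    return merged


if __name__ == "__main__":
    print(with_ai_opt_out({"Content-Type": "text/html"}))
```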

Pay-per-crawl

Monetize crawler traffic. Set a per-request price in your currency. Crawlers either pay or get blocked. Revenue reports in the dashboard.
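
The pay-or-block decision can be sketched with the HTTP 402 Payment Required status code. How a crawler actually proves payment is outside this sketch; the function name, the `X-Crawl-Price` and `X-Crawl-Charged` headers, and the payment amount argument are all hypothetical.

```python
from http import HTTPStatus


def pay_per_crawl_response(is_known_crawler, paid_amount, price):
    """Return (status, headers) for a request under a per-crawl price."""
    if not is_known_crawler:
        return HTTPStatus.OK, {}  # ordinary readers are never charged
    if paid_amount >= price:
        return HTTPStatus.OK, {"X-Crawl-Charged": f"{price:.4f}"}
    # Crawler has not paid: refuse the content and advertise the price.
    return HTTPStatus.PAYMENT_REQUIRED, {"X-Crawl-Price": f"{price:.4f}"}


if __name__ == "__main__":
    status, headers = pay_per_crawl_response(True, 0.0, 0.002)
    print(int(status), headers)
```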

Content watermarks

Invisible zero-width markers embedded in your text. If your content surfaces in an AI dataset later, the marker proves where it came from.
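
A toy version of zero-width watermarking shows the idea: encode a numeric ID as a run of invisible characters and tuck it after the first word. The bit encoding, marker placement, and function names here are assumptions for illustration. A production scheme would need to survive text normalization, which can strip these characters.

```python
ZERO = "\u200b"  # zero-width space, encodes bit 0
ONE = "\u200c"   # zero-width non-joiner, encodes bit 1


def embed_watermark(text, marker_id, bits=32):
    """Insert marker_id as `bits` invisible characters after the first word."""
    payload = "".join(ONE if (marker_id >> i) & 1 else ZERO
                      for i in reversed(range(bits)))
    head, sep, tail = text.partition(" ")
    return head + payload + sep + tail


def extract_watermark(text):
    """Recover the embedded ID, or None if no marker characters are present."""
    bits = ["1" if ch == ONE else "0" for ch in text if ch in (ZERO, ONE)]
    return int("".join(bits), 2) if bits else None


if __name__ == "__main__":
    marked = embed_watermark("Breaking news from the newsroom", 48879)
    print(extract_watermark(marked))
```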

Forensic logs

Per-request audit trail — full headers, score, signals, geo. Export for legal action or compliance reporting.

53% of internet traffic is automated.
How much of yours?

Most site owners have no idea. Find out in under 2 minutes — free.