Every request, scored in milliseconds.
Three layers, evaluated in parallel — server, network, and client. The aggregate score ranges from 0.0 (human) to 1.0 (bot).
Server signals
User-agent, header order, TLS fingerprint, HTTP/2 priorities, ASN, IP classification. ~10ms.
Client beacons
detect.js sends 45+ signals in 3 beacons (200ms / 30s / unload). Canvas, WebGL, mouse entropy, font enumeration.
ML aggregation
Six ML engines combine the signals into a final score. Scores are cached, so repeat visitors verify in under 5ms.
How a python-requests scraper gets caught.
Every signal stacks. By the time we hit a threshold, we've evaluated 22 rules across 4 layers. Here's a real request, scored.
- Identity: user-agent: python-requests/2.31.0 (+0.90)
- Client signals: no asset requests in 600ms (+0.25)
- Browser fingerprint: header order matches requests lib (+0.70)
- IP intelligence: ASN 14618, Amazon AWS (+0.30)
- Client signals: no client-side beacon ever sent (+0.25)
- Final score: capped at 1.0, blocked (1.00)
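The arithmetic behind that request can be sketched in a few lines. This is a minimal illustration, assuming the per-signal weights simply add up before the cap is applied; the real aggregation runs through the ML engines and may weigh signals differently. The `Signal` class and `aggregate` function are hypothetical names, not part of the product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    """One scored observation (hypothetical structure for illustration)."""
    layer: str
    detail: str
    weight: float


def aggregate(signals, cap=1.0):
    """Sum the signal weights and cap the result (assumed additive model)."""
    raw = sum(s.weight for s in signals)
    return min(raw, cap)


# The five signals from the python-requests example above.
SIGNALS = [
    Signal("Identity", "user-agent: python-requests/2.31.0", 0.90),
    Signal("Client signals", "no asset requests in 600ms", 0.25),
    Signal("Browser fingerprint", "header order matches requests lib", 0.70),
    Signal("IP intelligence", "ASN 14618 (Amazon AWS)", 0.30),
    Signal("Client signals", "no client-side beacon ever sent", 0.25),
]

raw_total = sum(s.weight for s in SIGNALS)  # roughly 2.40 before the cap
score = aggregate(SIGNALS)                  # capped at 1.0
```

The five weights add up to about 2.40, so the cap is what produces the final 1.00. Under this additive assumption the user-agent and header-order signals alone (0.90 + 0.70) would already reach it.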
22 rules. Four detection layers.
Every signal we evaluate. Some give hard verdicts, others stack into the final score.
Matches GPTBot, ClaudeBot, Bytespider, PerplexityBot + 18 others by user-agent
python-requests, curl, wget, Scrapy, Puppeteer, Playwright, Selenium signatures
Missing user-agent, or one shorter than 10 characters: almost always a script
Reverse DNS confirms claimed crawler identity (Googlebot → google.com)
Headless browser automation flag exposed by every WebDriver-based tool
HTTP header order reveals python-requests, curl, Go-http-client, Node fetch
Fewer than 4 HTTP headers — real browsers send 8+
Claims Chrome but missing Client Hints — automation signature
HTTP-only scrapers load zero CSS/JS/images — broken DOM after 200ms
speechSynthesis.getVoices() returns 0 — headless Chrome telltale
Anti-detect browsers inject canvas noise — double-render detects mismatch
hardwareConcurrency = 0 — real devices report 2–32 cores
setTimeout(0) returns in 0ms, which is physically impossible in a real browser
100 DOM inserts < 0.5ms — headless renders ~10× faster
AWS, GCP, Azure, Hetzner, DigitalOcean — verified via 3 IP providers in parallel
ipinfo.io VPN/proxy classification + 15 known residential proxy networks
Tor exit list cross-checked on every request
Browser claims Vienna, IP geolocates to USA — proxy rotation tell
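Three of the header-based rules above are simple enough to sketch directly. The snippet below is an illustration only: the function name and rule labels are hypothetical, the thresholds are the ones quoted in the list, and the Client Hints check assumes an HTTPS request, where current Chrome normally sends a `Sec-CH-UA` header.

```python
def server_rule_hits(headers):
    """Return the labels of the header rules a request trips.

    `headers` maps header names to values. The labels are made up for
    this sketch; the thresholds come from the rule list above.
    """
    h = {name.lower(): value for name, value in headers.items()}
    ua = h.get("user-agent", "")
    hits = []

    # Missing user-agent, or one shorter than 10 characters.
    if len(ua) < 10:
        hits.append("short_or_missing_ua")

    # Fewer than 4 HTTP headers in total.
    if len(h) < 4:
        hits.append("too_few_headers")

    # Claims Chrome but sends no Client Hints header.
    if "chrome/" in ua.lower() and "sec-ch-ua" not in h:
        hits.append("chrome_without_client_hints")

    return hits


bare_script = {"User-Agent": "curl"}
fake_chrome = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Accept": "text/html",
    "Accept-Language": "en-US",
    "Accept-Encoding": "gzip",
}

bare_hits = server_rule_hits(bare_script)
chrome_hits = server_rule_hits(fake_chrome)
```

Here `bare_hits` contains the short user-agent and too-few-headers labels, and `chrome_hits` contains only the Client Hints label.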
Catches what rules can't.
Six engines run after the rule layer. They surface professional scrapers operating residential proxies, anti-detect browsers, and rotation infrastructure.
Behavioral mismatch
Residential IP + no mouse / scroll / clicks = professional scraper using rotation proxy (Bright Data, Oxylabs).
Canvas double-render
Anti-detect browsers inject canvas noise. We render twice — if hashes differ, noise injection is confirmed.
Cross-signal correlation
Chrome UA but wrong JS engine? Mobile UA but desktop hover? 10+ signal pairs checked for consistency.
Request distribution
Human request timing is log-normal (varied). Bot timing is uniform (Math.random). KS test detects the difference.
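As a rough illustration of the idea, the sketch below computes a Kolmogorov-Smirnov statistic for how well a set of inter-request intervals fits a log-normal distribution. It is an assumption-laden sketch, not the production engine: the log-normal is fitted from the sample itself (so standard KS p-values do not apply, and only the raw statistic is compared), and the synthetic "human" and "bot" samples are made up for the demo.

```python
import math
import random
import statistics


def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic_lognormal(intervals):
    """KS distance between log(intervals) and a normal fitted to them.

    A small value means the intervals look log-normal; a large value
    means they do not. All intervals must be positive.
    """
    logs = sorted(math.log(t) for t in intervals)
    n = len(logs)
    mu = statistics.fmean(logs)
    sigma = statistics.stdev(logs)
    d = 0.0
    for i, x in enumerate(logs):
        cdf = normal_cdf(x, mu, sigma)
        # Compare the model CDF with the empirical CDF on both sides of the step.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


rng = random.Random(42)
# Synthetic stand-ins: log-normal gaps for a human, uniform gaps for a bot.
human_gaps = [rng.lognormvariate(0.0, 1.0) for _ in range(500)]
bot_gaps = [rng.uniform(0.01, 1.0) for _ in range(500)]

d_human = ks_statistic_lognormal(human_gaps)
d_bot = ks_statistic_lognormal(bot_gaps)
```

On these samples `d_bot` comes out clearly larger than `d_human`, which is the separation the engine relies on. A real deployment would need enough requests per session for the statistic to be stable.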
Mouse entropy
Real hands produce high-entropy angles + tremor. Emulated mice show low entropy + zero micro-jitter.
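One way to put a number on "entropy of angles" is the Shannon entropy of a histogram of movement directions. The sketch below assumes that definition; the bin count, the function name, and the two test paths are hypothetical choices for illustration, and the tremor (micro-jitter) check is not covered.

```python
import math
from collections import Counter


def angle_entropy(points, bins=16):
    """Shannon entropy (in bits) of movement directions along a path.

    `points` is a list of (x, y) cursor positions. Each step's direction
    is binned into one of `bins` equal sectors of the circle.
    """
    angles = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # no movement, no direction
        angles.append(math.atan2(dy, dx))
    if not angles:
        return 0.0

    def sector(angle):
        index = int((angle + math.pi) / (2 * math.pi) * bins)
        return min(index, bins - 1)  # angle == pi lands in the last bin

    counts = Counter(sector(a) for a in angles)
    total = len(angles)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# A perfectly straight scripted move versus a path that sweeps every direction.
straight = [(i, i) for i in range(50)]
curved = [(math.cos(2 * math.pi * k / 64), math.sin(2 * math.pi * k / 64))
          for k in range(65)]

straight_entropy = angle_entropy(straight)
curved_entropy = angle_entropy(curved)
```

The straight path falls into a single sector and scores zero bits, while the curved path spreads across all sectors and approaches the 4-bit maximum for 16 bins.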
Session coherence
3+ pages without a single interaction (click, scroll, keystroke) = bot navigation, not human reading.
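The session rule reduces to a small predicate. This sketch assumes a session arrives as an ordered list of event names; the event vocabulary and the function name are hypothetical, and the threshold of 3 pages is the one stated above.

```python
INTERACTIONS = {"click", "scroll", "keystroke"}


def is_bot_navigation(events, min_pages=3):
    """True if the session has min_pages+ page views and no interaction."""
    pages = 0
    for event in events:
        if event in INTERACTIONS:
            return False  # a single interaction clears the rule
        if event == "pageview":
            pages += 1
    return pages >= min_pages


scraper_session = ["pageview", "pageview", "pageview", "pageview"]
reader_session = ["pageview", "scroll", "pageview", "click", "pageview"]

scraper_flagged = is_bot_navigation(scraper_session)
reader_flagged = is_bot_navigation(reader_session)
```

Only the scraper session is flagged; the reader session is cleared by its first scroll.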