Scraping Pipeline

Proxy Pool

Webshare residential proxy pool with per-domain scoring and auto-selection

Provider: Webshare.io | Mode: Backbone (residential) | Gateway: p.webshare.io

Configuration

  • Pool size: 500 proxies max
  • Countries: US, CA, GB only
  • Rotation: Per-request
  • Cost: ~$7-10/GB
  • Format: http://user:pass@p.webshare.io:80
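The format above can be assembled as in the minimal sketch below. The helper names build_proxy_url and as_requests_proxies are hypothetical, and percent-encoding the credentials is an added precaution, not something this page specifies.

```python
from urllib.parse import quote

GATEWAY_HOST = "p.webshare.io"  # gateway named on this page
GATEWAY_PORT = 80               # port from the documented format


def build_proxy_url(user: str, password: str) -> str:
    """Return a proxy URL in the documented http://user:pass@host:port form."""
    # Percent-encode credentials so characters such as '@' or ':' cannot
    # be mistaken for URL delimiters (an assumption, not from this page).
    safe_user = quote(user, safe="")
    safe_pass = quote(password, safe="")
    return f"http://{safe_user}:{safe_pass}@{GATEWAY_HOST}:{GATEWAY_PORT}"


def as_requests_proxies(proxy_url: str) -> dict[str, str]:
    """Route both schemes through the same proxy (requests-style mapping)."""
    return {"http": proxy_url, "https": proxy_url}
```

Because rotation is per-request at the gateway, the same URL can be reused for every call.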

Key Issues Fixed

Mode Change: Direct → Backbone

The account supports residential proxies only. The old setting mode=direct (datacenter) returned HTTP 400, so it was changed to mode=backbone (residential).
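One way to keep mode=direct from creeping back in is to build the proxy-list URL through a single guarded helper. This is a sketch only: the endpoint path is assumed to be the Webshare API v2 proxy-list route and should be checked against the current API docs, and proxy_list_url and the page and page_size defaults are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint (Webshare API v2 proxy list); verify before relying on it.
PROXY_LIST_URL = "https://proxy.webshare.io/api/v2/proxy/list/"

# This account is residential-only, so backbone is the only accepted mode.
SUPPORTED_MODES = {"backbone"}


def proxy_list_url(mode: str = "backbone", page: int = 1, page_size: int = 100) -> str:
    """Build the list URL, rejecting modes the account cannot use."""
    if mode not in SUPPORTED_MODES:
        # Fail locally instead of waiting for the provider's HTTP 400.
        raise ValueError(f"unsupported proxy mode: {mode!r}")
    query = urlencode({"mode": mode, "page": page, "page_size": page_size})
    return f"{PROXY_LIST_URL}?{query}"
```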

VPS DNS Resolution

From the VPS, the hosting provider's DNS returned NXDOMAIN for p.webshare.io. Fix: set Google DNS (8.8.8.8) as the primary resolver in /etc/systemd/resolved.conf, then restart systemd-resolved.
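A startup preflight check can catch this failure before any scrape runs. The sketch below is illustrative: can_resolve is a hypothetical helper, and the injectable resolver argument exists only so the check can be exercised without network access.

```python
import socket
from typing import Callable


def can_resolve(host: str, resolver: Callable[..., object] = socket.getaddrinfo) -> bool:
    """Return True if the host name resolves, False on a DNS failure."""
    try:
        resolver(host, 80)
        return True
    except socket.gaierror:
        # Raised by getaddrinfo for lookup failures such as NXDOMAIN.
        return False
```

A caller might run can_resolve("p.webshare.io") at boot and refuse to start the pipeline if it returns False.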

Proxy Scoring Performance

Scoring all 500 proxies issued 500 DB queries and hung for 120+ seconds. Fix: score a random sample of 20 proxies instead of all 500. Custom scoring is needed because Webshare doesn't provide per-domain performance data.
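The sampling fix might look roughly like the sketch below. pick_proxy and the score_of callable are hypothetical stand-ins for the real per-domain score lookup, and "higher score wins" is an assumption about the scoring scale.

```python
import random
from typing import Callable, Optional, Sequence

SAMPLE_SIZE = 20  # sample size from this page


def pick_proxy(
    proxies: Sequence[str],
    domain: str,
    score_of: Callable[[str, str], float],
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Score a random sample of the pool and return the best candidate."""
    if not proxies:
        return None
    rng = rng or random.Random()
    # Cap the work at SAMPLE_SIZE lookups regardless of pool size.
    candidates = rng.sample(list(proxies), min(SAMPLE_SIZE, len(proxies)))
    # max() calls the key once per candidate, so at most 20 lookups happen.
    return max(candidates, key=lambda proxy: score_of(proxy, domain))
```

The trade-off is that the best proxy in the full pool can be missed on any single request, in exchange for a bounded number of lookups.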

JobSpy Proxy Behavior

  • Backend ALWAYS injects 1 proxy per JobSpy request
  • If no proxy is available, the request is aborted (the VPS IP is never exposed)
  • auto_select_proxy=true is default and MUST NOT be disabled in tests
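The inject-or-abort rule above can be sketched as follows. with_injected_proxy, NoProxyAvailable, and select_proxy are hypothetical names, and the "proxies" key is an assumption about how the backend hands the proxy to JobSpy.

```python
from typing import Any, Callable, Optional


class NoProxyAvailable(RuntimeError):
    """Raised instead of letting a request leave from the VPS IP."""


def with_injected_proxy(
    scrape_kwargs: dict[str, Any],
    select_proxy: Callable[[], Optional[str]],
) -> dict[str, Any]:
    """Return a copy of the kwargs with exactly one proxy attached, or abort."""
    proxy = select_proxy()
    if not proxy:
        # Fail closed: no proxy means no request at all.
        raise NoProxyAvailable("no proxy available; aborting to avoid exposing the VPS IP")
    # Any caller-supplied proxy value is overwritten with the selected one.
    return {**scrape_kwargs, "proxies": [proxy]}
```

Returning a new dict instead of mutating the input keeps the caller's arguments unchanged if the request is retried with a different proxy.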