Proxy Pool
Webshare residential proxy pool with per-domain scoring and auto-selection
Provider: Webshare.io | Mode: Backbone (residential) | Gateway: p.webshare.io
Configuration
- Pool size: 500 proxies max
- Countries: US, CA, GB only
- Rotation: Per-request
- Cost: ~$7-10/GB
- Format: http://user:pass@p.webshare.io:80
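As a rough illustration of the format above, the proxy URL could be assembled as follows. This is a minimal sketch: the helper names `build_proxy_url` and `as_proxies_dict` are hypothetical, and the `{"http": ..., "https": ...}` mapping is the shape commonly accepted by HTTP clients such as `requests`, not something this page specifies.

```python
from urllib.parse import quote

GATEWAY_HOST = "p.webshare.io"  # gateway from the configuration above
GATEWAY_PORT = 80               # port from the format line above


def build_proxy_url(username: str, password: str) -> str:
    """Return a proxy URL in the http://user:pass@host:port format.

    Credentials are percent-encoded so characters such as '@' or ':'
    cannot break the URL structure.
    """
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"http://{user}:{pwd}@{GATEWAY_HOST}:{GATEWAY_PORT}"


def as_proxies_dict(proxy_url: str) -> dict:
    """Hypothetical helper: route both HTTP and HTTPS traffic via the proxy."""
    return {"http": proxy_url, "https": proxy_url}
```

Percent-encoding matters mostly for generated passwords, which often contain reserved characters.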
Key Issues Fixed
Mode Change: Direct → Backbone
The account only supports residential proxies, so the old mode=direct (datacenter) setting returned HTTP 400. Fix: switch to mode=backbone (residential).
VPS DNS Resolution
The VPS hosting provider's DNS returned NXDOMAIN for p.webshare.io. Fix: set Google DNS (8.8.8.8) as the primary resolver in /etc/systemd/resolved.conf, then restart systemd-resolved.
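A quick way to confirm the fix from the VPS is to check that the gateway hostname resolves. The sketch below is illustrative only: `can_resolve` is a hypothetical helper, and the injectable `resolver` argument exists purely so the check can be exercised without network access.

```python
import socket


def can_resolve(host: str, resolver=socket.getaddrinfo) -> bool:
    """Return True if `host` resolves, False on a DNS failure such as NXDOMAIN."""
    try:
        resolver(host, None)
    except socket.gaierror:
        # getaddrinfo raises gaierror when the name cannot be resolved.
        return False
    return True


if __name__ == "__main__":
    # Run on the VPS after restarting systemd-resolved.
    print("p.webshare.io resolves:", can_resolve("p.webshare.io"))
```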
Proxy Scoring Performance
Scoring all 500 proxies issued one DB query per proxy (500 queries) and hung for 120+ seconds. Fix: score a random sample of 20 proxies instead of all 500. Custom scoring is needed because Webshare doesn't provide per-domain performance data.
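The sampling fix can be sketched as below. This is not the backend's actual code: `pick_proxy` and the `score` callback are hypothetical names, and `score` stands in for whatever per-domain lookup the backend performs (one DB query per call, per the description above).

```python
import random
from typing import Callable, Optional, Sequence

SAMPLE_SIZE = 20  # score 20 random proxies instead of the full pool


def pick_proxy(
    proxies: Sequence[str],
    score: Callable[[str], float],
    sample_size: int = SAMPLE_SIZE,
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Score a random sample of the pool and return the best-scoring proxy.

    `score` is called once per sampled proxy, so the cost is bounded by
    `sample_size` lookups regardless of how large the pool is.
    """
    if not proxies:
        return None
    rng = rng or random.Random()
    k = min(sample_size, len(proxies))
    candidates = rng.sample(list(proxies), k)
    return max(candidates, key=score)
```

The trade-off is that the best proxy in the whole pool may not be in the sample, so the result is the best of 20 rather than the best of 500.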
JobSpy Proxy Behavior
- Backend ALWAYS injects 1 proxy per JobSpy request
- If no proxy available, request aborted (never exposes VPS IP)
- auto_select_proxy=true is the default and MUST NOT be disabled in tests
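The rules above amount to a fail-closed wrapper around each scrape call. The following is a minimal sketch under stated assumptions: `scrape_with_proxy`, `NoProxyAvailable`, and the `select_proxy` callback are hypothetical names, and passing the proxy as a `proxies=[...]` keyword is an assumption about how the scraper receives it.

```python
from typing import Any, Callable, Optional


class NoProxyAvailable(RuntimeError):
    """Raised instead of falling back to a direct connection from the VPS."""


def scrape_with_proxy(
    scrape: Callable[..., Any],
    select_proxy: Callable[[], Optional[str]],
    **params: Any,
) -> Any:
    """Inject exactly one proxy into the scrape call, or abort.

    There is deliberately no code path that calls `scrape` without a
    proxy, so the VPS IP is never exposed.
    """
    proxy = select_proxy()
    if not proxy:
        raise NoProxyAvailable("no proxy available; request aborted")
    return scrape(proxies=[proxy], **params)
```

Raising an exception, rather than returning an empty result, makes a missing proxy visible to the caller instead of looking like a search with no matches.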