You're trying to scrape a Cloudflare-protected site with Python. Your requests get a 403. cloudscraper doesn't work. Neither does rotating user agents. Here's the definitive guide to what works in 2026.
pip install playwright
playwright install chromium
import asyncio
import os

from playwright.async_api import async_playwright

PROXY_HOST = os.getenv('PROXY_HOST', 'brd.superproxy.io')
PROXY_PORT = os.getenv('PROXY_PORT', '22225')
PROXY_USER = os.getenv('PROXY_USER')
PROXY_PASS = os.getenv('PROXY_PASS')


async def scrape_cloudflare_site(url: str) -> str:
    async with async_playwright() as p:
        browser = await p.chromium.launch(
            headless=True,
            proxy={
                'server': f'http://{PROXY_HOST}:{PROXY_PORT}',
                'username': PROXY_USER,
                'password': PROXY_PASS,
            },
        )
        ctx = await browser.new_context(
            # iPhone 15 Pro fingerprint: low bot score
            user_agent='Mozilla/5.0 (iPhone; CPU iPhone OS 17_4_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.4.1 Mobile/15E148 Safari/604.1',
            viewport={'width': 393, 'height': 852},
            is_mobile=True,
            has_touch=True,
            locale='en-US',
            timezone_id='America/New_York',
            ignore_https_errors=True,  # Required for proxy SSL
        )
        page = await ctx.new_page()
        # Patch webdriver detection
        await page.add_init_script("Object.defineProperty(navigator, 'webdriver', {get: () => undefined})")
        await page.goto(url, wait_until='networkidle', timeout=30000)
        content = await page.content()
        await browser.close()
        return content


# Usage
html = asyncio.run(scrape_cloudflare_site('https://target-site.com'))
print(html[:500])
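The example above reads `PROXY_USER` and `PROXY_PASS` straight from the environment, so an unset variable reaches Playwright as `None` and the failure only shows up later as a proxy error. A minimal sketch of a guard follows. `build_proxy_config` is a hypothetical helper name, not part of Playwright, and the variable names and defaults are simply copied from the example above.

```python
import os
from typing import Mapping, Optional


def build_proxy_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build the dict passed as `proxy=` to chromium.launch().

    Hypothetical helper for illustration only; it mirrors the env
    variable names and defaults used in the example above.
    """
    env = os.environ if env is None else env
    # Fail early if a credential is unset or empty
    missing = [name for name in ('PROXY_USER', 'PROXY_PASS') if not env.get(name)]
    if missing:
        raise ValueError('Missing proxy credentials: ' + ', '.join(missing))
    host = env.get('PROXY_HOST', 'brd.superproxy.io')
    port = env.get('PROXY_PORT', '22225')
    return {
        'server': f'http://{host}:{port}',
        'username': env['PROXY_USER'],
        'password': env['PROXY_PASS'],
    }
```

The returned dict can be passed as `proxy=build_proxy_config()` in the `launch()` call, which turns a missing credential into a readable error before any browser starts.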
cloudscraper was built to solve a specific problem: Cloudflare's old JS challenge (IUAM, "I'm Under Attack Mode"). That challenge served a page of JavaScript that the browser had to evaluate, submitting the computed answer before Cloudflare let the request through.
cloudscraper replicated this JS computation in Python. But Cloudflare deprecated this challenge system in 2023 and replaced it with newer protections such as managed challenges, Turnstile, and Bot Management.
cloudscraper can't solve any of these. It will work on very old Cloudflare configurations but fails on any site that has updated its plan in the last two years.
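Before switching tools, it can help to confirm that a 403 really is a Cloudflare challenge rather than an ordinary block. The sketch below is a heuristic, not an official detection method. The `cf-mitigated: challenge` header is one Cloudflare sets on challenge responses. The body markers are assumptions based on commonly seen challenge pages and may change at any time. The function name is hypothetical.

```python
from typing import Mapping


def looks_like_cloudflare_challenge(status: int, headers: Mapping[str, str], body: str = '') -> bool:
    """Heuristic guess at whether a response is a Cloudflare challenge page."""
    # Header names are case-insensitive, so normalise them first
    lowered = {k.lower(): v.lower() for k, v in headers.items()}
    # Cloudflare marks challenge responses with this header
    if lowered.get('cf-mitigated') == 'challenge':
        return True
    served_by_cloudflare = 'cloudflare' in lowered.get('server', '')
    if served_by_cloudflare and status in (403, 503):
        # Assumed markers; treat a match as a hint, not proof
        markers = ('just a moment', 'challenge-platform', 'cf-chl')
        text = body.lower()
        return any(marker in text for marker in markers)
    return False
```

With `requests` or `httpx`, this can be called as `looks_like_cloudflare_challenge(resp.status_code, resp.headers, resp.text)`.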
If you're already in a Python codebase, you can use human-browser (Node.js) as a subprocess and communicate via HTTP or stdin/stdout:
import asyncio
import json
import os
import subprocess


async def fetch_via_human_browser(url: str) -> dict:
    """Use human-browser Node.js script for Cloudflare bypass."""
    # json.dumps(url) produces a safely quoted JS string literal,
    # so quotes in the URL cannot break out of the script
    script = f"""
const {{ launchHuman }} = require('human-browser');
(async () => {{
  const {{ page }} = await launchHuman();
  await page.goto({json.dumps(url)}, {{ waitUntil: 'networkidle' }});
  const title = await page.title();
  const content = await page.content();
  console.log(JSON.stringify({{ title, contentLength: content.length }}));
  process.exit(0);
}})();
"""
    # subprocess.run blocks, so run it in a worker thread (Python 3.9+)
    result = await asyncio.to_thread(
        subprocess.run,
        ['node', '-e', script],
        capture_output=True, text=True, timeout=60,
        # A plain copy of the environment; it already carries PROXY_USER when set
        env=dict(os.environ),
    )
    if result.returncode != 0:
        raise RuntimeError(f'node exited with {result.returncode}: {result.stderr}')
    return json.loads(result.stdout)


data = asyncio.run(fetch_via_human_browser('https://cloudflare-protected-site.com'))
print(data)
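Parsing the subprocess output with `json.loads(result.stdout)` assumes the Node script prints nothing except the one JSON line. Whether human-browser writes its own log lines to stdout is not stated here, so the following is a defensive sketch under that assumption. `parse_last_json_line` is a hypothetical helper name.

```python
import json


def parse_last_json_line(stdout: str) -> dict:
    """Return the last stdout line that parses as a JSON object."""
    # Walk backwards: the script prints its result last
    for line in reversed(stdout.splitlines()):
        line = line.strip()
        if not line.startswith('{'):
            continue
        try:
            parsed = json.loads(line)
        except json.JSONDecodeError:
            # Looked like JSON but was not; keep searching earlier lines
            continue
        if isinstance(parsed, dict):
            return parsed
    raise ValueError('No JSON object found in subprocess output')
```

Swapping it in for the final `json.loads(result.stdout)` call keeps the function working even if extra lines appear before the result.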
For sites without advanced bot protection (or with basic Cloudflare rules that only check IP), httpx with a residential proxy can work:
import httpx

# PROXY_HOST, PROXY_PORT, PROXY_USER and PROXY_PASS are the values defined in the Playwright example
proxy_url = f"http://{PROXY_USER}:{PROXY_PASS}@{PROXY_HOST}:{PROXY_PORT}"

with httpx.Client(
    # httpx 0.26+ takes a single `proxy` URL; the older `proxies` mapping was removed in 0.28
    proxy=proxy_url,
    verify=False,  # Required for proxy SSL interception
    headers={
        'User-Agent': 'Mozilla/5.0 (iPhone; CPU iPhone OS 17_4_1 like Mac OS X) AppleWebKit/605.1.15',
        'Accept': 'text/html,application/xhtml+xml',
        'Accept-Language': 'en-US,en;q=0.9',
        # 'br' responses can only be decoded if the brotli or brotlicffi package is installed
        'Accept-Encoding': 'gzip, deflate, br',
    },
) as client:
    resp = client.get('https://target.com')
    print(resp.status_code, len(resp.text))
This works on ~30% of Cloudflare sites — specifically those on the free plan or those that have only enabled rate limiting (not Bot Management). It will fail on any site using Cloudflare Enterprise or Bot Management.
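Since the plain HTTP route only works on some sites, one way to combine the approaches is to try it first and move to a real browser only when the response looks blocked. The sketch below shows that idea with the two fetchers passed in as callables. `fetch_with_fallback`, the `Fetcher` type, and the list of blocked status codes are all assumptions made for illustration.

```python
from typing import Callable, Tuple

# A fetcher takes a URL and returns (status_code, html)
Fetcher = Callable[[str], Tuple[int, str]]


def fetch_with_fallback(
    url: str,
    cheap: Fetcher,
    browser: Fetcher,
    blocked_statuses: Tuple[int, ...] = (403, 429, 503),
) -> Tuple[str, str]:
    """Try the cheap fetcher first; use the browser fetcher only if blocked.

    Returns (method_used, html), where method_used is 'http' or 'browser'.
    """
    try:
        status, html = cheap(url)
    except Exception:
        # Network or proxy failure: treat it like a block and escalate
        status, html = 0, ''
    if status and status not in blocked_statuses:
        return 'http', html
    _, html = browser(url)
    return 'browser', html
```

Here `cheap` could wrap the httpx client above and `browser` could wrap `scrape_cloudflare_site`, each adapted to return a status code together with the HTML.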