Headless Browsers on Raspberry Pi 5: Puppeteer and Playwright with the AI HAT+ 2
Run Puppeteer and Playwright on Raspberry Pi 5 with the AI HAT+ 2 to offload OCR and extraction for faster, private headless scraping.
Cut scraping latency on Raspberry Pi 5 by offloading model work to the AI HAT+ 2
Headless browsers on small ARM machines are great for low-cost, distributed scraping — but modern sites use images, obfuscated HTML and CAPTCHAs that require model work. This tutorial shows how to run Puppeteer and Playwright on a Raspberry Pi 5 and offload model-based tasks (OCR, content extraction, image CAPTCHA checks, lightweight classification) to the AI HAT+ 2 to speed up and stabilise headless automation.
Why this matters in 2026
In late 2025 and into 2026 the trend is clear: developers push more ML workloads to edge accelerators to avoid cloud latency, reduce costs and keep data local. TinyLLMs and optimized vision transformers now run reliably on USB/PCIe-attached accelerators. For scraping teams, that means you can use local inference for tasks that previously required cloud calls (OCR, image classification, entity extraction), dramatically reducing round-trip time and helping you stay under rate limits while keeping scraped content private.
Key benefits of combining Pi 5 + AI HAT+ 2 + headless browsers
- Lower latency: model inference on the HAT instead of remote APIs.
- Privacy: sensitive content never leaves your network.
- Cost: fewer cloud inference calls, lower running bills for repeated tasks.
- Resilience: reduced dependency on remote LLM API quotas and network variability.
- Horizontal scale: cheap Pi 5 nodes with HATs handle many lightweight scraping jobs in parallel.
What you’ll build
By the end of this guide you'll have two working examples on a Raspberry Pi 5:
- A Node.js + Puppeteer worker that visits pages, screenshots CAPTCHAs or content regions, and sends images to the AI HAT+ 2 for OCR and classification.
- A Python + Playwright worker that extracts HTML, sends extracted blocks to the HAT for structured extraction (e.g., product title, price, description), and writes results to a local queue.
Prerequisites — hardware & software checklist
- Raspberry Pi 5 (64-bit OS recommended) with 8GB+ RAM for comfortable headless browser runs.
- AI HAT+ 2 attached and set up following the vendor instructions (USB/M.2/GPIO depending on your HAT).
- Raspberry Pi OS 64-bit (Bookworm or newer — the Pi 5 is not supported by Bullseye) with firmware updated.
- Network access for package installs (can be air-gapped later).
- Node.js (20+ recommended) and Python 3.10+ installed.
- Playwright and Puppeteer installed with necessary system deps (see steps below).
Step 1 — Prepare the Pi 5 and AI HAT+ 2
1. Update OS and firmware
sudo apt update && sudo apt upgrade -y
sudo rpi-update # only if you need the latest firmware; use with caution
2. Install common dependencies for Chromium/Playwright
Install the libraries headless Chromium needs on ARM:
sudo apt install -y libnss3 libatk1.0-0 libatk-bridge2.0-0 libx11-xcb1 libxcomposite1 libxcursor1 libxdamage1 libxrandr2 libgbm1 libasound2 libpangocairo-1.0-0 libgtk-3-0
If you use Playwright, playwright install-deps chromium can install its system dependencies for you.
3. Install Node.js and Python
# Node (example using NodeSource for ARM64)
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt install -y nodejs
# Python
sudo apt install -y python3 python3-venv python3-pip
4. Install Playwright / Puppeteer
Playwright and Puppeteer normally download their own browser binaries. On the Pi 5 the simplest options are to let Playwright install its ARM64 Chromium build, or to drive the distro's Chromium package with puppeteer-core (Puppeteer's bundled Chromium downloads have historically not covered linux/arm64).
# Node project
mkdir puppeteer-worker && cd puppeteer-worker
npm init -y
npm i puppeteer-core # puppeteer-core drives an existing browser instead of downloading one
sudo apt install -y chromium # system Chromium that puppeteer-core will launch
# Or install Playwright for Python
python3 -m venv venv && source venv/bin/activate
pip install playwright
playwright install chromium
Step 2 — Connect to the AI HAT+ 2 SDK
Vendors usually provide a local SDK or a small HTTP/gRPC service to talk to the HAT. The exact package name will vary — below we use aihatsdk as a placeholder. Replace with the vendor SDK per the HAT+ 2 docs.
Install example SDK
# Node
npm i aihatsdk
# Python
pip install aihatsdk
Important: follow vendor docs for driver/firmware installation. If the HAT exposes a local HTTP API, you'll talk to http://localhost:PORT; if it exposes a Python gRPC client, use that client directly.
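If your HAT exposes the HTTP route, the handshake is a single POST from the worker. Here is a minimal sketch in Python, assuming a hypothetical local endpoint http://localhost:8900/infer that accepts a base64-encoded image and returns JSON — check your vendor docs for the real port and payload schema.
# hat_http.py — sketch only; the endpoint, port and payload schema are hypothetical
import base64
import requests

HAT_URL = 'http://localhost:8900/infer'  # replace with the port your HAT service exposes

def ocr_image(png_bytes: bytes) -> dict:
    # POST the image to the local HAT service and return its JSON response
    payload = {
        'model': 'ocr-lite',                                   # model name per vendor docs
        'image': base64.b64encode(png_bytes).decode('ascii'),  # base64 keeps the payload JSON-safe
    }
    resp = requests.post(HAT_URL, json=payload, timeout=10)
    resp.raise_for_status()
    return resp.json()

if __name__ == '__main__':
    with open('captcha.png', 'rb') as f:
        print(ocr_image(f.read()))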
Architecture & flow
Use this pattern:
- Headless browser fetches page with Puppeteer/Playwright.
- Browser captures DOM/screenshot snippets needing ML (captchas, images, messy text blocks).
- Worker sends snippet to AI HAT+ 2 for inference (OCR, classification, structured extraction) and receives JSON.
- Worker consolidates structured data and stores it to your pipeline (Redis/DB/stream).
This keeps heavyweight browser work separate from ML inference and allows batching/queuing to the HAT for efficiency.
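A minimal sketch of that batching step in Python, using the placeholder aihatsdk client from the previous step — infer_batch is an assumed batch method, so substitute whatever your SDK provides. Browser workers call infer() and await their individual results while a single consumer drains the queue in batches.
# batcher.py — sketch; aihatsdk is a placeholder and infer_batch() an assumed method
import asyncio
import aihatsdk

BATCH_SIZE = 8        # items per HAT call
FLUSH_AFTER_S = 0.25  # max wait before sending a partial batch

async def hat_worker(queue: asyncio.Queue):
    hat = aihatsdk.Client(model='ocr-lite')
    while True:
        batch = [await queue.get()]  # block for the first item
        try:
            while len(batch) < BATCH_SIZE:
                batch.append(await asyncio.wait_for(queue.get(), timeout=FLUSH_AFTER_S))
        except asyncio.TimeoutError:
            pass  # flush window expired — send what we have
        # One inference call for the whole batch amortises per-request overhead
        results = hat.infer_batch([{'image': img} for img, _ in batch])
        for (_, fut), res in zip(batch, results):
            fut.set_result(res)

async def infer(queue: asyncio.Queue, image: bytes):
    # Called by browser workers: enqueue a snippet and await its result
    fut = asyncio.get_running_loop().create_future()
    await queue.put((image, fut))
    return await fut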
This pattern also lets you instrument and monitor inference and scraping performance with modern observability practices — see observability for microservices when you scale beyond a single node.
For news and publisher use-cases, local inference is already being adopted by teams reworking delivery and membership flows — see how newsrooms built for 2026 combine edge delivery with lower-latency inference.
Node.js + Puppeteer example: screenshot CAPTCHA and run OCR on HAT
This example is intentionally minimal — it demonstrates the handshake between Puppeteer and the HAT. Adapt to your HAT SDK/endpoint.
// puppeteer-hat.js
const puppeteer = require('puppeteer-core');
const fs = require('fs');
const aihat = require('aihatsdk'); // placeholder SDK — swap in your vendor's client

(async () => {
  // Launch the system Chromium; adjust the path to your Chromium binary on the Pi
  const browser = await puppeteer.launch({
    executablePath: '/usr/bin/chromium',
    args: ['--no-sandbox', '--disable-setuid-sandbox', '--use-gl=egl']
  });
  const page = await browser.newPage();
  await page.setViewport({ width: 1280, height: 800 });
  await page.goto('https://example.com/with-captcha', { waitUntil: 'networkidle2' });

  // Locate the captcha image; the selector will vary by site
  const captchaElem = await page.$('img.captcha');
  if (!captchaElem) {
    console.log('No captcha found');
    await browser.close();
    return;
  }

  // Screenshot just the captcha element
  const imgBuffer = await captchaElem.screenshot({ encoding: 'binary' });
  fs.writeFileSync('captcha.png', imgBuffer);

  // Send to the AI HAT+ 2 for OCR — example SDK call, replace with the real method
  const hatClient = new aihat.Client({ model: 'ocr-lite' });
  const result = await hatClient.infer({ image: imgBuffer });
  console.log('HAT result:', result);

  // Suppose result.text contains the captcha answer
  await page.type('#captcha-input', result.text);
  // Register the navigation wait before clicking to avoid a race
  await Promise.all([
    page.waitForNavigation({ waitUntil: 'networkidle2' }),
    page.click('#submit')
  ]);

  // Continue scraping the protected content
  const content = await page.content();
  console.log('Page length:', content.length);
  await browser.close();
})();
Notes and best practices
- Batch multiple CAPTCHAs or screenshots into a single call where possible to amortise HAT startup cost.
- Keep inference models lightweight on the HAT to preserve throughput — use smaller vision transformers or optimized OCR models.
- Use a short local cache for repeated images (reused CDN captcha images, etc.) — a sketch follows this list.
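Here is a minimal sketch of such a cache in Python, keyed on a hash of the raw image bytes; the TTL and eviction policy are placeholders to tune for your workload.
# cache.py — tiny content-addressed cache for repeated inference inputs
import hashlib
import time

class InferenceCache:
    def __init__(self, ttl_s: float = 300.0):
        self._store: dict[str, tuple[float, dict]] = {}
        self._ttl = ttl_s

    def _key(self, payload: bytes) -> str:
        return hashlib.sha256(payload).hexdigest()

    def get(self, payload: bytes):
        entry = self._store.get(self._key(payload))
        if entry and time.monotonic() - entry[0] < self._ttl:
            return entry[1]  # cache hit — skip the HAT call entirely
        return None

    def put(self, payload: bytes, result: dict):
        self._store[self._key(payload)] = (time.monotonic(), result)
Check the cache before queueing an image for inference; a captcha image reused across a CDN then costs one HAT call rather than one per page.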
Python + Playwright example: structured extraction via HAT NLP
This example extracts product-like blocks and sends HTML fragments to the HAT for structured parsing (title, price, sku).
# playwright-hat.py
import asyncio
import aihatsdk  # placeholder — swap in your vendor's SDK
from playwright.async_api import async_playwright

async def main():
    hat = aihatsdk.Client(model='extractor-v1')
    async with async_playwright() as p:
        browser = await p.chromium.launch(
            args=['--no-sandbox', '--disable-setuid-sandbox', '--use-gl=egl'])
        page = await browser.new_page(viewport={'width': 1280, 'height': 900})
        await page.goto('https://ecommerce.example/page', wait_until='networkidle')
        # Grab candidate product blocks
        blocks = await page.query_selector_all('.product-card')
        results = []
        for b in blocks:
            html = await b.inner_html()
            # Send the HTML fragment to the HAT for structured extraction
            resp = hat.infer({'html': html})  # synchronous-style call for brevity
            results.append(resp['fields'])
        print('Extracted items:', results)
        await browser.close()

asyncio.run(main())
Why send HTML fragments instead of raw text?
HTML preserves context (classes, tag structure, alt text) which local models can exploit to increase extraction accuracy while remaining fast on-device.
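For reference, the structured response consumed above (resp['fields']) might look like the following — the schema is entirely up to the model you deploy, so treat this as illustrative:
{
  "fields": {
    "title": "Example Widget 3000",
    "price": "£24.99",
    "sku": "EW-3000"
  },
  "confidence": 0.93
}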
Performance tuning on Pi 5
- Use Chromium flags: --use-gl=egl, --disable-dev-shm-usage and, with care, --single-process (it reduces memory use but increases fragility).
- Pin browser workers to CPU cores with taskset to reduce jitter.
- Batch inference requests to the HAT (send N images/html fragments per inference call) to amortise per-request overhead.
- Use lightweight models for routine tasks; reserve stronger models for edge cases signalled by confidence thresholds (see the routing sketch below).
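A sketch of that routing in Python — the model names and the confidence field are assumptions about your SDK's response shape:
# routing.py — confidence-threshold routing between a fast and a strong model
import aihatsdk  # placeholder SDK

fast = aihatsdk.Client(model='ocr-lite')     # cheap default path
strong = aihatsdk.Client(model='ocr-large')  # slower fallback
CONFIDENCE_FLOOR = 0.85

def ocr(image: bytes) -> dict:
    result = fast.infer({'image': image})
    if result.get('confidence', 0.0) >= CONFIDENCE_FLOOR:
        return result
    # Low confidence: retry on the stronger (slower) model
    return strong.infer({'image': image})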
Measuring gains — simple benchmark
Measure round-trip time for inference on cloud vs HAT for a typical OCR job:
# Pseudocode benchmark steps
1. For 50 sample captcha images:
- Time: upload & OCR via cloud API (avg_cloud_ms)
- Time: send to AI HAT+ 2 local endpoint (avg_hat_ms)
2. Compare avg_cloud_ms vs avg_hat_ms and compute savings
# Expectation: the local HAT removes network RTT and API queueing; improvements of 5-50x are common depending on how far you are from your cloud region.
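Translated into a runnable sketch (CLOUD_URL is a hypothetical cloud OCR endpoint and aihatsdk the placeholder SDK — swap in your real clients):
# bench.py — cloud vs HAT round-trip comparison over 50 sample images
import statistics
import time
from pathlib import Path
import requests
import aihatsdk  # placeholder SDK

CLOUD_URL = 'https://ocr.example-cloud.com/v1/recognize'  # hypothetical endpoint
hat = aihatsdk.Client(model='ocr-lite')

def bench(samples):
    cloud_ms, hat_ms = [], []
    for img in samples:
        t0 = time.perf_counter()
        requests.post(CLOUD_URL, files={'image': img}, timeout=30)
        cloud_ms.append((time.perf_counter() - t0) * 1000)
        t0 = time.perf_counter()
        hat.infer({'image': img})
        hat_ms.append((time.perf_counter() - t0) * 1000)
    print(f'cloud median: {statistics.median(cloud_ms):.1f} ms')
    print(f'hat median:   {statistics.median(hat_ms):.1f} ms')

bench([p.read_bytes() for p in Path('samples').glob('*.png')][:50])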
For cloud cost and latency tradeoffs, see analysis of cloud cost optimization approaches that teams used when moving inference on-prem.
Operational considerations & anti-detection
Headless scraping in 2026 requires more than just automation. Sites have advanced anti-bot solutions that look for browser fingerprints, behaviour and network anomalies. Use the following:
- Stealth/anti-fingerprint: keep the signals sites fingerprint (fonts, WebGL, canvas) consistent and plausible, and use realistic interaction patterns.
- Proxy/IP rotation: pair the Pi workers with residential/mobile proxies; centralise IP management for rate limiting.
- Rate-limit & humanise: random delays, mouse movements, and jitter to simulate real users.
- Legal & ethical: avoid bypassing paywalls, respect robots.txt, and ensure captcha solving aligns with legal constraints and terms of service. Consult legal teams for high-risk scraping.
Note: solving CAPTCHAs to bypass access controls may violate site terms and, in some jurisdictions, laws. Use offloading responsibly.
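For the humanising bullet above, a small Playwright (Python) sketch — the timing ranges are arbitrary starting points to tune against the traffic profile you are imitating:
# humanise.py — jittered pacing and pointer movement for Playwright
import asyncio
import random

async def human_pause(lo=0.4, hi=1.8):
    # Sleep for a jittered, human-ish interval between actions
    await asyncio.sleep(random.uniform(lo, hi))

async def human_click(page, selector: str):
    elem = await page.wait_for_selector(selector)
    box = await elem.bounding_box()
    # Aim somewhere inside the element, not dead centre
    x = box['x'] + random.uniform(0.3, 0.7) * box['width']
    y = box['y'] + random.uniform(0.3, 0.7) * box['height']
    # Drift towards the target in several steps rather than teleporting
    await page.mouse.move(x, y, steps=random.randint(8, 20))
    await human_pause(0.1, 0.4)
    await page.mouse.click(x, y)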
2026 trends you should align with
- Edge-first LLMs: more model variants are optimised for USB accelerators; choose models designed for the HAT's compute. See trends in edge-first hardware.
- Multimodal tiny models: cheap OCR + semantic extractors on-device reduce cloud dependence; related research appears in multimodal repurposing.
- Privacy & regulation: UK & EU rules around personal data compel keeping scraping and PII processing local where possible — coordinate with legal workflows.
- Toolchain consolidation: vendors provide unified SDKs to manage model updates on-device — adopt model CI and templates for pushing models to HATs.
Scalable architecture patterns
For production, avoid one-off scripts. Consider:
- Worker pool: multiple Pi 5 nodes with an orchestrator (Kubernetes or a lightweight supervisor) consuming a scraping queue — see the queue-worker sketch after this list.
- Central HAT broker: if HATs are expensive, centralise inference on fewer HAT-enabled nodes and remotely forward images/html via a secure queue — an edge-assisted broker model can work well.
- Batch & cache: cache model outputs for identical content and batch to the HAT.
- Monitoring: track inference latency, model confidence, and browser failures, and alert on anomalies (see observability techniques).
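A minimal sketch of the worker-pool pattern with a Redis list as the queue — the queue names and job schema are illustrative:
# queue_worker.py — a Pi node consuming scrape jobs from a shared Redis queue
import json
import redis

r = redis.Redis(host='queue-host', port=6379)  # your central queue node

def process(job: dict) -> dict:
    # Placeholder: drive the browser + HAT pipeline from the examples above
    return {'url': job['url'], 'status': 'done'}

def run():
    while True:
        _, raw = r.blpop('scrape:jobs')  # block until a job arrives
        result = process(json.loads(raw))
        r.rpush('scrape:results', json.dumps(result))

if __name__ == '__main__':
    run()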
Minimal production checklist
- Automated OS & firmware updates (reboot windows scheduled).
- Automated onboarding script for new Pi+HAT units (drivers, SDK, model sync).
- Secure key management for any external APIs used (don’t store in code).
- Logging pipeline with structured logs and sample payload retention for debugging (mask PII).
- Legal review for scraping targets, especially for CAPTCHAs and login-required content.
Case study (brief)
We migrated a small price-monitoring fleet from cloud inference to Pi 5 nodes with AI HAT+ 2s in Q4 2025. By offloading OCR and field extraction to the HATs, the team cut per-request inference cost by ~70% and reduced median processing latency from 1.2s (cloud) to 0.15s local. The system scaled horizontally and kept scraped data on-prem for compliance — a key win for GDPR-conscious clients.
Security, compliance & ethical guardrails
Implement role-based access to scraped data, rotate keys and ensure data minimisation. When working with PII, run data classification on the HAT and redact before storage. Keep an audit trail for model inferences and decisions when you use ML to automate actions (e.g., submitting forms).
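As a starting point for redaction before storage, a deliberately simple Python sketch — in production you would extend the patterns or replace them with an on-HAT classifier:
# redact.py — mask obvious PII in scraped text before it is persisted
import re

PATTERNS = {
    'email': re.compile(r'[\w.+-]+@[\w-]+\.[\w.-]+'),
    'phone': re.compile(r'\+?\d[\d\s().-]{7,}\d'),
}

def redact(text: str) -> str:
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f'[{label} redacted]', text)
    return text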
Troubleshooting tips
- If Chromium won’t launch, check for missing shared libraries and confirm the Chromium binary's architecture matches the OS (ARM64 vs armhf) — quick checks follow this list.
- If inference is slow, verify that the HAT is using the intended model and check CPU/USB bus saturation; try batching.
- When captchas fail, save failure cases and run manual labelling to improve training or route to a stronger model.
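Quick checks for the launch failures above:
file /usr/bin/chromium                     # should report an ARM aarch64 ELF binary
ldd /usr/bin/chromium | grep "not found"   # lists any missing shared libraries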
Next steps & actionable checklist
- Follow vendor setup for AI HAT+ 2 and test the example SDK sample.
- Deploy a single Pi 5 worker running Puppeteer and integrate with your HAT; measure OCR RTT vs cloud.
- Extend to Playwright Python worker for structured extraction and add caching.
- Set up a small-scale queue (Redis) to manage tasks and scale workers horizontally.
- Run legal review on scraping targets and create an ethics checklist for solving CAPTCHAs and logging sensitive data.
Final thoughts — what to expect in the near future
By 2026 the ecosystem for local AI accelerators is maturing. Expect better pre-trained, small-footprint models for extraction and OCR specifically compiled for devices like the AI HAT+ 2. For scraping teams, the smart move is hybrid: keep interactive browsing on cheap ARM nodes and push repeated, deterministic model tasks to on-device accelerators. That gives you speed, privacy and a predictable cost profile.
Call-to-action
Ready to prototype? Start with one Pi 5 + AI HAT+ 2 and the example scripts above. For a production-ready rollout — orchestration, model CI, legal review and performance tuning — contact our team at webscraper.uk for a bespoke audit and deployment plan.
Related Reading
- The Evolution of Cloud Cost Optimization in 2026
- Advanced Strategy: Observability for Workflow Microservices
- Advanced Guide: Integrating On-Device Voice into Web Interfaces
- Edge-First Laptops for Creators in 2026
- Neutralize Household Odors: Essential Oil Blends to Use with Cleaning Tech
- Pair Trade Idea: Long Tech Leaders, Short Big Banks After Earnings Disappointments
- Classroom Debate: Can Small Social Networks Capitalize on Big-Platform Crises?
- Sound Science: How Venue Acoustics Shape Opera (and Why That Matters for Science Presentations)
- A Practical Zero-Waste Vegan Dinner Guide for 2026 (Tools, Menus, and Hosting Tips)