Scrapify Blog

Engineering & insights

Deep dives, tutorials, and engineering notes from the team building Scrapify.

Engineering · ★ Featured

How We Eliminated 504 Timeouts with Async Job Queuing

The classic serverless scraping problem: your function times out before the headless browser finishes. Here's the architecture that fixed it completely.

Umer Javed
April 10, 2026 · 8 min read
AI

LLM Schema Extraction: From Messy HTML to Clean JSON

We use Groq's LLaMA models to parse raw scraped HTML into typed JSON objects. This post explains the prompt engineering and schema design behind it.

Anees
Mar 28 · 6 min read
Tutorial

Scraping JavaScript SPAs with Playwright in 2026

React, Vue, and Angular apps make scraping harder. We walk through every technique we use — waitForSelector, networkidle, and lazy-load scrolling.

Umer Javed
Mar 15 · 10 min read
Guides

Building a Competitor Price Tracker in 30 Minutes

A step-by-step walkthrough of setting up a weekly price extraction job, storing results in the dashboard, and querying the data with Chat.

Anees
Mar 2 · 5 min read
Opinion

Ethical Web Scraping: Our Approach to Responsible Data Collection

Rate limiting, robots.txt compliance, and data minimisation aren't optional extras — they're core to how Scrapify works by default.

Umer Javed
Feb 18 · 7 min read
AI

RAG on Scraped Data: Semantic Search That Actually Works

How we chunk, embed, and store scraped content so users can ask plain English questions and get grounded, accurate answers.

Anees
Feb 5 · 9 min read
