Scrapify Blog

Engineering & insights

Deep dives, tutorials, and engineering notes from the team building Scrapify.

Engineering · ★ Featured

How We Eliminated 504 Timeouts with Async Job Queuing

The classic serverless scraping problem: your function times out before the headless browser finishes. Here's the architecture that fixed it completely.

Umer Javed

April 10, 2026·8 min read

LLM Schema Extraction: From Messy HTML to Clean JSON

We use Llama models running on Groq to parse raw scraped HTML into typed JSON objects. This post explains the prompt engineering and schema design behind it.

Anees

Mar 28·6 min read

Tutorial

Scraping JavaScript SPAs with Playwright in 2026

React, Vue, and Angular apps render their content client-side, which makes scraping harder. We walk through every technique we use: waitForSelector, networkidle, and lazy-load scrolling.

Umer Javed

Mar 15·10 min read

Guides

Building a Competitor Price Tracker in 30 Minutes

A step-by-step walkthrough of setting up a weekly price extraction job, storing results in the dashboard, and querying the data with Chat.

Anees

Mar 2·5 min read

Opinion

Ethical Web Scraping: Our Approach to Responsible Data Collection

Rate limiting, robots.txt compliance, and data minimisation aren't optional extras. They're built into how Scrapify works by default.

Umer Javed

Feb 18·7 min read

RAG on Scraped Data: Semantic Search That Actually Works

How we chunk, embed, and store scraped content so users can ask plain-English questions and get grounded, accurate answers.

Anees

Feb 5·9 min read
