
Recipe Scraper / Q1 2025

48,000 Recipes Scraped at a 99.98 Percent Success Rate.

Async scraping across 15 sources with rate limiting, deduplication, and a Flask UI.

// The Build

Async scraping via aiohttp with bounded concurrency and polite rate limiting. Network failures are retried up to three times with exponential backoff. URLs are deduplicated, progress is persisted in checkpoints, and failed URLs are tracked separately.
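To make the retry and concurrency behavior concrete, here is a minimal sketch using only asyncio. It is not the project's actual code: `fetch_with_retry`, `scrape_all`, the `fetch` callable, and the delay and concurrency values are all hypothetical names and defaults chosen for illustration. It also assumes "three-retry" means one initial attempt plus up to three retries, and it leaves out checkpoint persistence.

```python
import asyncio
import random
from typing import Awaitable, Callable

# Hypothetical type: any coroutine function that takes a URL and returns HTML.
Fetch = Callable[[str], Awaitable[str]]

# Errors treated as transient network failures in this sketch.
TRANSIENT = (OSError, asyncio.TimeoutError)


async def fetch_with_retry(fetch: Fetch, url: str,
                           retries: int = 3, base_delay: float = 0.5) -> str:
    """Try once, then retry up to `retries` times with exponential backoff."""
    for attempt in range(retries + 1):
        try:
            return await fetch(url)
        except TRANSIENT:
            if attempt == retries:
                raise  # out of retries: let the caller record the failure
            # Delay doubles each time (base, 2x, 4x) plus a little jitter.
            jitter = random.uniform(0, base_delay / 10)
            await asyncio.sleep(base_delay * 2 ** attempt + jitter)
    raise RuntimeError("unreachable")


async def scrape_all(fetch: Fetch, urls: list, seen: set,
                     max_concurrency: int = 5, base_delay: float = 0.5):
    """Fetch each new URL once, with at most `max_concurrency` in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)
    results = {}
    failed = []

    async def worker(url: str) -> None:
        async with semaphore:  # bounded concurrency
            try:
                results[url] = await fetch_with_retry(
                    fetch, url, base_delay=base_delay)
                seen.add(url)
            except TRANSIENT:
                failed.append(url)  # tracked separately from successes

    # URL-based deduplication: drop repeats and anything already scraped.
    pending = [u for u in dict.fromkeys(urls) if u not in seen]
    await asyncio.gather(*(worker(u) for u in pending))
    return results, failed
```

With aiohttp, `fetch` would typically wrap a `session.get(url)` call, and `aiohttp.ClientError` would be added to the caught exceptions. Passing the fetcher in as a parameter keeps the retry logic testable without a network.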

Multiple recipe aggregation sources are supported through the recipe-scrapers library. Each recipe is normalized into a unified schema with structured ingredient parsing (for example, '2 cups flour' becomes a quantity, a unit, and a name), automatic search-term generation, and support for fractional quantities.
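As an illustration of the ingredient parsing step, the sketch below splits a line such as '2 cups flour' into quantity, unit, and name, and handles fractions like '1 1/2'. The `Ingredient` type, the `parse_ingredient` function, and the small `UNITS` set are assumptions made for this example, not the project's real schema, and a production parser would need to cover far more cases.

```python
import re
from fractions import Fraction
from typing import NamedTuple, Optional

# Illustrative unit list only; a real one would be much longer.
UNITS = {"cup", "cups", "tbsp", "tsp", "g", "kg", "oz", "lb", "ml", "l"}


class Ingredient(NamedTuple):
    quantity: Optional[Fraction]
    unit: Optional[str]
    name: str


# Leading quantity: mixed number ("1 1/2"), fraction ("1/2"),
# or integer/decimal ("2", "2.5"), followed by the rest of the line.
_QTY = re.compile(r"^\s*(\d+\s+\d+/\d+|\d+/\d+|\d+(?:\.\d+)?)\s*(.*)$")


def parse_quantity(text: str) -> Fraction:
    """Turn '2', '1/2', '2.5', or '1 1/2' into an exact Fraction."""
    return sum((Fraction(part) for part in text.split()), Fraction(0))


def parse_ingredient(line: str) -> Ingredient:
    match = _QTY.match(line)
    if not match:
        # No leading number, e.g. "salt to taste".
        return Ingredient(None, None, line.strip())
    quantity = parse_quantity(match.group(1))
    rest = match.group(2).strip()
    first, _, remainder = rest.partition(" ")
    unit = first.lower().rstrip(".")
    if unit in UNITS and remainder:
        return Ingredient(quantity, unit, remainder.strip())
    # Counted items such as "3 eggs" have a quantity but no unit.
    return Ingredient(quantity, None, rest)
```

Using `Fraction` rather than floats keeps values like 1/3 exact, which matters if quantities are later scaled or summed.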

Flask web UI for managing imports and browsing the collection. The data feeds directly into the Hearthlight meal planning platform.

// The Outcome

48K+ recipes collected from 15 sources at a 99.98% success rate, with zero duplicates.

48K+ recipes scraped
99.98% success rate
15 sources
0 duplicates
