How to run Puppeteer on Node.js without limitations
Scraping the web with Node.js is resource-intensive. It is also nearly impossible to achieve on a Serverless platform such as Vercel. Likewise, using a large framework like Playwright is overkill for such a common and essentially simple idea.
Defer, by providing native Puppeteer support, makes it easy to implement scraping use cases. You can also control your scraping jobs with Defer Console.
The code — Wrap Puppeteer as a Defer function
Below, we show you how to use Defer to implement a popular use case: generating thumbnail images from user-provided links with only a few lines of code.
import { defer } from "@defer/client";
import puppeteer from "puppeteer";
const generateLinkThumbnail = async (url: string) => {
// make sure to pass the `--no-sandbox` option
const browser = await puppeteer.launch({ args: ["--no-sandbox"] });
const page = await browser.newPage();
// pass the following `waitUntil` option to avoid
// unnecessary blocking when loading a page
await page.goto(url, { waitUntil: "networkidle0" });
// perform some actions...
// any file must be store in the local `/workspace/` folder
await page.screenshot({ path: "/workspace/screenshot.jpg" });
// save the screenshot to an external storage (e.g., S3)...
await browser.close();
};
export default defer(generateLinkThumbnail, {
retry: 5, // adding retry in case of flakiness issues
});
This is a standard Defer pattern:
- Wrap the functionality in a function. Here, the function
generateLinkThumbnail()
performs all the scraping and thumbnail logic. - Use the Defer client (
defer
) to call the wrapping function and trigger an execution in the background. This is done in the last line of the above code.
Monitoring Defer and Puppeteer
With Defer Console, you can monitor your scraping function. Puppeteer can run for multiple hours:
Finally, Defer Console notifies you in case of delays or failures. By providing job details, the Console allows you to abort the stuck executions or re-run the failed ones:
Was this page helpful?