JavaScript:
Downloading a web page

How to:

Here’s a quick way to download a page using Node.js with node-fetch:

const fetch = require('node-fetch'); // you might need to install this first!

async function downloadPage(url) {
    try {
        const response = await fetch(url);
        const body = await response.text();
        console.log(body); // Outputs the HTML source of the page
    } catch (error) {
        console.error(error);
    }
}

downloadPage('https://example.com');

Sample output:

<!doctype html>
<html>
<head>
    <title>Example Domain</title>
...
</html>

Deep Dive

Historically, downloading a web page was done with XMLHTTPRequest in the browser or http module in Node.js. However, post-ES6, fetch API became the modern standard due to its easier syntax and promise-based nature.

Alternatives include axios, a popular npm package, which handles requests with a bit more functionality than native fetch. For complex use cases, you might use puppeteer to actually render the page in a headless browser, useful for dealing with JavaScript-rendered content.

When implementing page downloads, pay attention to aspects like respecting robots.txt, handling User-Agent to avoid getting blocked, and managing asynchronous handling carefully to dodge potential pitfalls with server overload or race conditions.

See Also