TypeScript: Downloading a web page
How to:
You can download a web page in TypeScript using Node.js and the node-fetch library (install it first with npm install node-fetch). Here’s how:
import fetch from 'node-fetch';

async function downloadWebPage(url: string): Promise<void> {
  try {
    const response = await fetch(url);
    const body = await response.text();
    console.log(body); // This prints the HTML content to the console
  } catch (error) {
    console.error('Download failed:', error);
  }
}

// Use the function
downloadWebPage('https://example.com');
Sample output (truncated):
<!doctype html>
<html>
<head>
<title>Example Domain</title>
...
</html>
Deep Dive
Historically, web content was downloaded via tools like wget or curl in command-line environments. In modern programming, however, we have libraries such as node-fetch, axios, or request (deprecated but still in use) that provide more functionality and are easier to integrate into our JavaScript/TypeScript applications.
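Since axios comes up as an alternative, here is a minimal sketch of the same download with it, assuming axios has been installed via npm install axios. One design difference worth noting: axios rejects the promise on non-2xx status codes, whereas fetch resolves and leaves the status check to you.

import axios from 'axios';

// A sketch of the same download using axios; assumes `npm install axios`.
// Unlike fetch, axios rejects the promise on non-2xx status codes.
async function downloadWithAxios(url: string): Promise<void> {
  try {
    const response = await axios.get<string>(url, { responseType: 'text' });
    console.log(response.data); // HTML content as a string
  } catch (error) {
    console.error('Download failed:', error);
  }
}

downloadWithAxios('https://example.com');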
When downloading a web page, there’s more than the HTML. CSS, JavaScript, images, and other assets are part of the deal. Usually, just the HTML is grabbed first, and then any additional processing or downloading is dictated by what you need from the page.
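As a sketch of that second step, the hypothetical helper below re-uses node-fetch to grab the HTML and then scans it for image URLs with a naive regex; a real project would use a proper HTML parser such as cheerio instead.

import fetch from 'node-fetch';

// Hypothetical helper: after grabbing the HTML, scan it for <img> sources
// with a naive regex. This only illustrates the "HTML first, assets second"
// flow; an HTML parser handles real-world markup far more reliably.
async function listImageUrls(url: string): Promise<string[]> {
  const response = await fetch(url);
  const html = await response.text();
  const matches = html.matchAll(/<img[^>]+src="([^"]+)"/g);
  return [...matches].map((match) => match[1]);
}

listImageUrls('https://example.com').then((urls) => console.log(urls));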
In terms of implementation, node-fetch is essentially the window.fetch API for Node.js. It returns a promise that resolves to the response of the request, allowing you to read the body as a string (.text()), a parsed JSON object (.json()), or even a Buffer (.buffer(), a node-fetch-specific extension) for binary data.
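A short sketch of those readers, using placeholder URLs; note that each response body can only be consumed once, so every reader below issues its own request.

import fetch from 'node-fetch';

// One response body can only be consumed once, so each reader here
// makes its own request. The URLs are placeholders for illustration.
async function demoBodyReaders(): Promise<void> {
  // As a string of HTML:
  const asText = await (await fetch('https://example.com')).text();
  console.log(asText.length, 'characters');

  // As a parsed JSON object (assumes the endpoint returns JSON):
  const asJson = await (await fetch('https://api.github.com')).json();
  console.log(asJson);

  // As binary data: node-fetch v2 offers .buffer(); .arrayBuffer() works
  // in v2, v3, and the fetch built into Node 18+.
  const asBinary = await (await fetch('https://example.com')).arrayBuffer();
  console.log(asBinary.byteLength, 'bytes');
}

demoBodyReaders().catch(console.error);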
Keep in mind that web scraping rights are dictated by a website’s robots.txt file and terms of service. Always verify that you’re allowed to scrape a site and respect rate limits to avoid legal issues or getting your IP banned.
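As a hedged sketch of that courtesy check, the hypothetical function below downloads robots.txt and looks for a blanket Disallow under User-agent: *. A production crawler should use a dedicated parser such as the robots-parser package rather than this naive regex.

import fetch from 'node-fetch';

// Hypothetical courtesy check: download robots.txt and look for a blanket
// "Disallow: /" under "User-agent: *". Only a sketch; real robots.txt
// files have per-agent groups and path rules this regex ignores.
async function hasBlanketDisallow(siteUrl: string): Promise<boolean> {
  const robotsUrl = new URL('/robots.txt', siteUrl).toString();
  const response = await fetch(robotsUrl);
  if (!response.ok) return false; // no robots.txt found: no stated rules
  const rules = await response.text();
  return /User-agent:\s*\*[\s\S]*?^Disallow:\s*\/\s*$/m.test(rules);
}

hasBlanketDisallow('https://example.com').then((blocked) =>
  console.log(blocked ? 'Blanket disallow found' : 'No blanket disallow'),
);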