C++:
Downloading a web page
How to:
Standard C++ has no built-in HTTP client, so a common choice is libcurl, a C library you can call directly from C++. Here’s a basic example:
    #include <curl/curl.h>
    #include <iostream>
    #include <string>

    // libcurl calls this for each chunk of the response body.
    // userp is the std::string registered with CURLOPT_WRITEDATA.
    static size_t writeCallback(void* contents, size_t size, size_t nmemb, void* userp) {
        static_cast<std::string*>(userp)->append(static_cast<char*>(contents), size * nmemb);
        return size * nmemb;
    }

    int main() {
        std::string readBuffer;
        CURL* curl = curl_easy_init();
        if (!curl) {
            std::cerr << "Failed to initialize libcurl" << std::endl;
            return 1;
        }
        curl_easy_setopt(curl, CURLOPT_URL, "http://example.com");
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, writeCallback);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, &readBuffer);
        CURLcode res = curl_easy_perform(curl);  // blocks until the transfer finishes
        curl_easy_cleanup(curl);
        if (res != CURLE_OK) {
            std::cerr << "CURL Error: " << curl_easy_strerror(res) << std::endl;
            return 1;
        }
        std::cout << readBuffer << std::endl;
        return 0;
    }
Sample output:
<!doctype html>
<html>
<head>
<title>Example Domain</title>
...
</head>
<body>
<div>
<h1>Example Domain</h1>
<p>This domain is for use in illustrative examples in documents. You may use this domain ...</p>
</div>
</body>
</html>
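The write callback in the example appends instead of assigning because libcurl may deliver the body in several chunks, calling the callback once per chunk. The sketch below illustrates that behaviour without needing libcurl at all: appendChunk has the same shape as the callback libcurl expects, and simulateTransfer is a hypothetical stand-in (not a libcurl function) that plays the role of the library by delivering a body in two pieces.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Same shape as a CURLOPT_WRITEFUNCTION callback.
// The incoming data is not null-terminated, so the byte count must be used.
static std::size_t appendChunk(char* data, std::size_t size, std::size_t nmemb, void* userp) {
    const std::size_t bytes = size * nmemb;
    auto* buffer = static_cast<std::string*>(userp);
    buffer->append(data, bytes);
    return bytes;  // returning a different count tells libcurl the write failed
}

// Hypothetical helper for illustration only: stands in for libcurl
// by handing the callback a body split into two chunks.
std::string simulateTransfer() {
    std::string buffer;
    char first[] = "<html>";
    char second[] = "</html>";
    appendChunk(first, 1, sizeof(first) - 1, &buffer);
    appendChunk(second, 1, sizeof(second) - 1, &buffer);
    return buffer;
}
```

Calling simulateTransfer() yields the two chunks joined in arrival order, which is what readBuffer holds after curl_easy_perform returns in the real program.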
Deep Dive
The C++ standard library has never included an HTTP client, so programmers have relied on platform-specific APIs or third-party libraries. Among those, libcurl is a widely supported and versatile library for transferring data with URLs. You include its header and link against it when building (typically with -lcurl on Unix-like toolchains), which makes it a go-to tool for C++ programs.
Alternatives to libcurl include Poco’s HTTPClientSession and C++ Rest SDK (aka Casablanca). While libcurl is C-based and about as low-level as you can comfortably go in terms of HTTP requests, Poco and Casablanca offer more idiomatic C++ interfaces which some may prefer.
Under the hood, downloading a web page is an HTTP exchange. The client sends a GET request to the server and, assuming all goes well, the server replies with an HTTP response made up of a status line, headers, and the page content as the body.
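To make that exchange concrete, here is a minimal sketch of the text involved, using only the standard library. It is not how libcurl is implemented. The function names buildGetRequest, parseStatusCode, and extractBody are hypothetical, and the sketch assumes a simple HTTP/1.1 response with CRLF line endings and no chunked transfer encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Builds the text of a minimal HTTP/1.1 GET request (hypothetical helper).
std::string buildGetRequest(const std::string& host, const std::string& path) {
    return "GET " + path + " HTTP/1.1\r\n"
           "Host: " + host + "\r\n"
           "Connection: close\r\n"
           "\r\n";
}

// Reads the status code from a status line such as "HTTP/1.1 200 OK".
// Returns -1 if the response does not start with that shape.
int parseStatusCode(const std::string& response) {
    std::istringstream in(response);
    std::string version;
    int code = 0;
    if (!(in >> version >> code) || version.rfind("HTTP/", 0) != 0) {
        return -1;
    }
    return code;
}

// Returns everything after the blank line that separates headers from body,
// or an empty string if no separator is found.
std::string extractBody(const std::string& response) {
    const std::size_t pos = response.find("\r\n\r\n");
    return pos == std::string::npos ? std::string() : response.substr(pos + 4);
}
```

For example, buildGetRequest("example.com", "/") produces the request a client would send for the page above, and parseStatusCode applied to a response beginning "HTTP/1.1 200 OK" gives 200. A library such as libcurl handles all of this, plus redirects, TLS, and encodings, so you rarely write it by hand.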