Scraper API provides a proxy service designed for web scraping. With over 20 million residential IPs across 12 countries, plus software that handles JavaScript rendering and solves CAPTCHAs, you can quickly complete large scraping jobs without ever having to worry about being blocked by any servers.
Implementation is extremely simple, and they offer unlimited bandwidth. Proxies are automatically rotated, but users can choose to maintain sessions if required. All you need to do is call the API with the URL that you want to scrape, and it will return the raw HTML. With Scraper API, you just focus on parsing the data, and they’ll handle the rest.
According to the company, they handle over 5 billion API requests per month for more than 1,500 businesses and developers around the world.
One of the most frustrating parts of automated web scraping is constantly dealing with IP blocks and CAPTCHAs. Scraper API handles this beautifully. You can customize request headers, request type, IP geolocation and more. They automatically prune slow proxies from their pools periodically, and guarantee unlimited bandwidth with speeds up to 100Mb/s, perfect for writing speedy web crawlers.
Getting started:
When you sign up for Scraper API you are given an access key. All you need to do is call the API with your key and the URL that you want to scrape, and you will receive the raw HTML of the page as a result. It’s as simple as:
curl "https://api.scraperapi.com?api_key=XYZ&url=https://httpbin.org/ip"
On the back end, when Scraper API receives your request, their service accesses the URL via one of their proxy servers, gets the data, and then sends it back to you.
Scraper API exposes a single API endpoint: simply send a GET request to https://api.scraperapi.com with two query-string parameters, api_key (your API key) and url (the URL you would like to scrape).
/* Node.js */
// remember to install the library: npm install scraperapi-sdk
const scraperapiClient = require("scraperapi-sdk")("XYZ");

(async () => {
  const response = await scraperapiClient.get("https://httpbin.org/ip");
  console.log(response);
})();
/* Java */
// remember to install the library: https://search.maven.org/artifact/com.scraperapi/sdk/1.0
import com.scraperapi.ScraperApiClient;

ScraperApiClient client = new ScraperApiClient("XYZ");
client.get("https://httpbin.org/ip")
    .result();
<html>
<head> </head>
<body>
<pre style="word-wrap: break-word; white-space: pre-wrap;">
{"origin":"176.12.80.34"}
</pre>
</body>
</html>
To ensure your requests come from the United States, use the country_code= parameter (e.g. country_code=us):
curl "https://api.scraperapi.com/?api_key=XYZ&url=https://httpbin.org/ip&country_code=us"
Some advanced users will want to issue POST/PUT requests in order to scrape forms and API endpoints directly.
# Replace POST with PUT to send a PUT request instead
curl -d 'foo=bar' \
-X POST \
"https://api.scraperapi.com/?api_key=XYZ&url=https://httpbin.org/anything"
# For multipart form data (curl's -F flag builds the multipart body
# and sets the matching Content-Type header automatically)
curl -F 'foo=bar' \
-X POST \
"https://api.scraperapi.com/?api_key=XYZ&url=https://httpbin.org/anything"
{
"args": {},
"data": "{\"foo\":\"bar\"}",
"files": {},
"form": {},
"headers": {
"Accept": "application/json",
"Accept-Encoding": "gzip, deflate",
"Content-Length": "13",
"Content-Type": "application/json; charset=utf-8",
"Host": "httpbin.org",
"Upgrade-Insecure-Requests": "1",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko"
},
"json": {
"foo": "bar"
},
"method": "POST",
"origin": "191.101.82.154, 191.101.82.154",
"url": "https://httpbin.org/anything"
}
When you log into your Scraper API account, you will be presented with a dashboard that will show you how many requests you have used, how many requests you have left for the month, and the number of failed requests (which do not count towards your request limit).
If you would like to monitor your account usage and limits programmatically (how many concurrent requests you’re using, how many requests you’ve made, etc.) you may use the /account endpoint, which returns JSON.
curl "https://api.scraperapi.com/account?api_key=XYZ"
{
"concurrentRequests": 553,
"requestCount": 6655888,
"failedRequestCount": 1118,
"requestLimit": 10000000,
"concurrencyLimit": 1000
}
Scraper API is the best proxy API service for web scraping on the market today. It is easy to integrate and able to accommodate scraping projects of all levels and sizes. If you have any serious scraping projects, then Scraper API is definitely worth looking into. Even if you're a casual user, you may benefit from the free plan.