What is Scrapoxy?

Scrapoxy hides your scraper behind a cloud.

It starts a pool of proxies and sends your requests through them.

Now, you can crawl without thinking about blacklisting!

It is written in JavaScript (ES6) with Node.js and AngularJS, and it is open source!

How does Scrapoxy work?

  1. When Scrapoxy starts, it creates and manages a pool of proxies.
  2. Your scraper uses Scrapoxy as a normal proxy.
  3. Scrapoxy routes all requests through the pool of proxies.
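
To illustrate step 2, here is a minimal sketch of a Python client that treats Scrapoxy as an ordinary HTTP proxy. The address `http://localhost:8888` and the helper name `build_opener_via_scrapoxy` are assumptions made for this example; use whatever host and port your own Scrapoxy instance listens on.

```python
import urllib.request

# Assumption: Scrapoxy's proxy endpoint is reachable at this address.
SCRAPOXY_PROXY = "http://localhost:8888"


def build_opener_via_scrapoxy(proxy_url=SCRAPOXY_PROXY):
    """Return an opener that sends plain HTTP requests through the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


opener = build_opener_via_scrapoxy()
# With Scrapoxy running, this request would leave through one of its proxies:
# html = opener.open("http://example.com/").read()
```

The scraper only needs the proxy address; which instance in the pool actually carries each request is decided by Scrapoxy.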

What does Scrapoxy do?

  • Create your own proxies
  • Use multiple cloud providers (AWS, DigitalOcean, OVH)
  • Rotate IP addresses
  • Impersonate known browsers
  • Exclude blacklisted instances
  • Monitor the requests
  • Detect bottlenecks
  • Optimize the scraping

Why doesn’t Scrapoxy support anti-blacklisting?

Anti-blacklisting is a job for the scraper.

When the scraper detects blacklisting, it asks Scrapoxy to remove the proxy from the proxy pool (through a REST API).
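
The division of work described above could look roughly like the following sketch on the scraper side. Everything specific here is an assumption for illustration and is not taken from this document: the API address, the endpoint path, the JSON payload, the `Authorization` header value, and the status codes treated as blacklisting. Check the Scrapoxy API documentation for the real values.

```python
import json
import urllib.request

# Assumptions: API address, endpoint path and payload shape are hypothetical.
API_URL = "http://localhost:8889/api/instances/stop"

# Assumption: the target site signals blacklisting with these status codes.
BLACKLIST_STATUSES = {403, 429, 503}


def is_blacklisted(status_code):
    """Decide on the scraper side whether a response looks like blacklisting."""
    return status_code in BLACKLIST_STATUSES


def build_remove_request(instance_name, token, api_url=API_URL):
    """Build (but do not send) the request asking to remove one proxy."""
    body = json.dumps({"name": instance_name}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "Authorization": token},
    )


# Usage: send the request only when a response looks blacklisted.
# if is_blacklisted(response_status):
#     urllib.request.urlopen(build_remove_request("proxy-1", "my-token"))
```

Keeping the detection rule in the scraper means it can be tuned per target site, while Scrapoxy only has to act on the removal request.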

What is the best scraper framework to use with Scrapoxy?

You could use the Scrapy framework (Python).
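
As a rough sketch of how a Scrapy project might send its traffic through Scrapoxy, a small downloader middleware can set the `proxy` key that Scrapy reads from `request.meta`. The class name `ScrapoxyProxyMiddleware` and the proxy address are hypothetical; this middleware is not something shipped with Scrapoxy.

```python
# Assumption: Scrapoxy's proxy endpoint is reachable at this address.
SCRAPOXY_PROXY = "http://localhost:8888"


class ScrapoxyProxyMiddleware:
    """Hypothetical Scrapy downloader middleware for this example."""

    def process_request(self, request, spider):
        # Scrapy downloads a request through the proxy named in
        # request.meta["proxy"]; keep any value the spider already set.
        request.meta.setdefault("proxy", SCRAPOXY_PROXY)
        return None  # let Scrapy continue handling the request
```

To try it, the class would be registered under `DOWNLOADER_MIDDLEWARES` in the project's `settings.py`.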



You can open an issue on this repository for any feedback (bug, question, request, pull request, etc.).


See the License.

And don’t forget to be POLITE when you write your scrapers!