What is Scrapoxy?
Scrapoxy hides your scraper behind a cloud.
It starts a pool of proxies and sends your requests through them.
Now, you can crawl without thinking about blacklisting!
How does Scrapoxy work?
- When Scrapoxy starts, it creates and manages a pool of proxies.
- Your scraper uses Scrapoxy as a normal proxy.
- Scrapoxy routes every request through that pool of proxies.
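Because the scraper treats Scrapoxy as a normal proxy, the client-side setup can be sketched with the Python standard library alone. This is a minimal illustration, not an official client: the address `http://localhost:8888` and the helper names `build_opener_via_scrapoxy` and `fetch` are assumptions made for the example, so check your own Scrapoxy configuration for the real host and port.

```python
import urllib.request

# Assumption: Scrapoxy's proxy endpoint listens here; adjust to your setup.
SCRAPOXY_PROXY = "http://localhost:8888"


def build_opener_via_scrapoxy(
    proxy_url: str = SCRAPOXY_PROXY,
) -> urllib.request.OpenerDirector:
    """Return an opener that sends plain HTTP traffic through the given proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


def fetch(url: str, opener: urllib.request.OpenerDirector) -> bytes:
    """Fetch a URL through the opener; the proxy decides which instance is used."""
    with opener.open(url, timeout=30) as response:
        return response.read()
```

A scraper would call `build_opener_via_scrapoxy()` once and reuse the opener for every `fetch`. The scraper only knows one proxy address; choosing an instance from the pool stays on the Scrapoxy side.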
What does Scrapoxy do?
- Create your own proxies
- Use multiple cloud providers (AWS, DigitalOcean, OVH)
- Rotate IP addresses
- Impersonate known browsers
- Exclude blacklisted instances
- Monitor the requests
- Detect bottlenecks
- Optimize the scraping
Why doesn’t Scrapoxy support anti-blacklisting?
Anti-blacklisting is a job for the scraper.
When the scraper detects blacklisting, it asks Scrapoxy, through a REST API, to remove the offending proxy from the pool.
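The scraper-side half of this flow might look like the sketch below. Everything specific in it is an assumption for illustration and should be checked against the Scrapoxy documentation for your version: the API base URL `http://localhost:8889/api`, the `/instances/stop` path, the JSON body with a `name` field, the base64-encoded password in the `Authorization` header, and the `x-cache-proxyname` response header that identifies the proxy. The status codes treated as blacklisting are site-specific guesses, and the helper names are hypothetical.

```python
import base64
import json
import urllib.request
from typing import Mapping, Optional

# Assumptions (verify against the Scrapoxy documentation for your version):
COMMANDER_URL = "http://localhost:8889/api"  # assumed REST API base URL
PROXY_NAME_HEADER = "x-cache-proxyname"      # assumed header naming the proxy used
BLACKLIST_STATUSES = {403, 429, 503}         # site-specific; tune for your target


def is_blacklisted(status_code: int) -> bool:
    """Decide whether a response status looks like blacklisting."""
    return status_code in BLACKLIST_STATUSES


def build_stop_request(instance_name: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the request asking for a proxy to be removed."""
    body = json.dumps({"name": instance_name}).encode("utf-8")
    token = base64.b64encode(password.encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{COMMANDER_URL}/instances/stop",
        data=body,
        method="POST",
        headers={"Authorization": token, "Content-Type": "application/json"},
    )


def removal_request_for(
    status_code: int, headers: Mapping[str, str], password: str
) -> Optional[urllib.request.Request]:
    """Return a removal request if the response looks blacklisted, else None."""
    if not is_blacklisted(status_code):
        return None
    instance_name = headers.get(PROXY_NAME_HEADER)
    if instance_name is None:
        return None  # cannot tell which proxy served the response
    return build_stop_request(instance_name, password)
```

The scraper would pass each response's status and headers to `removal_request_for` and, when it returns a request, send it with `urllib.request.urlopen`. Keeping detection in `is_blacklisted` reflects the division of work described above: only the scraper knows what blacklisting looks like on the target site.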
The documentation also includes tutorials.
You can open an issue on this repository for any feedback (bug, question, request, pull request, etc.).