What is Scrapoxy?
Scrapoxy hides your scraper behind a cloud.
It starts a pool of proxies and sends your requests through them.
Now, you can crawl without thinking about blacklisting!
How does Scrapoxy work?
- When Scrapoxy starts, it creates and manages a pool of proxies.
- Your scraper uses Scrapoxy as a normal proxy.
- Scrapoxy routes all requests through a pool of proxies.
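From the scraper's side, using Scrapoxy as a normal proxy can be sketched as follows. This is a minimal illustration, not the project's reference setup: the address `http://localhost:8888` and the helper name `build_opener_via_scrapoxy` are assumptions, so substitute the host and port your own Scrapoxy instance listens on.

```python
import urllib.request

# Assumption: Scrapoxy is reachable at this address; adjust to your setup.
SCRAPOXY_PROXY = "http://localhost:8888"


def build_opener_via_scrapoxy(
    proxy_url: str = SCRAPOXY_PROXY,
) -> urllib.request.OpenerDirector:
    """Return an opener that sends plain HTTP requests through the given proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


opener = build_opener_via_scrapoxy()
# Calling opener.open("http://example.com/") would now go through the proxy,
# which in turn picks one of the instances in its pool.
```

The scraper only knows one proxy address; which instance of the pool actually carries a given request is decided by Scrapoxy.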
What does Scrapoxy do?
- Create your own proxies
- Use multiple cloud providers (AWS, DigitalOcean, OVH)
- Rotate IP addresses
- Impersonate known browsers
- Exclude blacklisted instances
- Monitor the requests
- Detect bottlenecks
- Optimize the scraping
Why doesn’t Scrapoxy support anti-blacklisting?
Anti-blacklisting is a job for the scraper.
When the scraper detects that a proxy has been blacklisted, it asks Scrapoxy, through a REST API, to remove that proxy from the pool.
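The division of labour described here might look like the sketch below on the scraper side. Everything specific in it is an assumption made for illustration: the API base URL, the `/instances/stop` path, the `{"name": ...}` payload, the base64-encoded password header, and the detection heuristic are placeholders, so check the API documentation of your Scrapoxy version before relying on any of them. The code only builds the request; sending it is left to the caller.

```python
import base64
import json
import urllib.request

# Assumptions (not stated in this FAQ): API base URL, endpoint path,
# payload shape, and auth scheme. Verify them against your Scrapoxy version.
COMMANDER_URL = "http://localhost:8889/api"


def looks_blacklisted(status_code: int, body: str) -> bool:
    """Hypothetical heuristic: a 403/429 status or a captcha page means blacklisted."""
    return status_code in (403, 429) or "captcha" in body.lower()


def build_remove_request(instance_name: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a request asking Scrapoxy to drop one proxy instance."""
    payload = json.dumps({"name": instance_name}).encode("utf-8")
    token = base64.b64encode(password.encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{COMMANDER_URL}/instances/stop",
        data=payload,
        headers={"Authorization": token, "Content-Type": "application/json"},
        method="POST",
    )
```

A scraper would call `looks_blacklisted` on each response and, when it returns `True`, pass the result of `build_remove_request` to `urllib.request.urlopen`. The detection rule stays in the scraper because only the scraper knows what a blocked page looks like for the target site.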
What is the best scraper framework to use with Scrapoxy?
Does Scrapoxy have a SaaS mode or a support plan?
Scrapoxy is an open source tool. The source code is actively maintained. You are welcome to open an issue to request features or report bugs.
The documentation is complemented by tutorials.
You can open an issue on this repository for any feedback (bug, question, request, pull request, etc.).