Enterprise-Ready
Before choosing a scraping solution for your enterprise,
compare the alternatives against screen-scraper:
Scales up gracefully
- Hundreds of scraping agents can be run concurrently on a single machine, limited only by hardware and bandwidth.
- Distribute scraping agents across multiple computers.
- Monitor and control scraping from a central interface.
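
To make the concurrency claim concrete, here is a minimal sketch of running many agents at once under a cap on simultaneous workers. It does not use screen-scraper's own API: run_agent, run_agents, and the max_workers value are hypothetical names chosen for illustration, and the agent body is a stand-in for real fetching and parsing.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_agent(agent_id: int) -> dict:
    """Hypothetical stand-in for one scraping agent.

    A real agent would fetch and parse pages; this one fabricates
    three placeholder records so the sketch runs without a network.
    """
    records = [f"agent-{agent_id}-record-{n}" for n in range(3)]
    return {"agent_id": agent_id, "records": records}


def run_agents(agent_count: int, max_workers: int) -> list[dict]:
    """Run agent_count agents, with at most max_workers active at a time."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_agent, i) for i in range(agent_count)]
        # Collect results as agents finish, in whatever order they complete.
        for future in as_completed(futures):
            results.append(future.result())
    # Sort so callers get a stable order regardless of completion timing.
    return sorted(results, key=lambda r: r["agent_id"])


if __name__ == "__main__":
    summary = run_agents(agent_count=200, max_workers=32)
    print(len(summary), "agents finished")
```

In practice the worker cap would be tuned to the hardware and bandwidth available, which is the limit the list above refers to.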
Rock solid
- Very mature: under continuous development since 2002, so we've had plenty of time to work out the bugs.
- Can handle just about any web site you throw at it, including sites that use AJAX.
- Scraping processes can run uninterrupted for weeks or months.
Scraping in the cloud
- Scraping agents can be spawned on-demand, and scaled up and down arbitrarily.
- Distributed scraping agents can work together and be managed centrally.
- Already integrated with Amazon Web Services (AWS).
Integrates with your existing system
- Invoke screen-scraper from .NET, Java, Ruby, PHP, or just about any other language.
- Runs on Windows, Linux, Mac OS X, and any other OS that supports Java.
- Can call out to external applications using a variety of methods and APIs.
Flexibility
- Internal scripting engine allows for custom logic in scraping routines.
- Extracted data can be sent on the fly to databases, files, or other external applications.
- Can interact in real-time with external systems, such as databases.
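
As an illustration of sending extracted data to a database on the fly, the sketch below writes each record as soon as it is produced instead of buffering the whole result set. It is a generic example, not screen-scraper's scripting engine: extract_records, stream_to_database, and the products table are assumptions, and SQLite stands in for whatever database you use.

```python
import sqlite3
from typing import Iterable, Iterator


def extract_records() -> Iterator[tuple[str, float]]:
    """Hypothetical extractor yielding (name, price) as each record is scraped."""
    yield ("widget", 9.99)
    yield ("gadget", 24.50)
    yield ("gizmo", 3.25)


def stream_to_database(
    conn: sqlite3.Connection, records: Iterable[tuple[str, float]]
) -> int:
    """Insert each record as it arrives and return the number of rows written."""
    conn.execute("CREATE TABLE IF NOT EXISTS products (name TEXT, price REAL)")
    written = 0
    for name, price in records:
        conn.execute(
            "INSERT INTO products (name, price) VALUES (?, ?)", (name, price)
        )
        # Commit per record so partial results survive an interrupted run.
        conn.commit()
        written += 1
    return written


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(stream_to_database(connection, extract_records()), "rows written")
```

Committing per record trades some throughput for durability; a long-running scrape might batch commits instead.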
Speed
- Very lightweight: designed from the ground up to minimize use of system resources and optimize bandwidth usage.
- Grabs only pages it needs from web sites. Doesn't download JavaScript, CSS, or images unless told otherwise.
- Doesn't process JavaScript or rely on an underlying web browser engine.
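
The idea of fetching only the pages a scrape needs can be sketched as a simple URL filter. This is one possible approach, not a description of how screen-scraper decides what to request: should_fetch, pages_to_fetch, and the extension list are hypothetical.

```python
import posixpath
from urllib.parse import urlparse

# Assumed list of asset types to skip; adjust to suit the target site.
SKIPPED_EXTENSIONS = {".js", ".css", ".png", ".jpg", ".jpeg", ".gif", ".svg", ".ico"}


def should_fetch(url: str, fetch_assets: bool = False) -> bool:
    """Return True if the URL should be requested.

    Assets are skipped unless fetch_assets is set, mirroring the
    "unless told otherwise" behavior described above.
    """
    if fetch_assets:
        return True
    # Look only at the path so query strings such as ?v=3 are ignored.
    path = urlparse(url).path
    extension = posixpath.splitext(path)[1].lower()
    return extension not in SKIPPED_EXTENSIONS


def pages_to_fetch(urls: list[str], fetch_assets: bool = False) -> list[str]:
    """Filter a list of discovered URLs down to the ones worth requesting."""
    return [url for url in urls if should_fetch(url, fetch_assets)]
```

For example, pages_to_fetch would keep a product listing URL and drop the stylesheet and images referenced alongside it, which is where most of the bandwidth savings come from.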
Monitoring and reporting
- Monitor the scraping process via a built-in web interface.
- Programmatically monitor scrapes via a variety of APIs that expose metrics such as number of records scraped and memory usage.
- Easily view and generate reports on error conditions, scraping results, etc.