Version 5.0.30a of screen-scraper Released

This release contains just one feature enhancement: when creating a new extractor pattern token, screen-scraper will now attempt to guess the regular expression that should be used. Based on initial testing, screen-scraper seems to guess correctly most of the time. This is a small feature, but will hopefully save a fair amount …

Enterprise-Scale Screen-Scraping

One of the main aspects that I think differentiates screen-scraper from many other solutions is its ability to handle large-scale scraping needs. Additionally, it was designed from the ground up to integrate with other systems, so it generally fits nicely into almost any existing setup. If you’re doing a simple one-off data extraction project, screen-scraper …

Version 5.0.28a of screen-scraper Released

Changes: Based on feedback, the screen-scraper workbench and server can now run simultaneously once the “AllowMultipleSimultaneousInstances” property is added to the screen-scraper.properties file. Fixed a bug where screen-scraper would freeze when very large requests were included in proxy sessions and scrapeable files. Fixed a bug where space characters in URLs would generate an error.
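
For readers who hit the space-in-URL problem in their own tooling, here is a minimal sketch of the general technique of percent-encoding spaces before a request is made. It is not screen-scraper's actual fix, which these notes don't describe; the function name `encode_spaces` and the example URL are hypothetical.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_spaces(url: str) -> str:
    """Percent-encode spaces (and other unsafe characters) in a URL's path and query.

    Hypothetical helper for illustration; not part of screen-scraper.
    """
    parts = urlsplit(url)
    # Leave "/" and existing "%xx" escapes in the path untouched.
    path = quote(parts.path, safe="/%")
    # Leave the usual query delimiters and existing escapes untouched.
    query = quote(parts.query, safe="=&%+")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


if __name__ == "__main__":
    print(encode_spaces("http://example.com/my file.html?q=a b"))
```

Because "%" is treated as safe, a URL that is already encoded passes through unchanged, so the helper can be applied more than once without double-encoding.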

To Anonymize or to Not Anonymize

Lately we find an increasing need to anonymize our scraping sessions. So, as necessity is the mother of invention, we have created and adopted a handful of different approaches to keep our scrapes up and running. Keep in mind, the only way to block a web crawler is for a website’s server to refuse connections …

Data Cravings

Yesterday ReadWriteWeb published an article entitled “Overwhelmed Executives Still Crave Big Data, Says Survey”. The gist of it is that data is vital to making business decisions, and many managers feel that they don’t have enough of it. This got me thinking about how screen-scraping plays into all of this. At a basic level, …

Version 5.0.27a of screen-scraper Released

Just a few changes in this one: Fixed a scrolling bug related to displaying script instances associated with extractor patterns. Removed a log message that was appearing each time a redirect occurred. screen-scraper will now display a “start page” when the workbench initially launches. The start page will hopefully be especially helpful for newer users. …