Example Scraping Sessions
One of our internal projects here at screen-scraper.com is a scrapbooking meta-search site called ScrapbookFinds. We use our very own screen-scraper software to extract scrapbooking product information from various retailer web sites and insert all of the data into a database, so that it can be searched in a single location.
In order to give other users example scraping sessions to refer to when learning screen-scraper, we've decided to provide a peek behind the scenes into how we go about scraping the products. There are a few things to be aware of when using these examples, however:
  • The scraping sessions will not be completely functional, as they require a database in order to save the information. As such, when you import and run them you'll notice error messages showing up in your log. These should all be benign (i.e., they're supposed to be there; don't worry about them). The point is to give you examples of scraping sessions so that you can examine how we handle traversing pages, structuring extractor patterns, etc.
  • These scraping sessions run against live sites, and we have no control over how those sites change. We update the sessions on occasion when their target sites change, and we'll do our best to keep the examples up-to-date, but you may find that one or two of them don't work because a target site has changed.
  • A few jar files will be copied into your "lib\ext" folder. These jar files still won't make the scraping sessions completely functional (again, they'd need a database for that), but they do keep the error messages you see from being quite so large.
With that big disclaimer, and without further ado...
Download the examples here (1.5M)
To use them, simply unzip the file and copy the contents into your screen-scraper install folder. The "import" folder contains five scraping sessions, and the "lib\ext" folder contains the jar files the scripts refer to. Once you launch screen-scraper, the scraping sessions will be imported automatically. After that you can run any of the scraping sessions to see how they work. You'll want to pay special attention to the logs that get generated, which will help you follow the program flow of each.
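If you'd rather script the unzip-and-copy step than do it by hand, something like the following Python sketch should work. It is only an illustration: the function name `install_examples` is made up for this example, the archive and install paths are whatever they happen to be on your machine, and it assumes the zip file has the "import" and "lib\ext" folders at its top level.

```python
import zipfile
from pathlib import Path


def install_examples(archive: Path, install_dir: Path) -> list[str]:
    """Extract the examples archive on top of a screen-scraper install folder.

    Assumption: the archive holds "import" and "lib/ext" at its top level,
    so extracting it into install_dir merges them with the folders already
    there. Files with the same name are overwritten.
    """
    if not install_dir.is_dir():
        raise FileNotFoundError(f"install folder not found: {install_dir}")

    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()       # entries inside the archive
        zf.extractall(install_dir)  # creates import/ and lib/ext/ as needed

    return names


# Hypothetical usage; substitute your own paths:
# install_examples(Path("examples.zip"), Path("/path/to/screen-scraper"))
```

Launch screen-scraper only after the files are in place, since that is when the sessions in the "import" folder get picked up.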
If you have any questions when looking through the sessions, please post to our support forum so that others can learn from your questions as well.
© 2002-2008 copyright e-kiwi, LLC