Creating Scraping Sessions to Work With Oracle Secure Enterprise Search


Overview

Using screen-scraper™ with Oracle SES requires very little change to the way scraping sessions are normally created. Before proceeding, if you haven't done so already, please go through our first three tutorials to get familiar with the way scraping sessions work. Once you've done that, the rest of this document will make a lot more sense.

Sending Scraped Data to Oracle SES

As records are being scraped by screen-scraper™, they can be passed to Oracle SES for indexing. This is done within screen-scraper™ by invoking a script that calls the session.sendDataToClient method. This allows Oracle SES to receive and index the content in real time.

More specifically, the object sent via that method call should be a DataRecord that contains the following named elements: SES_TITLE, SES_DESCRIPTION, and SES_URL. Remember that a DataRecord is analogous to a row in a spreadsheet or a record in a database table. Oracle SES uses these specifically named fields when indexing the information. For example, your screen-scraper™ script might look like this:
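What follows is a minimal sketch of that mapping, written as plain Java so it can be run outside screen-scraper™. Several names in it are assumptions and do not come from this document: the scraped field names TITLE, DESCRIPTION, and URL, the key "SES_RECORD", the ClientSink interface (a hypothetical stand-in for the session object), and the use of a Map in place of a DataRecord. Only the three SES_ field names and the sendDataToClient method name are taken from the text above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SesRecordSketch {

    /** Hypothetical stand-in for the one session method this sketch uses. */
    interface ClientSink {
        void sendDataToClient(String key, Object value);
    }

    /**
     * Copies a scraped record into the three fields Oracle SES indexes.
     * TITLE, DESCRIPTION and URL are assumed names for the scraped fields;
     * substitute whatever your extractor pattern actually produces.
     */
    static Map<String, Object> buildSesRecord(Map<String, Object> scraped) {
        Map<String, Object> ses = new LinkedHashMap<>();
        // A missing field becomes an empty string rather than null.
        ses.put("SES_TITLE", scraped.getOrDefault("TITLE", ""));
        ses.put("SES_DESCRIPTION", scraped.getOrDefault("DESCRIPTION", ""));
        ses.put("SES_URL", scraped.getOrDefault("URL", ""));
        return ses;
    }

    /** Builds the SES record and hands it to the client under an assumed key. */
    static void send(ClientSink session, Map<String, Object> scraped) {
        session.sendDataToClient("SES_RECORD", buildSesRecord(scraped));
    }

    public static void main(String[] args) {
        Map<String, Object> scraped = new LinkedHashMap<>();
        scraped.put("TITLE", "Example product");
        scraped.put("DESCRIPTION", "An example description");
        scraped.put("URL", "http://www.example.com/product/1");

        // Print instead of sending, since no real session exists here.
        send((key, value) -> System.out.println(key + " -> " + value), scraped);
    }
}
```

Inside screen-scraper™ the same idea would presumably shrink to a few lines: create a new DataRecord, put the three SES_ fields into it from the current dataRecord, and pass it to session.sendDataToClient. Check the screen-scraper™ API documentation for the exact signature and for the key the Oracle SES connector expects.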

If you went through our third tutorial you might remember that you created a script to write the scraped data to a file. That "Write data to a file" script gets invoked each time a record is extracted from our shopping web site. In order to send that same data to Oracle SES, you could create a new script using the code found above and also invoke it "After each pattern application", associated with the same extractor pattern as the "Write data to a file" script. This way, each time a product record is scraped from the shopping web site, it will get sent to Oracle SES for indexing. This is the only change that would need to be made to the "Shopping Site" example scraping session in order to allow Oracle SES to index the data!