|
screen-scraper needs to be running as a server before you invoke it from a Ruby script. If you haven't already read the section on running screen-scraper as a server, please do so now.
A Ruby script interacts with screen-scraper through a Ruby class called "RemoteScrapingSession". To use this class, require the file "remote_scraping_session.rb" (found in the misc/ruby directory of your screen-scraper installation, or available as a separate download) in your Ruby script.
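As a minimal sketch of loading the class from an installation directory: the helper name require_remote_scraping_session and the example path are assumptions made for illustration, not part of screen-scraper. Only the misc/ruby location and the file name come from the text above.

```ruby
# Hypothetical helper (not part of screen-scraper): adds the misc/ruby
# directory of an installation to Ruby's load path and requires the file.
def require_remote_scraping_session(install_dir)
  ruby_dir = File.join(install_dir, "misc", "ruby")
  $LOAD_PATH.unshift(ruby_dir) unless $LOAD_PATH.include?(ruby_dir)
  require "remote_scraping_session"
end

# Example call; the installation path is an assumption, adjust it to yours:
#   require_remote_scraping_session("/opt/screen-scraper")
```

If your script sits next to a copy of remote_scraping_session.rb, a plain require_relative "remote_scraping_session" should work just as well.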
Full documentation on all of the methods found in the RemoteScrapingSession class is given below:
-
initialize( name ). Initializes a RemoteScrapingSession identified by name. If this constructor is called, the default host (localhost) and port (8778) will be used.
-
initialize( name, host, port ). Instantiates a RemoteScrapingSession identified by name that connects to the server at host, listening on port.
-
setVariable( var_name, value ). Sets a session variable using the given name and value.
-
scrape. Causes the session to scrape. This is equivalent to clicking the "Run Scraping Session" button from within screen-scraper on the "General" tab for a scraping session.
-
getVariable( var_name ). Gets the value of a session variable that was set during the course of the scraping session. If the object identified by var_name is a data record, an associative array will be returned. If the object identified by var_name is a data set, a two-dimensional ordinal array of associative arrays will be returned (see our fourth tutorial for an illustration of this). Note that currently only Strings, DataRecords, and DataSets can be accessed by this method.
-
setBufferSize( buffer_size ). Explicitly sets the size of the buffer (in bytes) that will be used when reading data from screen-scraper. The default buffer size is 1024 bytes, so if you're anticipating a large amount of data (such as when receiving a full data set) you'll want to increase this value.
-
resetBufferSize. Resets the size of the buffer back to its default size of 1024 bytes.
-
isError. Indicates whether or not an error has occurred in the scraping process.
-
getErrorMessage. Returns the last error message returned from the server, if one was returned.
-
disconnect. Disconnects from the remote server. This should be called once a scraping session is complete so that system resources can be freed up.
-
getNumDataRecordsInDataSet( data_set_name ). Returns the number of data records found in the data set named by data_set_name.
-
getDataRecordFromDataSet( data_set_name, index ). Returns a single data record (a hash) from the data set named by data_set_name at the given index.
-
setDoLazyScrape( doLazyScrape ). Indicates whether or not a scraping session should be run in a separate thread. By default this value is false. Note that calling this method only has an effect if it is done before calling the scrape method. If this value is set to true, program flow returns immediately after the scrape method is called, but the scraping session is still run by screen-scraper.
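The methods above can be combined into a typical workflow: set variables, scrape, check for errors, read results, and disconnect. The following is a minimal sketch, not an official example. The helper names run_scraping_session and fetch_data_set are hypothetical, as are the session, variable, and data set names in the usage comment. It also assumes that data set indexes start at 0, which the documentation above does not state.

```ruby
# Sketch only: `session` is expected to be a RemoteScrapingSession, or any
# object that responds to the methods documented above.

# Hypothetical helper: sets the given variables, runs the scrape, raises if
# the server reported an error, and always disconnects afterwards.
# Returns the value of the block, if one is given.
def run_scraping_session(session, inputs = {})
  inputs.each { |name, value| session.setVariable(name, value) }
  session.scrape
  raise "Scrape failed: #{session.getErrorMessage}" if session.isError
  yield session if block_given?
ensure
  session.disconnect
end

# Hypothetical helper: reads a data set one record at a time.
# ASSUMPTION: indexes are zero-based. to_i covers the case where the
# record count comes back as a string.
def fetch_data_set(session, data_set_name)
  count = session.getNumDataRecordsInDataSet(data_set_name).to_i
  (0...count).map { |i| session.getDataRecordFromDataSet(data_set_name, i) }
end

# Usage against a running server (all names below are made up):
#   require "remote_scraping_session"
#   session = RemoteScrapingSession.new("Shopping Site")
#   products = run_scraping_session(session, { "SEARCH" => "bug zapper" }) do |s|
#     fetch_data_set(s, "PRODUCTS")
#   end
#   products.each { |record| puts record["TITLE"] }
```

Fetching records individually keeps each response small. If you instead read a whole data set with getVariable, you would likely want to call setBufferSize first, as described above. Placing disconnect in an ensure clause means the connection is released even when the scrape reports an error.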
|