Release notes for screen-scraper
Public Release 4.0 (01.21.08)
- feature: web interface for scheduling and managing scrapes
- feature: added real-time integration with external applications
- feature: automatic anonymization
- feature: scrapeable files and extractor patterns can be copied and pasted
- feature: added "notes" to scraping sessions
- feature: improved cookie compatibility
- feature: added sequence to sub-extractor patterns
- feature: scraping sessions can now be run directly from the command line
- feature: HTML entities can now be automatically converted from scraped data
- feature: cookies can be cleared for a scraping session
- feature: last response for a scrapeable file can now be viewed in a browser
- feature: current time and elapsed time can be output in a script
- feature: greatly improved look 'n feel on Mac OS X
- feature: added new regular expressions
- feature: "update.zip" files will be decompressed and imported
- feature: objects in the tree can be deleted with the "delete" key
- feature: enhanced the "status" bar
- feature: the licensed email address now appears in the "about" screen
- feature: the default file extension for exported objects is now "sss"
- feature: a "start/stop scraping" button was added to the scraping session "log" tab
- feature: HTML can be automatically stripped from extracted data
- feature: screen-scraper can check for updates on startup
- feature: enhanced installers
- bugfix: mappings were not being imported properly from exported scraping sessions
- bugfix: null interpolated session variables were not being properly handled
- bugfix: "deflate" encoding was not being properly handled
- bugfix: in some cases sequence numbers were being duplicated for scrapeable files
- bugfix: in certain cases folders could not be deleted
- bugfix: the proxy server was misidentifying some files as binary
- bugfix: the "last response" tab was blanking out prematurely in some cases
- bugfix: now catching class loader exceptions for jar files compiled with a higher java version
- bugfix: ports weren't being displayed for SSL URLs in the proxy
- bugfix: exceptions thrown in scripts were causing some subsequent scripts not to be executed
- bugfix: various fixes for Windows Vista
- bugfix: mapping sets were not always being deleted properly
- bugfix: multiple command line instances were not being handled properly
- bugfix: drag 'n drop to folders in some cases wasn't working
- bugfix: double-clicking extractor pattern tokens didn't always allow them to be edited
- bugfix: extractor pattern tokens were getting repeated after editing a token
- bugfix: sequence numbers that were too high were causing extractor patterns to disappear
- bugfix: new scripts weren't being sorted properly
- deprecated: the "Run Script" button
- deprecated: automatic joining of data sets
- deprecated: RunnableScrapingSession for everything but enterprise edition
Public Release 3.0 (01.10.07)
- feature: added a "Find" feature to the scraping session log and script panel.
- feature: the scraping session log can now be limited to a specified number of lines.
- feature: the scraping session log can automatically remain scrolled to the end.
- feature: scripts can now be called from other scripts.
- feature: the database now gets backed up automatically.
- feature: screen-scraper can now be registered in a GUI-less environment.
- feature: tab state is now preserved when moving between objects.
- feature: added context menus for editing commands.
- feature: upgraded Mac interface to be like Windows and Linux.
- feature: added a new library used to write out XML from scripts.
- feature: enhanced firewall handling.
- feature: for new installs, the user is now referred to the tutorials.
- feature: screen-scraper now checks for blocked ports on startup.
- feature: added a method to load and save session state between sessions.
- feature: integrated a new HTML renderer.
- feature: objects can now be organized into folders.
- feature: improved "Strip HTML" feature.
- bugfix: fixed an issue related to passing in remote variables containing the ! character.
- bugfix: fixed an issue related to truncated error messages in scripts.
- bugfix: when invoked from the command line with no parameters the "params" variable was coming through as void.
- bugfix: in some cases duplicate scripts were showing up on import.
- bugfix: there was an issue related to saving while a command line instance was running.
- bugfix: fixed an issue in the proxy related to URLs containing multiple adjacent slash characters.
- bugfix: in some cases the database was closing prematurely.
- bugfix: fixed an issue related to repainting after an extractor pattern was added.
- bugfix: the "breakpoint" window wasn't always updating properly.
- bugfix: addressed issues related to database corruption.
- bugfix: fixed a bug related to tildes in URLs.
- bugfix: made multiple fixes related to international character sets and non-ASCII characters.
- bugfix: fixed a few issues related to running screen-scraper in various modes simultaneously.
Public Release 2.7.2 (03.24.06)
- bugfix: updated the http-client library to accept all SSL certificates.
- bugfix: in certain situations the database was getting closed prematurely when screen-scraper was invoked from the command line.
Public Release 2.7 (03.08.06)
- feature: screen-scraper can now generate RSS feeds from scraped data.
- feature: added session.addToSessionVariable method.
- feature: log messages have been enhanced and clarified.
- feature: all of screen-scraper's ports are now settable in the properties file.
- feature: the web server can now be disabled.
- feature: because of a bug in the third-party library that handles the VBScript engine we included a warning in screen-scraper when using VBScript.
- bugfix: hot swapping scraping sessions and scripts has been improved.
- bugfix: the server can now be run via the shell scripts on more recent versions of Mac OS X.
- bugfix: a few fixes were made to increase database robustness.
Public Release 2.6 (11.01.05)
- feature: international character sets are now supported.
- feature: files can be uploaded within scrapeable files.
- feature: added scrapeableFile.saveFileOnRequest, which allows for binary files to be downloaded via POST requests.
- feature: added session.reformatDate, which allows for extracted dates to be reformatted.
- bugfix: fixed bugs where harmless SQL errors were being generated.
- bugfix: under certain circumstances errors would occur when proxying binary files.
Public Release 2.5 (08.02.05)
- feature: automatic hot swap from the "import" folder on start-up
- feature: scripts can be stopped mid-stream
- feature: tidying settable on a scrapeable file level
- feature: external proxy settable on a scraping session level
- feature: workbench, server, and command line can be run simultaneously
- feature: added a system tray icon for the server when running on Windows
- feature: added scrapeableFile.extractData and scrapeableFile.extractOneValue
- feature: added "mappings" feature for extractor pattern tokens
- feature: implemented saving and loading of state
- feature: caching of data sets
- feature: filtering duplicates from data sets
- feature: regular expressions can now be designated from a drop-down list
- feature: HTML can be automatically stripped from extracted data
- feature: requests can be made multiple times for a URL in case of failures
- change: multiple script instances can be deleted at once
- change: text box is highlighted in the "find" dialog box by default
- change: changed highlight color for "find" feature
- change: "last response" is now cleared before exporting
- change: installer now sets working directory and installs COM driver
- change: enhanced dataSet.writeToFile
- change: added "Strict Mode" cookie policy
- change: upgraded some third-party libraries
- change: performed a number of code optimizations
- bugfix: an error message related to help files was being output to the error log
- bugfix: dataset window spawned from "breakpoint" dialog window wasn't getting initial focus
- bugfix: resolved database corruption issues
- bugfix: server now generates logs by default
- bugfix: scrapingSession.downloadFile now makes use of existing cookies
Public Release 2.0 (02.02.05)
- feature: option for disabling log file generation when run as server
- feature: sending email through scripts
- feature: SOAP connection support
- feature: updated look and feel
- feature: button bar for commonly used tasks
- feature: status bar for application messages
- feature: screen-scraper is automatically installed as a service in the professional edition
- change: single "Import..." menu item instead of choosing between scraping sessions or scripts
- change: "Yes to all" on import
- change: merged cookie drop-down menu items in scraping session general tab
- bugfix: new scripts with the same name will get an incremented number
- bugfix: VBScript scripts can now be invoked when in server mode
Public Release 1.5 (09.11.04)
- change: HTTPS is now handled with a temporary secure certificate
- change: Rename gui to workbench
- feature: Cookie handling option in scraping sessions
- feature: .Net connector added
- feature: Local files can now be scraped
- feature: Delete table rows by right-click and pop-up menu
- feature: Edit menu w/ copy, paste, etc. for text boxes
- feature: Allow selection and deletion of multiple HTTP transactions from table
- feature: Undo/redo on text boxes from Edit menu
- feature: Search function in "Last Response" tab
- feature: Script instances can be enabled/disabled
- feature: Save and restore last window size.
- feature: Data sets can be written to a delimited file
- feature: Basic, Digest or NTLM Authentication handling in scraping session
- feature: Hot deploy by copying scraping sessions and independent scripts to import dir
- feature: Breakpoint debugging in scripts
- feature: Extensibility by adding custom jars to the ext dir
- bugfix: Extractor pattern token data is now saved by default when editor window is closed
- bugfix: Confirm overwrite on export
- bugfix: When an error occurs while requesting an HTML page, the HTTP status code (e.g. 404) is now displayed in the log
- bugfix: "Chunked" transfer encoding now handled properly in proxy server
- bugfix: Default names for new scraping sessions and scripts will now increment
Public Release 1.2 (06.02.04)
- Numerous bug fixes and optimizations
- Sub-extractor patterns
- More flexible cookie handling
- New methods added to built-in screen-scraper objects
Release 1.1.5 (10.01.03)
- Several bug fixes and a few minor feature enhancements.
- Two new tutorials are now available.
Release 1.1 (09.02.03)
- Numerous bug fixes and minor feature enhancements
- Internal scripts can now be written in Interpreted Java, JavaScript,
JScript, Perl, Python, or VBScript.
- The current scrapeable file can now be accessed within a script, also
allowing access to the full data scraped for a page.
- A method can be called to determine if an error occurred while the file was
being requested.
- Scraping sessions can be paused in a script.
- Maximum number of concurrent scraping sessions can be set via a property.
- The connection timeout can now be set via a property.
Release 1.0 (07.31.03)
- Numerous bug fixes and minor feature enhancements
- Improvements to server security
- Extracted data can automatically be saved into session variables
- Extracted data can be joined or appended to existing data sets
- Significant improvements to the install procedure
- Improvements to documentation
- Self-updating when new versions become available
- Improved usability of running screen-scraper as a server
Release 0.9.5b (06.12.03)
- Various improvements in documentation
- Several bug fixes and minor feature enhancements were made.
- Several optimization and memory leak issues were resolved.
- Data set and data record objects can be accessed from remote sources (e.g.
ASP or PHP scripts).
- A lock file now gets generated when screen-scraper starts up in order to
allow only one instance to be run at a time, avoiding potential database
corruption.
- Basic authentication parameters are now associated directly with a
scrapeable file.
Release 0.8.7b (05.27.03)
- Includes several bug fixes and feature enhancements.
- Allows screen-scraper to import and export objects.
- Improved support for external proxies, including those that make use
of NTLM.
Release 0.8.6b (03.04.03)
- Fixes several miscellaneous bugs.
- screen-scraper can now clean up HTML using
JTidy in order to facilitate data extraction.
Release 0.8.5b (02.18.03)
- Fixed a bug in the proxy server that garbled some URL query strings.
Release 0.8.5a (02.08.03)
- screen-scraper now uses HttpClient
(http://jakarta.apache.org/commons/httpclient/) to handle all of the HTTP
transactions, which allows for a broader range of sites to be correctly
scraped.
Release 0.8.4b (01.15.03)
- added ability to invoke screen-scraper from the command line
- added ability to run screen-scraper as a server
- created language bindings for Java, PHP, and COM
- when viewing the last response from scrapeable files HTTP headers are now
displayed and removed depending on whether the content is viewed as text or
HTML
- patterns can be formed by highlighting HTML
- extractor tokens can be created from highlighting HTML
Release 0.8.2b (11.17.02)
- context-sensitive documentation added
- several bug fixes and feature enhancements
- added support for an external proxy server
- added "settings" dialog
Release 0.8b (10.22.02)