Added sutil.getRandomUserAgent and sutil.getRandomReferer.
Added IDE style completions. Two new properties are needed for this to work: ShowVariableCompletionsAt=2 (this is the number of characters that must be typed before a completion list appears) and GenericCompletions=true (this sets a flag indicating that generic completions should be used).
Added session.getCurrentStack (a basic method to get the stack).
Added scrapeableFile.applyXPathExpression and sutil.applyXPathExpression.
Added dataSet.size, which is equivalent to dataSet.getNumDataRecords.
Now nulling session variables for appropriate extractor pattern tokens after each extractor pattern match instead of after the pattern has been applied.
Fixed a bug where the HTTP connection pool was getting shut down prematurely.
Fixed a bug related to the previous change to null session variables.
Fixed a bug such that a scrapeable session ID is now being generated even for scraping sessions that will run in the future.
Fixed a bug where nodes in the tree weren't being highlighted correctly.
Scrapeable files can now be added via a URL.
If the DatabasePort and WebServerShutdownPort properties are omitted from the screen-scraper.properties file they'll now be automatically set to the value of an open port.
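One common way to find an open port is to bind to port 0 and let the operating system choose one. The sketch below illustrates that general idea only; it is not taken from screen-scraper's implementation, and `find_open_port` is a hypothetical helper name.

```python
import socket

def find_open_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 means "pick any free port"
        return sock.getsockname()[1]

port = find_open_port()
```

The port is only guaranteed to be free at the moment it is chosen, so a real server would normally bind and keep the socket rather than look the port up first.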
The ProxyPort will now only be tested and used when screen-scraper is running in server mode if the AllowProxyScripting property is set to true.
Added a "Load Response from Clipboard" button to the scrapeable file panel.
Updated BeanShell to the latest version, disabling unstable Windows scripting in the process (e.g., VBScript).
sutil.makeGETRequest and sutil.makeHEADRequest now use proxy settings from the corresponding scraping session.
Temporarily rolled back to the previous version of BeanShell because of a bug.
Upgraded BeanShell to the latest version.
Searches within a proxy session now include notes.
Fixed an issue that would cause the workbench to freeze when the breakpoint window was up.
Now using global proxy settings if no session proxy settings are found.
Improved cookie handling in the proxy server.
Fixed a bug that would cause a proxy session to not be completely saved.
Fixed a bug that would cause the proxy to misbehave when filtering out less-useful transactions.
Now decoding parameters when adding a scrapeable file from a URL.
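As a rough illustration of what decoding parameters from a URL involves, the Python sketch below splits a query string into key/value pairs and percent-decodes them. The URL is made up for the example, and this is not screen-scraper's own code.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical URL with percent-encoded parameter values.
url = "http://www.example.com/search?q=red%20shoes&size=9%2F10"

# parse_qsl splits the query string and percent-decodes each key and value.
params = parse_qsl(urlsplit(url).query)
```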
The request entity for a scrapeable file can now be set in the workbench.
Fixed a bug where scraping session nodes in the tree were getting collapsed incorrectly.
Updated web server to use Jetty.
Fixed a bug related to generating scrapeable files from a proxy transaction.
Fixed a bug related to adding jar files from the ext folder.
Fixed a bug related to redirects within sutil.makeGETRequest.
Fixed a problem with the scraping server not starting up.
Fixed an issue with the web server on Windows.
Now setting the request entity to null if its text box is blanked.
Errors will no longer be thrown if a scraping session has already been stopped.
Added sutil.makeGETRequestUseSessionProxy. The sutil.makeGETRequest method will now use no proxy.
Fixed an issue related to loading external jar files when running in server mode.
Fixed a bug related to automatic anonymization.
Extractor patterns invoked manually can now be tested on a sub-set of the HTML page.
Upgraded internal GWT libraries.
Prettied up the web UI.
Added machine-readable values to REST interface output.
Now properly handling en-dash characters in URL's.
Can now handle HTTP responses that send two status lines.
In-line documentation in the script editor improved. Inside screen-scraper's doc folder if a javadoc folder is found containing api documentation it will be made available within the script editor.
Re-enabled SOAP interface.
Now writing an error message to the log when a scraping session import via the SOAP interface fails.
Added a global find feature.
Fixed a bug in RetryPolicy related to scraping files recursively.
The scrapeableFile.addHTTPHeader method is now available in Professional Edition.
Widened the proxy text boxes.
Increased the height of the sub-extractor text panel.
Fixed "When to run" combo box to select the correct value when clicked.
Fixed a bug related to editing long HTTP parameters.
Responses from ss web server now being compressed.
Automatic internal DB backups can be set with ShouldBackUpInternalDB property.
Save button now becoming active when a long HTTP parameter is updated.
Fixed a bug related to history navigation buttons.
Fixed a bug related to removing completed scraping sessions via the web UI.
Added session.clearProxySettings() method.
Fixed a bug related to editing extractor pattern tokens that have the same identifier.
Added mouseover row highlights to web UI.
Improved stability in multi-threaded scrapes.
Updated the date picker for the web UI.
Now displaying server time in "Settings" dialog box in web UI.
Added disk space usage to web UI.
Added String scrapeableFile.getRedirectURLs().
Added proxy filters.
Comparing scrapeable file requests and proxy requests now takes into account raw request entities.
Script error line numbers are now hyperlinked.
Fixed a naming issue when copying scrapeable files.
Fixed a bug related to HTML in error messages.
Fixed a bug related to comparing HTTP requests in the workbench.
Fixed a bug in rendering session variables in the web UI.
Fixed disk usage indicator in web UI.
Fixed a bug in how POST params are rendered when comparing HTTP requests.
Fixed a bug related to proxy pools.
Fixed an internal issue related to tracking running scraping sessions.
Now aborting a running scraping session if unable to find a valid proxy while using the proxy pool.
Minor fix to the DataManager.
Error messages are no longer hyperlinked when not running the workbench.
Fixed an issue where one script producing an error would interrupt a series of scripts.
Fixed an SSL issue when running on AIX.
Fixed an issue where scraping some SSL sites would generate an error.
Fixed a bug related to a recent change to how SSL is initialized. Added the "Use only SSL version 3" checkbox under the "Advanced" tab for a scraping session.
Fixed a couple of bugs related to a fix in the previous build.
One more bug fix related to the recent SSL changes.
Altering external proxy settings for a proxy session now takes effect when restarting the proxy session.
sutil.sendMail now supports alternate content types.
Updated password fields to obscure text.
Updated HttpClient and NTLM authentication.
Added code folding in scripts and the last response.
Added syntax highlighting in the last response.
Added a runOnAllAttemptsFailed() method to RetryPolicy.
Added convenience methods to session: isRunningInWorkbench(), isRunningFromCommandLine(), and isRunningInServer().
Improved handling of NTLM proxies.
Fixed a thread blocking issue when invoking a RunnableScrapingSession.
Fixed an issue related to reusing HTTP connections.
Fixed a naming issue related to generating multiple scrapeable files from proxy transactions.
Fixed an issue where imported scripts weren't being properly associated with corresponding objects.
Updated the ss_updater.py file to use the REST interface.
Fixed a concurrency issue related to running the same scraping session multiple times.
Use of $ in the regular expression field for an extractor pattern token is now allowed.
Fixed a bug where invoking a scrapeable file manually was causing the tree in the workbench to malfunction.
Upgraded HttpClient to version 4.3.
Downgraded back to HttpClient 4.2.
Includes experimental code for parsing mailing addresses.
do_lazy_scrape can now be passed as a parameter when running a scraping session via the REST interface.
Added finalize_scrapeable_session action to the REST interface.
Updated a few URL's for remote services.
Fixed a bug related to running a scraping session via the REST interface.
Fixed an issue resolving relative URL's beginning with ?.
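For reference, a relative URL that consists only of a query string keeps the base URL's path and replaces its query. The Python sketch below shows the expected result with the standard library; the URLs are invented for the example.

```python
from urllib.parse import urljoin

# Hypothetical page that links to "?page=2".
base = "http://www.example.com/search/results.php?page=1"

# A query-only reference keeps the base path and swaps in the new query.
resolved = urljoin(base, "?page=2")
```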
Improved handling of connections that remained open when using external proxy servers.
Fixed a bug where lazy scrapes would halt prematurely when running from the command line.
Fixed a bug caused by extractor pattern token names containing numbers.
Fixed two bugs related to searching in the log and script pane.
Fixed a minor memory leak.
Fixed two bugs related to finding in text areas.
Fixed a threading issue with anonymous proxies.
Added ConvertHTMLEntitiesByDefault and TrimWhiteSpaceByDefault to screen-scraper.properties.
The KeyManagerFactory algorithm to be used can now be set via the KeyManagerFactory property in the screen-scraper.properties file.
Changed extractor patterns so they use a thread pool and sub-extractors can run concurrently. In various tests with sub-extractors this resulted, on average, in a 25% to 50% increase in extraction speed.
Bug fixes and improvements to the address parser.
When using a proxy pool, proxies now start cycling from a random offset rather than always starting at the first proxy in the list.
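The idea of cycling from a random offset can be sketched as follows. This is a minimal Python illustration under our own assumptions, not the proxy pool's actual code; `cycle_from_random_offset` and the proxy addresses are hypothetical.

```python
import random
from itertools import islice

def cycle_from_random_offset(proxies, rng=random):
    """Yield proxies round-robin forever, starting at a random index."""
    if not proxies:
        return
    index = rng.randrange(len(proxies))  # random starting point
    while True:
        yield proxies[index % len(proxies)]
        index += 1

pool = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]
first_pass = list(islice(cycle_from_random_offset(pool), len(pool)))
```

Every proxy is still used once per pass; only the starting point differs between runs, which spreads load when many sessions share the same list.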
Modified the extractor pattern name box to have an orange background if the extractor will save the dataset automatically to a session variable (similar to the red sequence number box if it will be run manually).
Bug fix to the form classes when the action tag is missing for the form.
Updates and improvements to the completions provider (the dynamic one, not the old one) so class names are completed and constructors also get completions sooner.
Added the scrapeableFile.setForcedRequestType(ScrapeableFile.RequestType) method (should be called before the file is scraped). ScrapeableFile.RequestType is an enum with GET, POST, PUT, HEAD, DELETE, and OPTIONS as values. If the method is called to set the request to one of those types, all parameters set as GET in the parameters tab will be appended to the URL (like normal) and all parameters set as POST parameters will be used to build the request entity. If there are POST values on a type that doesn't support a request entity, an exception will be thrown.
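The parameter handling described for forced request types can be sketched roughly as below. This is a Python illustration, not the Java API itself. The `build_request` helper is hypothetical, and the set of types allowed to carry a request entity (POST and PUT here) is an assumption made for the example.

```python
from enum import Enum
from urllib.parse import urlencode

class RequestType(Enum):
    GET = "GET"
    POST = "POST"
    PUT = "PUT"
    HEAD = "HEAD"
    DELETE = "DELETE"
    OPTIONS = "OPTIONS"

# Assumption for this sketch: only these types may carry a request entity.
ENTITY_TYPES = {RequestType.POST, RequestType.PUT}

def build_request(url, request_type, get_params, post_params):
    """Return (method, url, entity) for a forced request type."""
    if get_params:
        # GET parameters are always appended to the URL.
        separator = "&" if "?" in url else "?"
        url = url + separator + urlencode(get_params)
    entity = None
    if post_params:
        if request_type not in ENTITY_TYPES:
            raise ValueError(request_type.value + " does not support a request entity")
        # POST parameters are used to build the request entity.
        entity = urlencode(post_params)
    return request_type.value, url, entity

request = build_request("http://www.example.com/items", RequestType.PUT,
                        [("id", "7")], [("name", "widget")])
```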
Added a sutil.convertUTFWhitespace method that takes an input string and converts all the different UTF whitespace characters to a standard space character.
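A whitespace conversion of this kind can be approximated in Python as shown below. This is only a sketch: it treats every character that Python's `str.isspace` reports as whitespace (including tabs and newlines) as convertible, which may differ from the exact set that sutil.convertUTFWhitespace handles.

```python
def convert_utf_whitespace(text):
    """Replace each Unicode whitespace character with a plain space.

    Assumption: "whitespace" means whatever str.isspace() accepts.
    """
    return "".join(" " if ch.isspace() else ch for ch in text)

# No-break space, em space, and ideographic space become ordinary spaces.
cleaned = convert_utf_whitespace("a\u00a0b\u2003c\u3000d")
```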
Added a table parser. The table parser takes in a block of HTML representing a table and parses it to a Table object. Cells of the table can be retrieved using the Table.getCell(row, column) method, and this will account for rowspan and colspan cells. Requesting a cell that is spanned by another cell will return an object that shows the same data as the cell that was spanned, but has a flag indicating it is a span cell.
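To illustrate how rowspan and colspan cells can be accounted for, here is a small Python sketch that expands a table into a grid and flags spanned positions. It is not the Table class itself. `TableGridParser`, `Cell`, and `get_cell` are hypothetical names, and nested tables are not handled.

```python
from html.parser import HTMLParser

class Cell:
    def __init__(self, text, is_span=False):
        self.text = text
        self.is_span = is_span  # True if this position is covered by a span

class TableGridParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.grid = {}     # (row, column) -> Cell
        self.row = -1
        self.col = 0
        self.spans = None  # (rowspan, colspan) of the cell being read
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.row += 1
            self.col = 0
        elif tag in ("td", "th"):
            a = dict(attrs)
            self.spans = (int(a.get("rowspan", 1)), int(a.get("colspan", 1)))
            self.text = []

    def handle_data(self, data):
        if self.spans is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self.spans is not None:
            # Skip positions already claimed by a rowspan from an earlier row.
            while (self.row, self.col) in self.grid:
                self.col += 1
            rowspan, colspan = self.spans
            text = "".join(self.text).strip()
            for r in range(rowspan):
                for c in range(colspan):
                    spanned = not (r == 0 and c == 0)
                    self.grid[(self.row + r, self.col + c)] = Cell(text, spanned)
            self.col += colspan
            self.spans = None

    def get_cell(self, row, column):
        return self.grid[(row, column)]

table = TableGridParser()
table.feed("""<table>
<tr><th rowspan="2">Name</th><th colspan="2">Score</th></tr>
<tr><td>Min</td><td>Max</td></tr>
</table>""")
```

Here `table.get_cell(1, 0)` falls under the "Name" cell's rowspan, so it carries the same text with the span flag set, while `table.get_cell(1, 1)` is the ordinary "Min" cell.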
Added / updated an event callback system to ScrapingSession. This allows adding callbacks to be run at various times, similar to scripts, but declared in an initialization script. One useful scenario is to have an initialize script that sets up a connection to a database, and then add a callback to run after all scripts have executed that closes the connection to the database.
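The database scenario can be pictured with the generic Python sketch below. It shows the callback pattern only and does not use the ScrapingSession API. `EventRegistry`, `FakeConnection`, and the `"session_end"` event name are all hypothetical.

```python
class EventRegistry:
    """Hypothetical stand-in for a session's callback registry."""
    def __init__(self):
        self._callbacks = {}

    def add_callback(self, event_name, callback):
        self._callbacks.setdefault(event_name, []).append(callback)

    def fire(self, event_name):
        # Run callbacks in the order they were registered.
        for callback in self._callbacks.get(event_name, []):
            callback()

class FakeConnection:
    """Stands in for a real database connection."""
    def __init__(self):
        self.is_open = True

    def close(self):
        self.is_open = False

# "Initialization script": open the connection and register its cleanup.
events = EventRegistry()
connection = FakeConnection()
events.add_callback("session_end", connection.close)

# ... the scrape would run here ...

# After all scripts have executed, the registered callback closes the connection.
events.fire("session_end")
```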
Similarly the ScrapeProfiler class can be used in conjunction with the event system to hunt down slow spots in a scrape. It will show the runtime of each script, extractor, scrapeable file, and which session variables are not used. Attaching this to a scrape will cause it to run much slower, but can provide insight when things go wrong.
Added the default_token_config.xml file to resources/conf. This file, if present, allows setting default regular expressions / options for extractor tokens. For example, the included one will set the regular expression to [^<>]* for any token with an all upper-case name that has either < on the right or > on the left of the token, and will also set the Convert Entities checkbox and Trim checkbox.
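To show why [^<>]* is a sensible default for a token that sits between tags, the Python sketch below applies it to a small, made-up HTML fragment. The token names and the way tokens are turned into capture groups are assumptions made for illustration.

```python
import re

html = "<td>Widget</td><td>19.99</td>"

# Equivalent of the pattern text "<td>~@NAME@~</td><td>~@PRICE@~</td>" with
# each token replaced by a group using the default expression [^<>]*.
pattern = re.compile(r"<td>([^<>]*)</td><td>([^<>]*)</td>")

# [^<>]* cannot cross a tag boundary, so each token stays inside its own cell.
name, price = pattern.search(html).groups()
```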
Added two values to the properties file to allow access to the web interface from IP addresses that aren't on the allowed list. These properties are WebInterfaceUser and WebInterfacePassword. Note that we are still using http not https, so users should be aware that adding those properties makes their installation slightly more vulnerable, since the credentials aren't submitted securely to the server.
Fixed an issue that may cause some instances of screen-scraper to crash on start-up.
Fixed a few issues related to running screen-scraper on Mac OS X Yosemite.
Fixed NoSuchMethodFoundError issues from the previous alpha.
Added two new RemoteScrapingSession constructors that allow the socket timeout to be set.
Optimization to the web message code to reduce CPU use.
Added a get_cpu_usage call to the REST interface (same format as get_memory_usage call).
Updated memory use code to be more accurate (requires Java 7).
Fixed a bug in the name parser.
Added a JSONSort class for outputting JSON with keys sorted.
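JSONSort is a Java class, but the effect of sorted-key output is easy to show with a Python analogue. The sketch below uses the standard library and sample data invented for the example.

```python
import json

record = {"b": 1, "a": {"d": 2, "c": 3}}

# sort_keys=True orders keys at every nesting level, which makes output
# stable and easy to diff between runs.
sorted_json = json.dumps(record, sort_keys=True)
```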
Set the Last Response view in browser to write out the file as UTF-8.
Reworked the event system.
Fixed a bug in the DataManager caused by passing the reference around between threads.
Upgraded to version 4.4 of HttpClient.
Added support for EnableSNIExtension property, which allows screen-scraper to handle some SSL sites.
Cleaned up some issues from the HttpClient 4.4 update.
Added MaximumDisplayedLastResponseLength property (settable in the screen-scraper.properties file), which determines the amount of text that will be displayed under the "Last Response" tab. The value should be an integer, and the default value for the property is 100000.
Fixed a bug with the event system that would have made basic edition not work.
Updated the address parser (now uses the list from the post office for valid street name endings) and name parser.
Set the default run time for new scripts to after each pattern match.
Fixed an issue with proxy authorization.
Fixed an issue where cookies were persisting between runs.
Now setting the correct referer header with session.downloadFile.
Updated the name parser.
Fixed an issue that was causing HTTP transactions to load slowly.
Fixed an issue related to the tree not expanding and collapsing correctly when objects are renamed.
Fixed an issue adding a scrapeable file from a URL containing a hash symbol.
Brought back start page.
Added ConvertHTMLEntitiesByDefault and TrimWhiteSpaceByDefault properties to screen-scraper.properties file.
Optimized reading in HTTP responses.
Added "Refresh tree" option to the pop-up menu associated with folders in the tree.
Added the add/paste sub-extractor pattern buttons to the bottom of the panel.
Added ability to set default regular expressions for tokens via resource/conf/default_token_config.xml.
Updated the ss.key file to improve proxying SSL sites.
Made the HTTP client settable (may require Java 7).
Made global find box slightly taller.
Fixed a bug where tree nodes were collapsing incorrectly.
Fixed a bug where vestiges from one scraping session were being transferred to another.
Fixed a bug where the last response panel was being cleared after a scraping session finished.
Fixed a bug that was causing tree nodes to expand with a single click.
Fixed a bug that was causing a scraping session to disappear when a folder was given the same name.
Fixed an issue related to nulling out session variables after each match of an extractor pattern.
Fixed a scrolling issue related to adding a sub-extractor pattern from selected text.
Fixed an issue related to working with RunnableScrapingSession objects in the workbench.
Fixed a bug where the last response was getting cleared when it shouldn't have been.
Fixed an issue that caused the tree to behave incorrectly after adding a scrapeable file.
CPU usage is now being displayed in the web interface.
Updated Async Http Client.
Fixed an issue that caused file handles to remain open after a scrape completed.
Public Release 6.0 (03.30.12)
Now allowing circular redirects.
Fixed a bug where the log was being centered incorrectly after a search.
Added preliminary support for using Rackspace as an alternative to Amazon in anonymization service.
Added support for specifying character set in the XmlWriter.
Fixed a backward compatibility issue with extractor pattern tokens.
Fixed an issue that affects editing tokens on some older Macs.
Fixed a bug related to line wrapping in large responses.
Fixed an issue that disallowed creating new scrapeable files.
Now allowing underscores in a URL.
Now allowing back-slashes in a URL.
Fixed an uncommon case of thread deadlock.
Fixed a rare issue with file uploads.
Fixed an issue with extractor patterns getting interrupted.
Fixed an issue where a blank file HTTP parameter was being sent incorrectly.
Fixed an issue with the REST interface where the wrong scrapeable_session_id was being returned.
Fixed an issue where the HTTP connection manager was getting closed prematurely.
Fixed a bug introduced in the previous alpha related to closing the HTTP connection manager.
The < > symbols are now being handled properly in URL's.
The host used for screen-scraper's database can now be set via the DatabaseHost property in the screen-scraper.properties file.
Fixed a bug where a null parameter was causing rendering problems.
Added ability to turn on and off automatic proxy cycling via setAutomaticProxyCycling.
Auto-saving can now be enabled by adding AutoSaveTime=[Time in seconds] to the screen-scraper.properties file.
Filtered data sets now show up as filtered when using the "Test Pattern" button.
Added SetCharacterSet to .NET driver.
Fixed a bug related to terminating anonymous proxies via the "Settings" dialog box.
Fixed an issue related to long key names in parameters.
Fixed an issue related to designating the character set when exporting scripts.
Fixed an issue related to populating a proxy pool from a file.
Fixed an issue related to Rackspace anonymization.
Now outputting message as a warning when extractor pattern times out.
Script pane no longer scrolls to the top when finding text fails.
The last error message will now always be retained in the Web UI.
Now notifying the user if a scrapeable file is generated from an HTTP transaction that contains a multi-part request, but no file parameters.
Changed icon to something friendlier on database backup pop-up.
Fixed an issue related to resolving relative URL's from extracted data.
Fixed an issue related to reordering columns in the workbench.
Fixed an issue related to truncated server responses.
Fixed the PHP driver to allow carriage returns and line feeds to be passed in the setVariable method.
Now initializing the last response view to the top of the page.
Now displaying recently accessed scripts first in the script instances drop-down list.
Enlarged the scraping session notes field a bit.
Added back and forward buttons to the workbench.
Multiple objects can now be selected in the main tree, allowing them to be deleted, exported, and moved.
Deprecated caching and filtering data sets (can be re-enabled with EnableCachingAndFilteringDataSets property).
Now automatically swapping extractor pattern tokens for embedded variables in certain fields in the workbench (e.g., in the URL field ~@FOO@~ is changed to ~#FOO#~).
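The swap from token delimiters to embedded-variable delimiters amounts to a simple substitution. The Python sketch below is a hypothetical illustration rather than the workbench's own logic; the helper name and the URL are invented.

```python
import re

# Matches an extractor pattern token such as ~@FOO@~ and captures its name.
TOKEN = re.compile(r"~@([^@~]+)@~")

def tokens_to_session_variables(text):
    """Rewrite ~@NAME@~ tokens as ~#NAME#~ embedded variables."""
    return TOKEN.sub(r"~#\1#~", text)

swapped = tokens_to_session_variables("http://www.example.com/item?id=~@FOO@~")
```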
Added a "Find" button to the "Last Request" tab.
Fixed a bug that was causing the user-agent header to be duplicated.
Fixed a bug where a deleted recent script still shows in the script drop-down list.
Fixed a bug related to multi-exports.
Fixed a bug related to exporting single objects.
Several DataManager fixes.
Fixed a bug related to multi-selection in the tree.
Enabled gzip encoding in the proxy server.
Added saveStateToString and loadStateFromString methods.
Added new methods to script auto-complete feature.
Added executeScriptWithContext method.
Made a fix in the proxy server related to cookies.
Fixed a threading issue related to the REST interface.
Added classes and methods related to decoding images.
Fixed a bug related to use of the "Breakpoint" button with RunnableScrapingSessions.
Added getStatusMessage, setStatusMessage, and appendStatusMessage to the session object, all of which are synonymous with their corresponding "error" methods (e.g., getStatusMessage = getErrorMessage).
In the web UI changed the column "Error Message" to "Status Message".
Added the following methods to the scrapeableFile object: resequenceHTTPParameter( String key, int sequence ), removeHTTPParameter( String key ), addGETHTTPParameter( String key, String value, int sequence ), addGETHTTPParameter( String key, String value ), addPOSTHTTPParameter( String key, String value, int sequence ), addPOSTHTTPParameter( String key, String value )
Made a DataManager fix where child rows weren't getting inserted for duplicate parent rows.
Changed default user agent for newly-created scraping sessions to Internet Explorer 8.
Now saving in a separate thread so that the GUI won't get locked up for large objects.
Added "Always at the end" option to force scripts to run at the end of a scraping session, even if it gets stopped prematurely.
The prompt to save dialog box only shows on exit when a change has actually been made.
Added a keyboard shortcut to the extractor pattern text box such that when text is highlighted and the Control/Command-T key combination is pressed an extractor pattern token will be generated. This is the equivalent of using the corresponding menu item when the right-click pop-up menu is invoked.
Improved error reporting.
Added local script variables to the breakpoint frame.
When in workbench mode screen-scraper will now breakpoint on a script error.
Now displaying a message on the "Last Response" tab when tidying fails.
Fixed a bug related to saving in the last alpha.
DataRecord keys are now sorted when displaying them in the workbench.
Now logging a warning when an extractor pattern token has no regex.
Made a fix related to displaying local variables in the breakpoint window.
Made a fix for "deflate" content encoding.
Fixed a bug related to web servers that use an older version of SSL.
Added the following methods: session.setStopScrapingOnScriptError, session.setStopScrapingOnMaxRequestAttemptsReached, session.setStopScrapingOnExtractorPatternTimeout, scrapeableFile.getMaxRequestAttemptsReached, scrapeableFile.getExtractorPatternTimedOut.
Fixed a bug related to prompting for save upon exit.
Deprecated proxy scripting. Can be re-enabled via the AllowProxyScripting property.
Fixed a minor memory leak in the workbench.
Updated the .NET driver to work with COM-based applications.
Added initial support for memory profiling.
Fixed a bug related to duplicate token editor windows.
Added buttons to wrap text and find within the request/response of a proxy transaction.
Now using %20 instead of + to represent a space character when encoding GET/POST parameters.
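The difference between the two encodings of a space can be seen with Python's standard library. The snippet is only a comparison of the two styles and says nothing about how screen-scraper performs the encoding internally.

```python
from urllib.parse import quote, quote_plus, urlencode

percent_style = quote("red shoes", safe="")  # space becomes %20
plus_style = quote_plus("red shoes")         # space becomes +

# Encoding a whole parameter list with %20 for spaces.
query = urlencode({"q": "red shoes"}, quote_via=quote)
```

Both forms are generally accepted in form-encoded data, but %20 is also valid in other parts of a URL, which makes it the less ambiguous choice.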
Now correctly displaying encoded GET/POST parameters in scrapeable file proxy comparer.
Added search term to the top of the proxy search results window.
Now determining whether or not to save on individual key strokes.
Fixed a bug related to displaying the start page and handling history.
Fixed a bug related to deleting multiple items.
Fixed a few minor memory leaks.
Now stripping internal anchors off of redirect URL's.
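Stripping an internal anchor means dropping everything from the # onward before the URL is requested. A minimal Python sketch with a made-up redirect URL:

```python
from urllib.parse import urldefrag

redirect_url = "http://www.example.com/page.html#section2"

# urldefrag returns the URL without its fragment, plus the fragment itself.
clean_url, fragment = urldefrag(redirect_url)
```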
Now reloading the page when a scraping session is removed via the web UI.
Fixed a bug in the workbench where the left pane would start out too narrow.
Now including sequence in proxy search results.
Fixed an issue related to renaming scraping sessions.
Added a couple of check boxes to wrap text to the proxy panels.
Made a fix to ensure consistency in line wrapping the last response text box.
Now centering the search result in the proxy.
Fixed text related to the edition to be more consistent.
Fixed a bug related to stopping scraping when an infinite redirect is encountered.
Made a fix related to duplicate proxy session names.
Added sutil.getNumRunnableScrapingSessions and sutil.getNumRunningScrapingSessions.
Modified various methods to be Professional or Enterprise edition only.
Updated session.downloadFile to use current proxy settings.
Updated tool-tips in script editor to accurately reflect API changes.
Public Release 5.5 (03.30.11)
Extractor Token Delimiters: Removed magic duplication of token delimiters ('~@' and '@~') when a token is edited alongside an empty extractor token.
stringToFloat method: Added a stringToFloat method in sutil object.
getLoggingLevel method: Added a getLoggingLevel method in the session object
setLoggingLevel method: Added a setLoggingLevel method in the session object
DataSet Viewer: Added a memory so that it will open in list or grid first based on its state the last time it was open.
Duplicate Scraping Sessions: Renaming a scraping session to the same name as another is no longer permitted when selecting to rename it from the right click menu.
Extractor Pattern Pasting: If you paste a copied extractor pattern that has been deleted it no longer adds a bunch of scripts.
Mac Export: When overwriting a file on export, Mac's second prompt before overwriting was removed. Once is enough.
Regular expression textbox: The enter textbox in the general tab of the extractor token editing window now resizes with the window.
Tidy Fix: Basic edition was stuck with Tidy turned on even when you turned it off. That has been fixed so that tidy responds to the off request.
Window Artifacts: When a pop-up window is closed the workbench will redraw itself to remove artifacts from the pop-up.
Runnable Scraping Session: Fixed error that stopped runnable scraping sessions from running when session was not passed.
Extractor Token Delimiters: Fixed so that when two extractor tokens share a delimiter only the first is recognized as a token.
removeHTTPHeader method: Added the removeHTTPHeader method to the scrapeableFile object.
Regular Expression Hover: When you place your mouse over an extractor token it now displays the regular expression associated with it in a tool tip. There is a delay on it so that it is not overly annoying.
Script Instances Window: Now in the script instances window it specifies if the script is enabled or disabled. The window has also been updated to size itself so that it removes the horizontal scrollbar.
Paste Extractor Pattern: To avoid the issue of having it look like your screen goes blank when you paste an extractor pattern with lots of sub-extractor patterns, when you paste an extractor pattern the screen no longer jumps to the bottom of the new extractor pattern. It now doesn't jump at all.
Folder Delete: Minor fix so that if you delete a folder immediately after importing into it the delete will take effect the first time.
No to All: If you import a scraping session with attached scripts you can now choose the option No to All to have screen-scraper not replace any of the conflicting scripts.
Proxy Port Resets: If you change the port on a running proxy session it will change without having to stop and restart the proxy session.
Paste Extractor Pattern: When you paste an extractor pattern you now jump to the top of the new pattern.
Jython Libraries Updated: The jython libraries that process python were updated from 2.1 to 2.5.1. The standard python libraries were added at lib/jython-lib. The folders lib/ext, lib/jython-lib, and lib/jython-lib/site-packages are now included in python's search path.
No to All Improvement: If a script is not to be overwritten on import the No to All will avoid the warning that you cannot overwrite the script, as you are not trying to.
Secondary Server for Anonymization Service: We have added a secondary server to handle automatic anonymizations. By default you will continue to use the current server, if you would like to change to the other server you can add the property AnonymizationURLPrepend to your screen-scraper.properties file. The only acceptable values are http://anon.screen-scraper.com and http://anon2.screen-scraper.com.
Compare Last Request and Proxy Transaction Window Scrollbar: A scroll bar was added, particularly for POST data that gets very long. Instead of disappearing off the bottom of the window a scroll bar will now appear.
Mail Server Settings: The form fields in settings for a mail server were removed in the Professional edition since those methods are Enterprise only.
Regular Expression @ Fixed: During an update one of the extractors was broken so that it got confused by the at sign (@). It has been set right again.
Regex Stuck on Screen: When you navigated away from screen-scraper the Regex tool tip would get stuck on the screen. The screen now redraws so that it disappears.
Edit Token Option: The edit token option is now only available when the token is a valid token.
Window Sizes: The DataSet and Compare Last Request and Proxy Transaction windows now retain their size from the last time they were open.
NTLM Authentication Refresh: Workbench was retaining information incorrectly from last NTLM request. This caused it to not log in correctly when run again. It now clears the HTTP state between scrapes.
Jython Library Load File: The file created to take care of adding the python libraries did not deploy with the other updates. This caused python scripts to fail entirely. It has now been resolved.
Code Completion: The code completion has been brought up to date with previously undocumented methods as well as the current alpha methods.
REST Interface: Updates were made to the rest interface to facilitate tracking a session and starting a session with passed variables.
Mail Updates: Added support for connecting to a mail server using TLS/SSL. The mail server port can also now be specified.
Logging Updates: The various logging methods can now take Objects and not just Strings.
Mac Fix: Fixed an issue where, with the update of JRE 1.6.0_22 on Mac OS X, the "sss" file extension was being truncated when exporting.
Anonymization Fix: In some cases anonymous proxies would spawn, but never become available. Added code to handle this situation.
Fixed a bug that arose when the number of available anonymous proxy servers was depleted to zero.
Now disallowing running multiple screen-scraper interfaces simultaneously. For example, previously the screen-scraper workbench could be run concurrently with the server. This ended up causing database corruption in some cases, though, so we're now disallowing it.
When clicking a search result after performing a find in a proxy session the HTTP transactions table will now scroll to the corresponding transaction.
When clicking a search result after performing a find in a proxy session if the associated proxy session isn't visible in the right pane it now will be.
Fixed a bug in exporting objects where, if an XML comment was found in any of the text fields, the resulting exported file would contain an invalid sequence of characters.
Fixed a scrolling bug related to displaying script instances associated with extractor patterns.
Removed a log message that was appearing each time a redirect occurred.
screen-scraper will now display a "start page" when the workbench initially launches.
Based on feedback, now allowing running the screen-scraper workbench and server simultaneously by adding the "AllowMultipleSimultaneousInstances" property to the screen-scraper.properties file.
Fixed a bug where screen-scraper would freeze up when very large requests were included in proxy sessions and scrapeable files.
Fixed a bug on Mac OS X where an overwrite prompt was not being given in exporting scraping sessions.
Fixed a message formatting issue in certain script errors.
Fixed an issue with anonymous proxies being terminated externally.
When creating a new extractor pattern token screen-scraper will now attempt to guess the regular expression that should be used.
Fixed a bug related to editing extractor pattern tokens.
Fixed a bug related to highlighting of data records in the last response tab.
Optimized highlighting of data records in the last response tab.
Fixed a bug related to selecting extractor pattern tokens.
Token editor now saves and closes when the return key is hit.
Fixed a bug related to finding script instances.
Updated proxy to use HttpClient 4.
Fixed a bug related to the recent update to the proxy.
Including a UseGlobalExternalProxyForAllScrapingSessions property in the screen-scraper.properties file will now cause global proxy settings to apply to all scraping sessions.
Fixed a minor bug related to invalid extractor pattern token names.
Undo in certain text boxes can now be triggered properly via keyboard shortcut on a Mac.
Now notifying the user if there are no matches when the "Highlight Extracted Data" button is pressed.
The "Last Response" tab can now be displayed in a separate window.
Fixed a bug related to the anonymization service.
The DataManager now handles reserved words correctly.
Fixed a bug related to data extraction timeout.
Fixed a bug related to requests being recorded with redirects.
Fixed a bug related to hitting the "Enter" key in the find dialog box.
You can now wrap text in the last request and last response panels.
Rearranged elements on the last response panel so that overlapping shouldn't occur.
The delay on the script auto-complete box can now be set via the "AutoCompleteDelay" property in the "screen-scraper.properties" file.
Rearranged elements in the proxy "Progress panel" so that they don't overlap.
Now dismissing the splash screen before the start page loads.
The name text box is now highlighted when proxy sessions, scraping sessions, and scripts are created.
Adjusted a few visual elements related to proxy sessions so that they resize correctly.
Now filtering out "sitecheck" requests made by Opera.
Table columns in the "HTTP Transactions" table are now being sized correctly even when the table is empty.
Fixed a bug where less-than symbols weren't always showing up in the tool-tip for extractor pattern tokens.
Restored the horizontal scroll bar in the last response tab.
Fixed an error that caused screen-scraper to disallow testing extractor patterns.
Fixed a minor bug related to Java keystores.
Fixed a bug related to the data set list view not displaying correctly.
Fixed an issue where the anonymous proxy pool would not automatically repopulate when proxies were terminated automatically.
Fixed an issue in Linux where the extractor pattern panel was a bit too large.
Fixed an issue in Linux where the scraping session log panel was a bit too large.
Altered character set handling, specifically how explicitly set character sets override more global settings.
Long parameter values can now be edited in a separate text box.
Fixed an issue with extractor pattern token tooltips.
Fixed an issue with sub-extractor panels not sequencing after deletion.
screen-scraper will now display an error message when an invalid regular expression is entered for an extractor pattern token.
Fixed an issue with resizing the proxy transaction compare window.
Fixed a bug where the paste sub-extractor pattern was becoming enabled after a sub-extractor pattern had been deleted.
Fixed a bug where data record highlighting wouldn't work correctly with very large HTML pages.
Fixed a bug where parameters sent in a multi-part request were causing invalid responses.
The position of the divider bar on the split pane for proxy sessions is now retained.
Numeric columns in tables are now rendered using the default font.
Fixed a minor bug related to editing extractor pattern tokens.
No longer truncating HTML in the "Last Response" tab.
Minor bug fix to the DataManager.
Fixed a bug related to setting the originator edition when exporting.
The cursor now returns to normal after attempting to highlight data records for a pattern that doesn't match.
Fixed a bug where data records were not being highlighted in the last response tab on the first attempt.
Fixed an issue where scrollbars weren't appearing in the proxy/scrapeable file compare window.
Now displaying an error message when applying invalid extractor patterns.
Fixed a minor memory leak in the workbench.
Fixed a bug related to highlighting data records.
Fixed a bug where the scrapeable file view wasn't updating correctly in some cases.
The "Generate scrapeable files in..." menu will now scroll when it contains many items.
The term "sutil" will now appear in blue in the script editor.
When exporting an object it will now be selected in the tree.
Fixed a bug related to the proxy / scrapeable file comparer.
Updated the PHP driver so that it now detects when it can't connect to the screen-scraper server.
The "Runnable" tab in the web interface will now show the most recently run instance of a particular scraping session.
Enhanced error message when screen-scraper is inhibited by a local firewall.
Fixed a link to sub-extractor pattern help.
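As a rough illustration of the property-related entries above, the fragment below sketches how the three properties named in these notes might appear in the screen-scraper.properties file. The property names come from the notes. The values are assumptions, since the notes don't state the accepted values or the unit for AutoCompleteDelay.

```properties
# Allow the workbench and server to run at the same time (value assumed).
AllowMultipleSimultaneousInstances=true
# Apply the global external proxy settings to every scraping session (value assumed).
UseGlobalExternalProxyForAllScrapingSessions=true
# Delay before the script auto-complete box appears (value and unit assumed).
AutoCompleteDelay=500
```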
Public Release 5.0 (06.30.10)
feature: added REST interface
feature: can now filter out less useful proxy transactions
feature: added DataManager to facilitate saving data to a database
feature: generate multiple scrapeable files from proxy session
feature: made button bar persistent for extractor patterns
feature: retained number of lines to display for scraping session log between sessions
feature: updated scrapeable file icons to indicate when they are and are not invoked in sequence
feature: added a delete option for scraping sessions to web interface
feature: enhanced data set viewer with list view and colored tokens
feature: improved script error messages
feature: added a method to allow HTTP parameters to be removed from scrapeable files
feature: added logging levels to scraping session
feature: added ability to compare request in scrapeable file with transaction in proxy session
feature: enhanced breakpoint window to show more information, such as current script and number of scripts on the stack
feature: added syntax highlighting to extractor pattern pane
feature: added ability to pause/breakpoint a scraping session with a button
feature: extracted data can now be highlighted in last response tab
feature: pane now scrolls down when an extractor pattern is added
feature: character set can now be determined on a scraping session and scrapeable file level
feature: added ability to limit length of response for a scrapeable file
feature: enhanced handling of database backups over time
feature: can now add more session variables to a scheduled scraping session in the web interface
feature: added ability to clear completed scraping sessions from web interface
feature: enhanced a few default regular expressions
feature: properties file can now be reloaded from the web interface
feature: can now copy and paste sub-extractor patterns
feature: added ability to trim white space from extracted data
feature: added a couple of new options for invoking scripts from an extractor pattern
feature: added sutil to handle more general methods
feature: provided a way to null out session variables for tokens that didn't match
feature: provided a way to save data sets without appending to an existing data set