We’ve been seeing lots of issues with scrapes connecting to HTTPS sites. Some of the errors include:
- ssl_error_rx_record_too_long
- An input/output error occurred while connecting to https:// … The message was peer not authenticated.
- javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated
The issue came about when the Heartbleed vulnerability necessitated changes to some HTTPS connections: some connection types aren’t secure anymore, and new versions have come out. Screen-scraper needed two changes to catch up:
- Update to use Java 8
- Update HTTPClient to 4.4
Both of these are pretty large changes, so they aren’t in the stable release yet. In some cases, however, they are the only option to make a scrape work, so here are the instructions to get what you need.
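Before changing anything, it can help to confirm which Java version and which SSL/TLS protocol versions your current runtime offers. The following is only a minimal sketch using the standard javax.net.ssl API; the class name TlsProtocolCheck and its method names are made up for illustration and are not part of screen-scraper.

```java
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;
import javax.net.ssl.SSLContext;

// Hypothetical helper class, not part of screen-scraper.
public class TlsProtocolCheck {

    // Returns the JVM's default SSLContext, wrapping the checked exception.
    private static SSLContext defaultContext() {
        try {
            return SSLContext.getDefault();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("No default SSLContext available", e);
        }
    }

    // Protocol versions this runtime is able to use at all.
    public static List<String> supportedProtocols() {
        return Arrays.asList(
                defaultContext().getSupportedSSLParameters().getProtocols());
    }

    // Protocol versions that are switched on by default for new connections.
    public static List<String> enabledProtocols() {
        return Arrays.asList(
                defaultContext().getDefaultSSLParameters().getProtocols());
    }

    public static void main(String[] args) {
        System.out.println("Java version: " + System.getProperty("java.version"));
        System.out.println("Supported protocols: " + supportedProtocols());
        System.out.println("Enabled by default: " + enabledProtocols());
    }
}
```

If a newer protocol version such as TLSv1.2 does not appear in the enabled list, that may be one reason a site rejects the connection, although the errors above can have other causes as well.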