Screen-Scraper - Capture the Web

URLFetch ASP/COM

Overview

URLFetch is an ASP COM object written in Java that allows for the retrieval of documents using HTTP.

The source code for URLFetch is freely available under the LGPL.

Features include:

We know what you're thinking -- "Java?! It'll be way too slow. It can't scale." We haven't performed any formal benchmark testing on URLFetch, but we're told by people who have compared it to other components that it's quite speedy :) Try it. You'll love it. We offer a full triple-your-money-back guarantee.

You might want to browse the examples of URLFetch in action.

Requirements

Microsoft Virtual Machine: You've probably already got it on your server, but if you experience problems with URLFetch, try installing the latest version of the Microsoft Virtual Machine, located here or here.


Installation

To install URLFetch follow these steps:

  1. Download the urlfetch.zip file.
  2. Unzip the file.
  3. Optionally move the files urlfetch.dll and urlfetch.class to a different directory (it's often helpful to put them in the directory that holds most of the other DLLs on your system, such as winnt\system32 on Windows NT or \windows\system on Windows 95/98).
  4. Open a DOS prompt.
  5. Navigate to the directory where the files urlfetch.dll and urlfetch.class are located.
  6. Type regsvr32 urlfetch.dll (if this doesn't work you may need to find the file regsvr32.exe on your computer, copy it to the directory where urlfetch.dll is located, then type regsvr32 urlfetch.dll).

 

Properties

PostData: Data to be sent along with a POST request.
Proxy: The name of the proxy server to be used.
ProxyPassword: The username and password for the designated proxy server.
NOTE: If you're having trouble getting URLFetch to work with a proxy server, try assigning your proxy settings to the network connection on the server (e.g. using Internet Explorer under Tools->Internet Options, Connections tab). URLFetch will use these settings by default.

Methods

SetURL(URL): Sets the URL to be fetched.
SetAuthorization(AuthorizationParameter): Sets the Authorization parameter, allowing access into areas that require BASIC authentication.
SetHeader(Key, Value): Sets an HTTP header.
SetIsBinary(IsBinary): Indicates whether the file to be retrieved is a binary file.
SaveAsFile(FilePath): Saves the fetched data as a file.
SetConnectionTimeout(ConnectionTimeoutInterval): Sets the amount of time in milliseconds that can pass before the connection should time out.
SetContentReceivedTimeout(ContentReceivedTimeoutInterval): Sets the amount of time in milliseconds that can pass before content retrieval should time out.

Functions

Fetch(): Fetches the URL that has been set using the SetURL method.  Returns a string containing the result of the request.
GetBinary(): Gets data that was retrieved as a result of a binary fetch.
GetHeaders(): Retrieves all HTTP headers sent by the remote server in response to an HTTP request.  Returns a two-dimensional array of strings.
GetHeader(HeaderName): Retrieves a single HTTP header given the name of the header.

 

Properties

PostData

Data to be sent along with a POST request.  Parameters should be separated by an ampersand (&), as in the example.

URLFetchObj.PostData = "param1=foo&param2=bar"
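
When parameter values contain spaces, ampersands, or other reserved characters, they generally need to be URL-encoded before being joined with &. The following is a minimal sketch of one way to do that in an ASP page. It assumes URLFetchObj has already been created as in the other examples; BuildPostData and formFields are hypothetical names introduced here, not part of URLFetch. Server.URLEncode is ASP's built-in encoding function.

```vbscript
' Hypothetical helper (not part of URLFetch): joins the entries of a
' Scripting.Dictionary into "key1=value1&key2=value2", URL-encoding
' each key and value with ASP's built-in Server.URLEncode.
Function BuildPostData(params)
    Dim key, result
    result = ""
    For Each key In params.Keys
        If Len(result) > 0 Then result = result & "&"
        result = result & Server.URLEncode(key) & "=" & Server.URLEncode(params(key))
    Next
    BuildPostData = result
End Function

Dim formFields
Set formFields = Server.CreateObject("Scripting.Dictionary")
formFields.Add "param1", "foo"
formFields.Add "param2", "bar & baz"   ' contains characters that must be encoded

' URLFetchObj is assumed to exist already.
URLFetchObj.PostData = BuildPostData(formFields)
```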

Proxy

The name of the proxy server to be used by URLFetch, including the port number, if needed.

URLFetchObj.Proxy = "myproxy.net:3268"

ProxyPassword

The username and password for the designated proxy server, in the form username:password as in the example. Not every proxy server requires them.

URLFetchObj.ProxyPassword = "uname:passwd"

NOTE: If you're having trouble getting URLFetch to work with a proxy server, try assigning your proxy settings to the network connection on the server (e.g. using Internet Explorer under Tools->Internet Options, Connections tab). URLFetch will use these settings by default.

Methods

SetURL(URL)

Sets the URL that will be retrieved when the Fetch function is called.

URLFetchObj.SetURL("http://www.yahoo.com/")

SetAuthorization(AuthorizationParameter)

Sets the Authorization parameter, allowing access into areas that require authentication.  The parameter passed to the method should take the form username:password, as in the example.  The URLFetch object will encode the parameter in base64 format for you, so pass it in plain text.

URLFetchObj.SetAuthorization("foo:bar")
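
Put together with SetURL and Fetch, retrieving a password-protected page might look roughly like the sketch below. The URL, userName, and password values are placeholders chosen for illustration, and URLFetchObj is assumed to have been created already.

```vbscript
Dim userName, password, HTTPResponse

' Placeholder credentials for illustration only.
userName = "foo"
password = "bar"

URLFetchObj.SetURL("http://www.example.com/protected/")

' Pass the credentials as plain "username:password" text.
' Do not base64-encode them yourself; URLFetch does the encoding.
URLFetchObj.SetAuthorization(userName & ":" & password)

HTTPResponse = URLFetchObj.Fetch()
```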

SetHeader(Key, Value)

Sets an HTTP header.  This method would be used for such things as setting cookies and MIME types.

Call URLFetchObj.SetHeader("Cookie", _
    "ID=foo; expires=Fri, 20-Dec-2002 15:00:00 GMT; domain=.bar.com; path=/")

SetIsBinary(IsBinary)

Sets a flag that indicates whether the file to be retrieved is binary. Valid values are true and false.

URLFetchObj.SetIsBinary( true )

SaveAsFile(FilePath)

Saves the data retrieved by the previous call to Fetch as a file.

URLFetchObj.SaveAsFile( "C:\Inetpub\wwwroot\images\foo.gif" )
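
Because SaveAsFile works on data that has already been fetched, the calls need to happen in order. The following sketch shows one plausible sequence for downloading an image to disk. The remote URL is a placeholder, URLFetchObj is assumed to exist, and the account that runs the ASP page is assumed to have write permission on the target directory.

```vbscript
Dim fetchStatus

' Placeholder URL for illustration.
URLFetchObj.SetURL("http://www.example.com/images/foo.gif")

' Mark the download as binary before fetching.
URLFetchObj.SetIsBinary(True)

' For a binary fetch, Fetch returns the string "binary" rather than the data.
fetchStatus = URLFetchObj.Fetch()

' Write the fetched data to disk.
URLFetchObj.SaveAsFile("C:\Inetpub\wwwroot\images\foo.gif")
```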

SetConnectionTimeout(ConnectionTimeoutInterval)

Indicates the amount of time in milliseconds that can pass before the connection to the remote server will time out.

URLFetchObj.SetConnectionTimeout( 5000 )

SetContentReceivedTimeout(ContentReceivedTimeoutInterval)

Indicates the amount of time in milliseconds that can pass while content is being received from the remote server before a timeout will occur.

URLFetchObj.SetContentReceivedTimeout( 5000 )
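
The two timeouts can be set together before a fetch. How URLFetch reports a timeout is not described here, so the sketch below assumes that a timeout surfaces as an ordinary runtime error that VBScript's Err object can catch. Treat that as an assumption to verify against your own installation. The URL is a placeholder and URLFetchObj is assumed to exist.

```vbscript
Dim HTTPResponse

URLFetchObj.SetURL("http://www.example.com/slow-report")
URLFetchObj.SetConnectionTimeout(5000)        ' up to 5 seconds to connect
URLFetchObj.SetContentReceivedTimeout(15000)  ' up to 15 seconds to receive content

' Assumption: a timeout raises a runtime error during Fetch.
On Error Resume Next
HTTPResponse = URLFetchObj.Fetch()
If Err.Number <> 0 Then
    Response.Write "Fetch failed: " & Server.HTMLEncode(Err.Description)
    Err.Clear
End If
On Error GoTo 0
```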

Functions

Fetch()

Fetches the URL that has been set using the SetURL method.  Returns a string containing the result of the request. NOTE: If you used the SetIsBinary method to designate the file to be retrieved as a binary file, calling Fetch will simply return the string "binary". Use the GetBinary function to retrieve the result of a binary fetch.

HTTPResponse = URLFetchObj.Fetch()
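
Since Fetch returns the document as a string, ordinary VBScript string tools can be applied to the result. As a rough illustration, the sketch below pulls the page title out of a fetched HTML document using VBScript's built-in RegExp object. The pattern is a simplification that will not handle every page, and URLFetchObj is assumed to exist.

```vbscript
Dim HTTPResponse, titlePattern, matches

URLFetchObj.SetURL("http://www.yahoo.com/")
HTTPResponse = URLFetchObj.Fetch()

' Look for the first <title>...</title> pair, ignoring case.
Set titlePattern = New RegExp
titlePattern.Pattern = "<title[^>]*>([\s\S]*?)</title>"
titlePattern.IgnoreCase = True

Set matches = titlePattern.Execute(HTTPResponse)
If matches.Count > 0 Then
    ' HTML-encode the text before echoing it into our own page.
    Response.Write Server.HTMLEncode(Trim(matches(0).SubMatches(0)))
Else
    Response.Write "No title found."
End If
```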

GetBinary()

If you use the SetIsBinary method to designate the file to be retrieved as binary, you need to use this function to get the data that was fetched. It returns an array of bytes, and works nicely with the Response.BinaryWrite method.

binaryResult = URLFetchObj.GetBinary()
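
A common use is relaying a remote image through your own ASP page. The sketch below shows one way that might look. The URL and the hard-coded "image/gif" content type are placeholders, URLFetchObj is assumed to exist, and the page is assumed not to have written any other output before this point.

```vbscript
Dim fetchStatus, binaryResult

' Placeholder URL for illustration.
URLFetchObj.SetURL("http://www.example.com/images/foo.gif")
URLFetchObj.SetIsBinary(True)

' Fetch returns the string "binary"; the data itself comes from GetBinary.
fetchStatus = URLFetchObj.Fetch()
binaryResult = URLFetchObj.GetBinary()

' Tell the browser what is coming, then send the raw bytes.
Response.ContentType = "image/gif"
Response.BinaryWrite binaryResult
```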

GetHeaders()

Retrieves all HTTP headers sent by the remote server after an HTTP request has been made to it.  Returns a two-dimensional array of strings.

HTTPHeaders = URLFetchObj.GetHeaders()
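
The layout of the returned array is not spelled out here. The sketch below assumes one row per header, with the header name in the first column and its value in the second. Check this against your own output before relying on it. URLFetchObj is assumed to exist and to have completed a fetch.

```vbscript
Dim HTTPHeaders, i, nameColumn, valueColumn

HTTPHeaders = URLFetchObj.GetHeaders()

' Assumed layout: HTTPHeaders(row, nameColumn) is the header name and
' HTTPHeaders(row, valueColumn) is its value.
nameColumn = LBound(HTTPHeaders, 2)
valueColumn = nameColumn + 1

For i = LBound(HTTPHeaders, 1) To UBound(HTTPHeaders, 1)
    ' The & operator treats Null as an empty string, so a missing
    ' name or value will not raise an error here.
    Response.Write Server.HTMLEncode(HTTPHeaders(i, nameColumn) & ": " & HTTPHeaders(i, valueColumn)) & "<br>"
Next
```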

GetHeader(HeaderName)

Pass it the name of an HTTP header and it will return that header's value for the document just fetched. Note that this is not the counterpart of SetHeader: it will not return the value of a request header you set; rather, it returns the value of a response header from the document you just retrieved.

HTTPHeader = URLFetchObj.GetHeader( "Content-Type" )
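
One way to use this is to check what kind of document came back before processing it. The sketch below is an illustration only. What GetHeader returns when the header is absent is not documented here, so the value is coerced to a string first. URLFetchObj is assumed to exist and to have completed a fetch.

```vbscript
Dim contentType

' Prepending "" coerces a Null result to an empty string.
contentType = "" & URLFetchObj.GetHeader("Content-Type")

' Case-insensitive check, since the value may also carry a charset,
' e.g. "text/html; charset=ISO-8859-1".
If InStr(1, contentType, "text/html", vbTextCompare) > 0 Then
    Response.Write "The fetched document is HTML."
Else
    Response.Write "Unexpected content type: " & Server.HTMLEncode(contentType)
End If
```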