screen-scraper Helps Power Oracle OpenWorld Search

Posted in Miscellaneous on 09.21.10 by Todd Wilson

Not to toot our own horn (okay, we will), but our very own screen-scraper software is helping to power the search feature for the currently-running Oracle OpenWorld conference.  From the OpenWorld home page, try a search in the box found in the upper-right corner (try something like “SES”).  The search results you see were scraped from their content catalog, keynotes, and blog postings, then aggregated and enriched with information like spatial data (e.g., for demos you can click the location to see on a map exactly where it occurs).  The excellent search interface is provided by Oracle Secure Enterprise Search, with which screen-scraper has been integrated.

This is actually a great example of the power of screen-scraping.  Take information from various web sources, dump it all into a single database, then correlate and enrich the information in a searchable interface.  It’s a powerful thing to take disparate pieces and sum them into something that’s much greater than the individual parts.
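
To make the idea concrete, here is a minimal sketch of the "many sources, one searchable database" pattern. It is not how the OpenWorld integration was built. The records, source names, table layout, and the `build_index` and `search` helpers are all made up for illustration, and an in-memory SQLite table stands in for a real search engine.

```python
import sqlite3

# Hypothetical records standing in for data scraped from three sources.
SCRAPED = {
    "catalog": [{"title": "SES demo", "location": "Hall A"}],
    "keynotes": [{"title": "Opening keynote", "location": "Hall B"}],
    "blogs": [{"title": "Getting started with SES", "location": None}],
}


def build_index(scraped):
    """Load the records from every source into one in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (source TEXT, title TEXT, location TEXT)")
    for source, records in scraped.items():
        conn.executemany(
            "INSERT INTO items VALUES (?, ?, ?)",
            [(source, r["title"], r["location"]) for r in records],
        )
    return conn


def search(conn, term):
    """Return (source, title, location) rows whose title contains the term."""
    rows = conn.execute(
        "SELECT source, title, location FROM items "
        "WHERE title LIKE ? ORDER BY source",
        (f"%{term}%",),
    )
    return rows.fetchall()


if __name__ == "__main__":
    index = build_index(SCRAPED)
    for row in search(index, "SES"):
        print(row)
```

Once everything sits in one table, a single query can return matches from every source, and enrichment (the location column here) rides along with each hit.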

Version 5.0 of screen-scraper released! And it’s on sale!!!

Posted in Miscellaneous, Updates on 07.06.10 by Todd Wilson

Okay, that was probably too many exclamation points in the title.  It’s with good reason, though.  Version 5.0 represents a major upgrade in screen-scraper’s functionality (take a glance at the release notes to see what I mean).  Not only have we made all kinds of bug fixes, but there are lots of enhancements to the user interface as well as completely new functionality.

Along with this release we’ve also done some significant revamping of our docs, which we’ll continue to do.  We want to make sure screen-scraper is easier to learn for the beginner and quicker to use for the seasoned veteran.

And on top of all of that it’s on sale!  Until August 15 of this year you can get the Enterprise Edition for $800 off and the Professional Edition for $150 off.  We’ll be raising prices back up after that date, so get it while it’s hot.  Interested in purchasing?  Click here to be taken to our registration page.

Comparison of Web-scraping software

Posted in Miscellaneous on 06.28.10 by Todd Wilson


Exporting & importing scraping sessions in 4.5.42a

Posted in Miscellaneous, Tips on 04.06.10 by Todd Wilson

We try hard to maintain backward compatibility as much as possible, but unfortunately it can’t always be done.  If you recently upgraded to 4.5.42a you may have noticed that scraping sessions exported from that version don’t import correctly into earlier alpha versions.  This is a result of the alterations to the “tidy HTML” functionality that were implemented in that version.  From 4.5.42a onward, if you export scraping sessions from screen-scraper you should only import them into instances of screen-scraper that are also running version 4.5.42a or later.  This is one case where it was impossible to maintain compatibility with older versions, so please take note.

One-day only 50% off sale!

Posted in Miscellaneous on 04.10.09 by Todd Wilson

Yesterday I opened a fortune cookie that said, “Do something unusual tomorrow.”  I thought about sky-diving or going the whole day blind-folded, but instead opted for something even crazier–sell screen-scraper for half price!  If you’re on the fence about purchasing, now might be a good time to take the plunge.  I don’t see us doing this again any time soon.  The sale will last until April 11, 2009 at 11:00 a.m. Mountain time.

First video tutorial

Posted in Miscellaneous on 03.25.09 by Todd Wilson

We’ve had people asking for this for quite a while, and have finally gotten to it.  We now have a video version of our first tutorial, accessible from the tutorial itself:

It isn’t perfect, but I think it’s a pretty good first version (and definitely better than what we had previously).  We’re hoping to get some feedback, then will likely do another version soon based on that feedback.  Feel free to give it a try and let us know what you think.

Iowa Workforce Development Uses Screen-Scraper to Enhance Job Search

Posted in Miscellaneous on 10.27.08 by Todd Wilson

One of our eagle-eyed developers recently spotted a couple of blog postings by Bronwyn Mauldin (here and here) wherein she discusses Iowa Workforce Development’s use of our screen-scraping technology in building out their job board.  Bronwyn is a great writer and a consultant in the workforce development industry.  After reading Bronwyn’s postings we decided to contact the Iowa office ourselves to catch up on how things have been going for them.  It makes a great story as to how screen-scraping technology is being used in a very effective way.  We decided to make a press release on it, which you can find here:

Iowa Workforce Development Uses Screen-Scraper to Enhance Job Search

How to stop phpBB spam

Posted in Miscellaneous, Tips on 01.02.07 by Todd Wilson

Well, I sure wish someone had told us about this a while ago, so I’m doing the world a favor and talking about it here. Hopefully this blog posting gets picked up by Google so that others who are new to phpBB can learn how to stop spam up front.

We’ve been battling spam on our phpBB forum for I don’t know how long. The forum software works fine, but it’s so widespread that it seems to be one of the primary targets for forum spammers. After monkeying around with the thing installing mods and making manual changes, we finally hit this mod: Stop Spambot Registration. Once installed, the spam stopped. Amazing.

Now, obviously your mileage may vary with this one. We’ve also tried a bunch of other mods, so it’s possible that some of our mods are helping, but the Stop Spambot Registration was the key for us. If you find that you need more firepower beyond that mod, I’d recommend trying others on the phpBB Security-Related MODs page that relate to spam.

By the way, just one plea to the phpBB folks–please consider building spam control into the base install of the software. You know people are targeting you, so why not give your users some defense out of the box?


Well, I declared victory a bit prematurely with that last posting. We got a bit more spam after I installed the mod I mentioned, so I installed one more: spamwords. It seems to work fairly well. My only complaint is that it only allows you to designate words, and not phrases, as indicators of spam.

I should also mention one other change we made early on that stopped a lot of the spam–we deleted the guest user account. This is the user in the database that has an ID of -1. I searched and searched for a way to disable guest posting, to no avail. With the guest account deleted people see an error message if they explicitly log out, but at least it prevents spam from non-registered posters.

Using screen-scraper to automatically test embedded devices

Posted in Miscellaneous, Thoughts on 09.12.06 by Todd Wilson

A while back I flew out to Huntsville, AL to work with a government contractor company on automating the testing of embedded devices. To this day I’m not entirely sure what these little machines did, but they each had a web interface that needed testing (much like that of a wireless router, if you’ve worked with those before). This isn’t the most common usage for screen-scraper, but it turned out to be just what they needed.

I worked closely with Greg Chapman, one of their engineers, and he recently wrote an article on the experience entitled Testing aerospace UUTs leads to Web solution. Greg’s a smart guy, and has continued to use screen-scraper in ways that I wouldn’t have even considered.

It’s gratifying to see screen-scraper used in so many different ways, but it’s interesting that its versatility has almost been a curse to us at times. Our software can be used for all kinds of purposes, but we’re finding that, from a business standpoint, we’re often better off narrowing our focus to very specific applications. As one marketing expert we consulted with put it, “You guys have plastic.” Plastic is incredibly useful, but it gains value as you craft it into something with a specific purpose. I’m planning on blogging about this idea more later, but it’s interesting to consider the pros and cons of a general-purpose tool, like screen-scraper.

Developing software by the 15% rule

Posted in Miscellaneous, Thoughts on 08.24.06 by Todd Wilson

Writing software on a consulting basis can often be a losing proposition for developers or clients or both. There are too many things that can go wrong, and that ultimately translates into loss of time and money. The “15% rule” we’ve come up with is intended to create a win-win situation for both parties (or at least make it fair for everyone). Clients generally get what they want, and development shops make a fair profit. It’s not a perfect solution, but so far it seems to be working for us.

This may come as a surprise to some, but we make very little money selling software licenses. The vast majority of our revenue comes through consulting services–writing code for hire. Having now done this for several years, we’ve learned some hard lessons. On a few projects the lessons were so hard we actually lost money.

A few months ago I put together somewhat of a manifesto-type document intended to address the difficulties we’ve faced in developing software for clients. I’m pleased to say that it’s made a noticeable difference so far for us. My hope is that this blog entry will be read by others who develop software on a consulting basis, so that they can learn these lessons the easy way rather than the way we learned them.

What follows in this article is a summary of one of the main principles we now follow in developing software–the 15% rule. If you’d like, you’re welcome to read the full “Our Approach to Software Development” document.

For the impatient, the 15% rule goes like this…

Before undertaking a development project we create a statement of work (which acts as a contract and a specification) that outlines what we’ll do, how many hours it will require, and how much it will cost the client. As part of the contract we commit to invest up to the amount of time outlined in the document plus 15%. That is, if the statement of work says that the project will take us 100 hours to complete, we’ll spend up to 115 hours (but no more). As to where-fores and why-tos on how this works, read on.
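
For those who like to see the arithmetic, here is a small sketch of the cap. The function names and the three-way breakdown are just for illustration and don't come from our statement of work. All the rule itself says is that we'll put in up to the estimate plus 15% and no more.

```python
def hour_cap(estimated_hours, buffer_percent=15):
    """Most hours we commit to: the estimate plus the buffer."""
    # Integer percent keeps 100 hours -> exactly 115.0, with no float drift.
    return estimated_hours * (100 + buffer_percent) / 100


def split_hours(estimated_hours, actual_hours, buffer_percent=15):
    """Break the hours a project needs into estimate, buffer, and overflow."""
    cap = hour_cap(estimated_hours, buffer_percent)
    buffer_size = cap - estimated_hours
    return {
        # Hours covered by the original estimate.
        "within_estimate": min(actual_hours, estimated_hours),
        # Extra hours absorbed by the buffer.
        "buffer_used": min(max(actual_hours - estimated_hours, 0), buffer_size),
        # Hours past the cap, which fall outside the commitment.
        "beyond_cap": max(actual_hours - cap, 0),
    }


if __name__ == "__main__":
    print(hour_cap(100))
    print(split_hours(100, 120))
```

So a 100-hour project that really needs 120 hours uses the full 15-hour buffer and leaves 5 hours outside what we committed to.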

Those who have developed software for hire know that the end product almost never ends up exactly as the client had pictured. There are invariably tweaks that will need to be made (that may or may not have been discussed up front) in order to get the thing to at least resemble what the client has in mind. And, yes, this can happen even if you spend hours upon hours fine-tuning the specification to reflect the client’s wishes. Additionally, technical issues can crop up that weren’t anticipated by the programming team. In theory, the better the programming team the less likely this should be, but it doesn’t always end up that way (Microsoft’s Vista operating system is a sterling example). These two factors, among others, equate to the risk that is inherent in the project. Something isn’t going to go right, and that will almost always mean someone pays or loses more money than originally anticipated. The question is, who should be responsible to account for those extra dollars?

Up until relatively recently, we would shoulder almost all of the risk in our projects. If the app didn’t do what the client had in mind, or if unforeseen technical issues cropped up, it generally came out of our pockets. For the most part it wasn’t a huge problem, but it always seemed to have at least some effect (the extreme cases obviously being when we lost money on a project).

This seems kind of unfair, doesn’t it? The risk inherent to the project isn’t necessarily the fault of either party. It’s just there. We didn’t put it there, and neither did the client. As such, it shouldn’t be the case that one party shoulders it all. That’s where the 15% rule comes in.

The 15% rule allows both parties to share the risk. By following this rule, we’re acknowledging that something probably won’t go as either party intended, so we need a buffer to handle the stuff that spills over. By capping it at a specific amount, though, we’re also ensuring that the buffer isn’t so big that it devours the profits of the developers.

For the most part, the clients with whom we’ve used the 15% rule are just fine with it. It is a pretty reasonable arrangement, after all. We have had the occasional party that squirms and wiggles about it, but, in the end, they’ve gone along with it and I think everyone has benefited as a result.
