How we use version control

Any reasonably sized software development project benefits greatly from a version control system such as CVS, Subversion, or Git.  Internally we use Subversion, and I thought it might be helpful to share a bit about how we go about it.  What I describe here applies primarily to a project where many scrapes are being developed by multiple developers, but we use Subversion even for small projects handled by a single developer.

Each developer on a project will have his own instance of screen-scraper, but may be using some scraping sessions and scripts that are also used by other developers.  Generally speaking, though, a given developer is in charge of a certain set of scraping sessions, and we have a series of general scripts that might be used by all developers.  These general scripts can be edited by anyone, but when edits are made everyone needs to be notified so that they can update their own instances of screen-scraper with the latest scripts.  Each time a new scraping session is created or an existing scraping session is modified, it gets exported and then committed to the repository.  This isn’t quite as automated as some IDEs allow, so developers need to be conscientious about exporting and committing at the appropriate times.
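To make the "export, then commit" habit a little easier to keep, the commit half can be wrapped in a small helper.  What follows is only a minimal sketch in Python, not a script we ship: it assumes the exported files have already been saved into a Subversion working copy, and the function names and the example path are hypothetical.  The only real commands it relies on are `svn add --force` and `svn commit -m`.

```python
import subprocess


def build_commit_commands(working_copy, message):
    """Return the svn commands that add and commit freshly exported files.

    `working_copy` is assumed to be a Subversion working copy that the
    exported scraping sessions and scripts were saved into (hypothetical layout).
    """
    return [
        # --force lets svn recurse into an already-versioned directory
        # and pick up any files that are not yet under version control.
        ["svn", "add", "--force", working_copy],
        ["svn", "commit", "-m", message, working_copy],
    ]


def commit_exports(working_copy, message):
    """Run the commands in order, stopping at the first one that fails."""
    for command in build_commit_commands(working_copy, message):
        subprocess.run(command, check=True)
```

A developer would call something like `commit_exports("exports", "Updated the search scraping session")` right after exporting from the workbench.  The export itself still has to be done by hand.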

We also often make use of debug scripts, which each developer will generally tailor to his own work.  It’s likely that he won’t want these scripts overwritten by those of other developers, so for each of these scripts he need only un-check the “Overwrite this script on import” box in the workbench to protect such a script.

We also typically keep a separate folder in our version control repository for the scripts that are general to a series of scraping sessions.  It’s possible that a particular developer has a slightly out-dated copy of one of these scripts, and when he exports a scraping session, that out-dated script may go along with it.  To keep it from getting imported into a production environment, we’ll copy all of the general scripts (which are always kept current) into screen-scraper’s “import” folder along with the scraping session(s) to be deployed.  screen-scraper will always import scraping sessions first, then scripts.  Because the current general scripts are imported last, you can guarantee that they don’t get overwritten by out-dated copies that came along with a scraping session.
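The staging step described above can be sketched as a short Python function.  This is an illustration under assumptions rather than our actual deployment tooling: the function name and the folder arguments are hypothetical, and it simply copies whatever files it is given without assuming anything about their format.  Note that the order in which files are copied here doesn't matter; it is screen-scraper itself that imports scraping sessions before scripts.

```python
import shutil
from pathlib import Path


def stage_for_import(session_files, general_scripts_dir, import_dir):
    """Copy exported scraping sessions and the current general scripts
    into the import folder, and return the names of the copied files."""
    import_dir = Path(import_dir)
    import_dir.mkdir(parents=True, exist_ok=True)
    copied = []

    # The scraping session(s) to be deployed.
    for session in map(Path, session_files):
        shutil.copy2(session, import_dir / session.name)
        copied.append(session.name)

    # Every general script from the always-current repository folder.
    for script in sorted(Path(general_scripts_dir).iterdir()):
        if script.is_file():
            shutil.copy2(script, import_dir / script.name)
            copied.append(script.name)

    return copied
```

Here `general_scripts_dir` would point at a fresh checkout of the general-scripts folder from the repository, and `import_dir` at the “import” folder of the production screen-scraper instance.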

Because screen-scraper doesn’t use a purely file-based approach to persist its objects, version control can require another step or two beyond what you’d normally find in a modern-day IDE.  Our experience has been, though, that once developers get accustomed to it, it’s not too burdensome.  That said, we plan to add features in the near future that will make working with version control systems even easier with screen-scraper.
