May 15, 2008

My thoughts on small-scale Drupal development to production environments with CVS and Subversion

There has been a boatload of discussion amongst the Drupal community regarding best practices for managing development, staging and production environments with a Drupal codebase. The reason this is usually a sore subject for many Drupalers lies in Drupal's heavily database-dependent site configuration and management. Thus, it becomes more difficult to manage Drupal sites across different environments with the tools typically used for this.

Software to help us

Many developers are used to managing software codebases with either CVS or Subversion (both are revision control systems). These systems make it easy to manage file-based software releases, rollbacks, development branches, etc. However, because Drupal keeps so much of a site's configuration in the database rather than in files, putting a Drupal codebase under revision control captures less of the site than it would with other software.

So what exactly are the options we have for managing our sites? Well, I'm going to run through the process by which we are currently managing a live site between development and production (sans-staging). I've also linked to many other articles and theories on this topic at the end of this post.

A few months ago, I wrote a blog post on Painless Drupal revision control with CVS and Subversion on a shared host. That post is a good read for those interested in simply getting up and running with a Drupal codebase utilizing CVS and Subversion for local revision control, as well as easy upgrades from Drupal.org's CVS. While that blog post focuses on simply getting set up, this post is geared more towards the issues we currently face with that setup, the proposed workarounds, and the strategies we personally implement.

A sample scenario

I'll start off with the site that we're currently developing and have already launched. The site is currently sneaking by under the radar, and we're going to keep it like that for a while, so we'll refer to the site as 'Project X'.

Project X began life as a simple CVS checkout from Drupal.org onto my local machine. At that same time, I also ran CVS checkouts of all the modules that I knew I would need for the project. When I had that fairly set, I imported the entire project into our corporate Subversion repository. I then deleted the codebase from my local machine, and checked out a working copy of the project back onto my machine. The project is being developed by myself and one other developer, so he checked out a copy to his local machine as well.

Drupal installation

I went about installing Drupal as normal, knowing that I'd be storing development connection settings in our /sites/default/settings.php. This is so when we release the software, we would be more specific with the definition of the settings.php file in /sites/projectx.com/settings.php. With that setup, we can retain the same codebase for both development and production environments. Drupal will look for 'projectx.com' on both servers (dev and prod), but since the development 'servers' are simply our local machines, it will fall back to /sites/default. Within our /sites/default/settings.php, we pointed the database to a MySQL server we run in-house that we can both connect to.
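The fallback described above can be sketched in a few lines of shell. This is a deliberately simplified illustration, not Drupal's actual lookup code: real Drupal also tries shortened hostnames, ports and subdirectories, and the function name find_settings is made up for this example.

```shell
#!/bin/sh
# Simplified sketch of how a settings.php is chosen for a hostname.
# Assumption: only sites/<host> and sites/default are considered here;
# Drupal's real lookup tries more candidate directories than this.
find_settings() {
  drupal_root=$1
  host=$2
  if [ -f "$drupal_root/sites/$host/settings.php" ]; then
    # A directory matching the requested hostname wins.
    echo "sites/$host/settings.php"
  else
    # Anything else (e.g. localhost on a dev machine) falls back.
    echo "sites/default/settings.php"
  fi
}

# Example: find_settings /path/to/drupal projectx.com
```

On production the request comes in for projectx.com, so the specific directory matches. On our local machines the hostname is something else, so the default settings apply.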

At this point, it should be noted that the codebase we both have checked out is from 'trunk'. We always develop on trunk. That is, of course, until we have a reason to branch off into separate branches. This is a smaller project, however, so we simply build on trunk for now.

Drupal configuration and customization

I go about building the theme within my trunk checkout, committing changes, adding files, etc. Our other developer, we'll call him 'Pete', is hacking away at a new module we're building to take care of some special functionality. He's committing his changes, too. Every once in a while, we'll tell each other to run updates to grab the latest code from trunk. This is especially important when adding new modules. If you need to add a new module into trunk, download (or CVS checkout) the module into your codebase, then add and commit to trunk. Before you enable the module, tell Pete to run an update on his codebase (he'll probably have no clue what you're talking about). We don't want to enable a module, changing the shared database, and then have Mr. Pete access the site with the module enabled in the database but no module files to support it. In fact, I'm not really sure what would happen; perhaps a black hole, probably nothing. Either way, I'll leave it to someone else to find out.

That's pretty much it for developing pre-production. We make our changes, have our fun, build some stuff, etc. The fun times come when we want to launch the site on our production server.

Before you release your first version, you'll want to set up your settings.php for the production site. Create the directory /sites/projectx.com and copy the 'settings.php' from /sites/default into it. Modify the projectx.com settings.php, specifically the $db_url (line 93 for Drupal 6), so that it points to your production database. That'll be it for the settings.php file.
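As a rough sketch, the directory setup might look like the following when run from the Drupal root of a working copy. The function name add_production_settings is only for illustration, and the edit to $db_url still has to be done by hand afterwards.

```shell
#!/bin/sh
# Sketch: create a site-specific settings directory from the default one.
# Assumption: the current directory is the Drupal root of a working copy.
add_production_settings() {
  site=$1
  mkdir -p "sites/$site"
  # Start from the development settings, then edit $db_url in the copy.
  cp sites/default/settings.php "sites/$site/settings.php"
}

# Example: add_production_settings projectx.com
# Afterwards, edit sites/projectx.com/settings.php so that $db_url uses
# the production credentials, in the Drupal 6 form
#   mysql://username:password@localhost/databasename
# and then svn add and commit the new directory.
```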

Now you'll need to dump the development database and import it into the production database that you've set up. Since this is the first release, you don't need to worry about overwriting anything.

Tagging our first release

Once we've finished developing on our local machines, have duplicated the development database to the production database, and have finished our final commits from both machines to the repo, we're ready to check out a copy onto the production server. However, before we do that, we should keep in mind our future development patterns. We will surely want to be able to continue developing on trunk while not having to worry about our production codebase. For that, we use 'tags'. Each time we have a software release we feel is ready for production, we tag a new version, and switch the production working copy to the latest release.

The quickest way to do this is to SSH to the server that hosts your repositories. The following command (svn copy) will copy your current trunk build to your very first tag:

svn copy file:///path/to/repos/project-x/trunk file:///path/to/repos/project-x/tags/REL-1-0-0 -m 'Tagging release 1.0.0'

Once that's set, you're ready to check out the tagged release to your production server. Head over to the server, and check out the 1.0.0 release:

svn checkout svn+ssh://user@domain.com/path/to/repos/project-x/tags/REL-1-0-0 working-copy-directory

If you've set up the settings.php correctly, the site should be good to go. That's it for the initial launch. The site's done, right? Wrong.

Post-launch

Now that the site is live and accumulating data, we need to change our development habits. The development database is no longer the 'master', as there have been changes to the production database that we don't want to overwrite with development data. While we haven't devised a brilliant solution for merging development and production data, we've realized that we don't really need to.

When we're ready to begin a new 'development cycle', we clone the production database, and completely dump and rebuild the development database from it. I wrote a stupid-quick production-to-development bash script to handle this for us. Much easier than doing it manually, anyway. This is by no means a cutting-edge development process, but it seems the most logical for us. This is a fairly small project that doesn't really warrant some of the more in-depth development environments that I've linked to at the end of this article.
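For readers who want a starting point, here is a minimal sketch of what such a refresh script could look like. It is not the script mentioned above. The host names, database names and the DRY_RUN switch are all assumptions made for this example, and MySQL credentials are assumed to come from ~/.my.cnf.

```shell
#!/bin/sh
# Sketch of a production-to-development database refresh.
# Assumptions (not from this post): the host and database names below,
# and MySQL credentials supplied by ~/.my.cnf on the machine running this.
PROD_HOST=${PROD_HOST:-prod.example.com}
PROD_DB=${PROD_DB:-projectx_prod}
DEV_HOST=${DEV_HOST:-dev.example.com}
DEV_DB=${DEV_DB:-projectx_dev}

clone_prod_to_dev() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Default: only print the pipeline so it can be reviewed first.
    echo "mysqldump --host=$PROD_HOST --add-drop-table $PROD_DB | mysql --host=$DEV_HOST $DEV_DB"
  else
    # --add-drop-table makes the import replace each dumped table.
    mysqldump --host="$PROD_HOST" --add-drop-table "$PROD_DB" |
      mysql --host="$DEV_HOST" "$DEV_DB"
  fi
}

# Review the command first:  clone_prod_to_dev
# Then run it for real:      DRY_RUN=0 clone_prod_to_dev
```

Note that this only replaces tables that exist in the production dump. Tables that exist only in development are left alone, so a truly complete rebuild would drop and recreate the development database first.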

So now that we've cloned the production DB to the development DB, we've got all of the content available to us for testing with. The majority of our development is done in two areas:

Theme development
Custom module development

Theme development is heavily (if not entirely) file-based, so this development strategy caters well to that. Custom module development is heavily file-based, but can also be heavily database-dependent. We find that, even without a solid development-to-production database migration process, manually setting up the module in production really isn't that much work. When I first delved into this problem, I wanted a solid, complete and foolproof solution to migrating development database changes to production. Unfortunately, that just isn't available, and once I came to terms with that, I realized I'm not all that upset about it.

If you develop often, and release often, you'll probably agree with me. Surely, if you're building 4 new themes, 20 new modules, installing 6 contrib modules, and expecting to not have to do any work when migrating to production, you're in for a treat. If you're doing that, however, shouldn't you have rolled that into your initial release?

Ah, I digress. So that's our general strategy. So what happens when we're ready to release our new-fangled changes on development?

Releasing upgrades

When we release upgrades to the software, we simply create a new tag. When you're ready to tag the current trunk build as a new release, simply:

svn copy file:///path/to/repos/project-x/trunk file:///path/to/repos/project-x/tags/REL-1-0-1 -m 'Tagging release 1.0.1'
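If you tag often, a tiny helper can work out the next tag name for you. This is only a sketch that assumes the REL-major-minor-patch naming used in this post and bumps the last number. The function name next_patch_tag is made up for the example.

```shell
#!/bin/sh
# Sketch: derive the next patch-level tag from the current one.
# Assumption: tags are named REL-<major>-<minor>-<patch>, as in this post.
next_patch_tag() {
  patch=${1##*-}     # text after the last hyphen, e.g. "0"
  prefix=${1%-*}     # text before the last hyphen, e.g. "REL-1-0"
  echo "$prefix-$((patch + 1))"
}

# Example use with the svn copy command shown above:
#   svn copy file:///path/to/repos/project-x/trunk \
#     "file:///path/to/repos/project-x/tags/$(next_patch_tag REL-1-0-0)" \
#     -m 'Tagging next release'
```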

Once you've done that, you're ready to upgrade your production checkout to the latest release. But, how?

We use the 'svn switch' method. Essentially, we're switching a current working copy to a new subversion project URL. Subversion takes care of the changes between those URLs, with the 'switch' command. When ready to release 1.0.1, we head to the production server working copy and:

svn switch svn+ssh://user@domain.com/path/to/repos/project-x/tags/REL-1-0-1

Subversion makes the appropriate changes to the working copy to reflect the differences between REL-1-0-0 and REL-1-0-1. Win.

Managing production filesystem changes

So now we're done, right? Not really. What happens if there are changes to the filesystem on your production server, such as user file uploads, pictures, etc? Let's say Jon uploads a picture of a drunk cat for his profile picture. We want those file changes to be stored in our repository, as well. You might not want to, and if that's the case, you can skip this part. If you do, that's where 'svn merge' comes in handy. The merge command will essentially 'merge' differences between two sources into a working copy.

Before you can merge the changes, you need to commit the appropriate changes you want merged to your tagged release. From the production checkout, run 'svn add' on the files that were added. Then, commit your changes. Be careful to not commit file modifications that you did not specifically want merged.
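To see which files are candidates for 'svn add', you can filter the output of 'svn status' for unversioned entries, which are the lines starting with '?'. The helper below is a small sketch of that filter. The name unversioned_paths is made up, and it only lists paths so you can review them before adding anything.

```shell
#!/bin/sh
# Sketch: print the paths that 'svn status' reports as unversioned.
# Reads 'svn status' output on stdin and keeps only the '?' lines,
# stripping the status column and the padding after it.
unversioned_paths() {
  sed -n 's/^?[[:space:]]*//p'
}

# Example, from the production working copy:
#   svn status | unversioned_paths
# Review the list, then 'svn add' only the files you want in the repo.
```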

You'll need to run the merge from a trunk checkout, since you want to merge the changes from a tagged release into trunk. From a working copy of trunk:

svn merge svn+ssh://user@domain.com/path/to/repos/project-x/trunk svn+ssh://user@domain.com/path/to/repos/project-x/tags/REL-1-0-1 --dry-run

You'll note the use of '--dry-run'. Run the command once as a 'dry run' to see the changes before you actually do them. This is very useful. When you're satisfied with the file changes, remove the '--dry-run' and re-run.

With an 'svn status', you'll see the local file modifications to your trunk checkout. If you're still happy, commit the changes to trunk, and you're done.

I always do the merge after (and only directly after) I upgrade the production copy to the latest tagged release. That way, the changes from tag to trunk only include the file changes or additions that occurred on production, and not file changes on trunk.

So that's about it for our entire development lifecycle.

Other options

The above solution will probably only suffice for small-scale Drupal productions. It may or may not be what you're looking for. Fortunately, there are many brilliant minds in the Drupal community, and there are quite a few alternatives for 'development to staging to production lifecycle' solutions: