Getting legacy PHP projects into shape

By: Thijs Zumbrink
20-10-2016 17:01

Schalpoen and some related sites have recently undergone some much-needed infrastructure changes! This article explains the transition to the new situation, or: how to get a legacy PHP mess into shape.

Before we start out, a quick overview of the "before" situation:
• No version management
• View changes in production
• Written in a text editor (Geany)
• PHP 5.2 compatible code
• No dependencies or build tools at all
• Shared hosting
• Upload manually via FTP

The "after" looks quite different:
• Version management in Git
• View changes locally
• Development in Netbeans
• Code runs on PHP 7
• Composer for dependency management
• Hosting on a private server
• Automated build and deployment over SSH using Phing

There are various other advantages as well: it's now trivial to run the project locally and to start building a test suite. I already set up new projects in this fashion, but now all the legacy code is modernized as well. How did I get there?

Version management

The ideal place to start. I've grown so accustomed to version management during development that it seems unthinkable to go without. I dumped all the old code from the old FTP server, cleaned it up (removing runtime artifacts etc.) and put it into a Git repository.

For good measure I moved the source files from (project)/ into (project)/app/web/ for future refactoring and additions. In the original setup all files were located in or below the web root, as the hosting company did not support placing files above the web root. That restriction is finally gone, which improves security and also provides a good place for development files such as a test suite.
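As a rough sketch of these first steps (not my literal commands), the cleanup, the move into app/web/ and the first commit could look like the shell session below. The project name, the cache directory standing in for runtime artifacts and the committer identity are all hypothetical, and the demo runs in a throwaway temporary directory.

```shell
# Work in a throwaway directory; "legacy-site" is a hypothetical project name.
project="$(mktemp -d)/legacy-site"
mkdir -p "$project/cache"
echo '<?php echo "hello";' > "$project/index.php"   # stand-in for the FTP dump
echo 'stale' > "$project/cache/page.tmp"            # stand-in runtime artifact

cd "$project" || exit 1

# 1. Clean up: drop runtime artifacts before they reach version control.
rm -rf cache

# 2. Move everything that used to sit in the web root into app/web/.
mkdir -p app/web
for entry in * .[!.]*; do
    [ -e "$entry" ] || continue          # skip glob patterns that matched nothing
    [ "$entry" = "app" ] && continue     # do not move app/ into itself
    mv "$entry" app/web/
done

# 3. Put the result under version control (skipped if git is not installed).
if command -v git >/dev/null 2>&1; then
    git init -q .
    git add -A
    # Identity passed inline so the sketch does not depend on global git config.
    git -c user.name="Example" -c user.email="example@example.invalid" \
        commit -q -m "Import legacy code from FTP dump"
fi
```

Doing the move before the first commit keeps the history simple; doing it in a separate second commit would instead make the restructuring visible in the log.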

The repo is pushed to GitLab for cloud storage with all its benefits. Once the code is cleaned up (and not a moment before!) I might open it up on GitHub as well, but that is a future goal.

Local execution

There is nothing more stressful than not being able to see what your code does until after you put it live, at least when it turns out not to work. Without a proper history it becomes a rush to undo the last change you made and hope that it was indeed the culprit! This stemmed from me not having a local execution environment set up.

To solve this it seems logical to set up a local web server and a PHP installation, but actually the PHP installation alone is enough. As of version 5.4, PHP has a built-in web server that works well for development purposes. No more deploying faulty code!
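For illustration, starting the built-in server comes down to a single command: php -S with an address and, optionally, -t for the document root. The sketch below wraps it in a small helper that only prints the command; the helper name, port 8000 and the app/web document root are assumptions rather than my exact setup.

```shell
# Hypothetical helper: print the command that serves a document root with
# PHP's built-in web server (meant for development only).
serve_command() {
    # $1 = document root (assumed default: app/web)
    # $2 = local port    (assumed default: 8000)
    printf 'php -S localhost:%s -t %s\n' "${2:-8000}" "${1:-app/web}"
}

# Show the command; run it from the project root to start serving.
serve_command
```

With the defaults this prints "php -S localhost:8000 -t app/web"; the server then keeps running in the foreground and logs each request to the terminal.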

An added benefit of working locally is that the project tends to be more portable, since it will always have to run on at least two locations.

Development environment

As said before, the code was all written in a text editor (Geany). This was in a dark past where I didn't see enough merit in using an IDE and the available options for PHP were still limited.

I now use Netbeans. It's very convenient to launch the built-in web server, it has debugging, testing integration, Phing integration for building and deployment, composer integration for dependency management, and much more! But also the simpler features like code completion, "go to declaration" and live syntax checking are extremely helpful.

Netbeans 8 currently has no official PHP 7 support, so I use the nightly build of Netbeans 9, which does. And it works perfectly!

Upgrade code to PHP 7

I do my development on Arch Linux, where the system packages are always up to date with the latest upstream versions. This means that using PHP 7 (at least when installed via the package manager) is not optional, which is probably a good thing.

The original PHP 5.2 code (some parts even PHP 4!), though surprisingly compatible, did have some issues. Most of these related to notices about calling non-static functions statically, missing array keys and things like that.

Among the larger issues is the removal of the mysql_*() functions in PHP 7, which is not hard to solve by switching to mysqli_*(). For my projects I changed to PostgreSQL instead, since that already ran on the private server, but some projects will probably move to SQLite soon to enhance portability and ease of setup.
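To get a feel for how much work the mysql_*() removal is, it helps to list every call site first. The helper below is a hypothetical sketch built from plain find and grep; the function name and the demo file are made up, and the pattern is deliberately simple (it will also flag matches inside comments or strings).

```shell
# Hypothetical helper: list calls to the removed mysql_*() functions in all
# *.php files below a directory, without matching mysqli_*() calls.
find_mysql_calls() {
    # /dev/null is passed as an extra file so grep always prints file names.
    find "$1" -name '*.php' -exec \
        grep -nE '(^|[^A-Za-z0-9_])mysql_[a-z_]+[[:space:]]*\(' /dev/null {} +
}

# Demo on a throwaway file with one old-style and one new-style call.
demo="$(mktemp -d)"
printf '%s\n' \
    '<?php' \
    '$r = mysql_query("SELECT 1");' \
    '$s = mysqli_query($db, "SELECT 1");' > "$demo/legacy.php"

matches="$(find_mysql_calls "$demo")"
echo "$matches"   # only line 2 of legacy.php is reported
```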

Composer

Since every PHP project uses Composer (and those that don't, should!) I added a simple composer.json file to every project. Even though the production dependencies for each project are empty, by already integrating this into the workflow future refactoring becomes really easy.

I also committed composer.lock to the repository. I follow the rule: "for libraries, do not commit composer.lock, for projects, do commit composer.lock." The reason is that it makes sense for libraries to be as flexible as possible, and for mistakes to surface early. For projects however, it makes most sense to lock all dependencies to specific versions and upgrade manually.

Actually each project has one development dependency: Phing. More on that later.

Private hosting

Silvan and I had used a shared hosting provider since 2006 because it was cheap and easy. However, prices rose steadily over those 10 years until we finally decided to leave when it went too far.

Now I use TransIP to manage the domain name and e-mail forwarding, and DigitalOcean for the private server a.k.a. droplet. The droplet (set up earlier by Martina) runs Ubuntu server with Nginx and PostgreSQL and was already home to several Ruby projects.

Each project now houses its own nginx.conf file that manages the web server configuration. Among other things, it points to the socket of a running PHP-FPM server. The file is shipped with every deployment and is the target of a symlink in the server's nginx configuration, so no config hassle is needed (apart from a "reload" command to nginx).
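Sketched with stand-in paths, that arrangement looks roughly like this: the deployed nginx.conf lives inside the release, and a symlink in nginx's include directory points at it. The directory names and the link name project.conf are assumptions (on a real Ubuntu server the link would typically live in /etc/nginx/sites-enabled/), and the demo uses a temporary directory so it runs without root.

```shell
# Stand-in directories so the sketch runs anywhere without root.
root="$(mktemp -d)"
target="$root/var/www/project"                  # hypothetical deployment target
sites_enabled="$root/etc/nginx/sites-enabled"   # hypothetical include directory
mkdir -p "$target/current" "$sites_enabled"

# The project ships its own web server config with every deployment.
printf '%s\n' '# per-project nginx server block goes here' \
    > "$target/current/nginx.conf"

# One-time setup: nginx's include directory links to the deployed file.
# -s symbolic link, -f replace an existing link, -n do not follow an old link
ln -sfn "$target/current/nginx.conf" "$sites_enabled/project.conf"

# After each deployment the link resolves to the new file automatically;
# nginx only has to re-read its configuration (needs privileges on a server):
#   nginx -t && nginx -s reload
```

Running nginx -t before the reload checks the new configuration for syntax errors, so a broken nginx.conf in a release does not take the running server down.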

The droplet still runs PHP 5 until we upgrade the OS, but it's not hard to keep code with legacy roots compatible with both PHP 5 and 7.

Build and deployment automation

Finally, this is perhaps the biggest quality-of-life improvement. Instead of uploading files one by one to the FTP server, making mistakes in the process, deployments are now complete, automatic and atomic. All secret configuration is kept in an unversioned config file.

To this end I set up a Phing build script that contains two targets: "build" and "deploy".

"build" copies (project)/app, (project)/config and any other relevant source files into the build directory. Then it copies composer.json and composer.lock there, runs composer in production mode, and removes the .json and .lock files again.

Later this will also perform other tasks such as testing and CSS/JS preprocessing+minifying.
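My build runs through Phing; purely as an illustration of the same steps, here is a shell sketch of "build". The function name, the build directory and the COMPOSER override are hypothetical, and the demo substitutes a no-op for Composer so it runs on machines without it. The --no-dev and --optimize-autoloader flags are the usual Composer options for a production install.

```shell
# Hypothetical shell equivalent of the "build" step.
# $1 = project directory, $2 = build directory (recreated on every run).
build_project() {
    rm -rf "$2" && mkdir -p "$2" || return 1

    # Copy the sources that belong in a release.
    cp -R "$1/app" "$1/config" "$2/" || return 1

    # Composer needs its manifest next to the vendor/ directory it creates.
    cp "$1/composer.json" "$2/" || return 1
    if [ -f "$1/composer.lock" ]; then
        cp "$1/composer.lock" "$2/" || return 1
    fi

    # Production install: no dev dependencies, so Phing stays out of the release.
    ( cd "$2" && ${COMPOSER:-composer} install --no-dev --optimize-autoloader ) \
        || return 1

    # The manifest files are not needed at runtime.
    rm -f "$2/composer.json" "$2/composer.lock"
}

# Demo on a throwaway project; COMPOSER=true swaps in a no-op for Composer.
demo="$(mktemp -d)"
mkdir -p "$demo/src/app/web" "$demo/src/config"
echo '<?php echo "hi";' > "$demo/src/app/web/index.php"
echo '{}' > "$demo/src/composer.json"
COMPOSER=true
build_project "$demo/src" "$demo/build"
ls "$demo/build"
```

Building into a separate directory means the working copy is never modified, and whatever ends up in the build directory is exactly what gets uploaded.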

"deploy" takes this result and copies it to the server using SSH and SCP. It places the files in (target)/deploying. Once that is finished, (target)/current is renamed to (target)/previous and (target)/deploying is renamed to (target)/current.
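The rename dance on the server can be sketched in a few lines of shell. The function names and the extra "broken" directory used during a rollback are hypothetical; in a real deployment these commands would be sent over SSH after the SCP upload, whereas the demo below runs locally in a temporary directory.

```shell
# Hypothetical helper: promote (target)/deploying to (target)/current,
# keeping the old release as (target)/previous. $1 = target directory.
promote_release() {
    [ -d "$1/deploying" ] || { echo "nothing to promote" >&2; return 1; }
    rm -rf "$1/previous"                     # keep only one old release
    if [ -d "$1/current" ]; then
        mv "$1/current" "$1/previous" || return 1
    fi
    mv "$1/deploying" "$1/current"
}

# Hypothetical helper: put the previous release back, parking the failed
# one in (target)/broken for inspection.
rollback_release() {
    [ -d "$1/previous" ] || { echo "no previous release" >&2; return 1; }
    rm -rf "$1/broken"
    mv "$1/current" "$1/broken" && mv "$1/previous" "$1/current"
}

# Demo: v1 is live, v2 has just been uploaded.
target="$(mktemp -d)"
mkdir -p "$target/current" "$target/deploying"
echo v1 > "$target/current/VERSION"
echo v2 > "$target/deploying/VERSION"

promote_release "$target"
cat "$target/current/VERSION"    # v2 is live now
rollback_release "$target"
cat "$target/current/VERSION"    # back to v1
```

Both renames stay on the same filesystem, so each one is a quick metadata operation; the only gap is the brief moment between the two mv calls in which (target)/current does not exist.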

With this rename scheme it's always easy to roll back when a deployment goes wrong. It also means the deployment is (more or less) atomic, since the renames happen fast. Alternatively a symlink could be used, which makes it easier to keep more than one "previous" version: simply name the "deploying" directory after something unique, such as a timestamp, and point the symlink to it once the transfer is complete.

Any data shared between deployments (such as user uploads) is stored in (target)/shared and not touched by the script. The path to shared data is managed via the application config. Logs are written to (target)/log and are therefore also independent of deployments.
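A minimal sketch of the symlink variant, under the assumption that releases live in (target)/releases/ and that a relative symlink named "current" selects the live one. The function name, the layout and the example timestamps are hypothetical; in practice the directory name could come from something like date +%Y%m%d-%H%M%S.

```shell
# Hypothetical helper: point (target)/current at one uploaded release.
# $1 = target directory, $2 = release directory name under releases/.
activate_release() {
    [ -d "$1/releases/$2" ] || { echo "unknown release: $2" >&2; return 1; }
    # -s symbolic link, -f replace the old link, -n treat the existing
    # "current" link as a file instead of descending into the old release.
    # Note: ln replaces the link in two quick steps, so this is fast but not
    # strictly atomic.
    ln -sfn "releases/$2" "$1/current"
}

# Demo: upload and activate two releases, then roll back to the first.
target="$(mktemp -d)"
for stamp in 20161020-170100 20161021-090000; do     # example timestamps
    mkdir -p "$target/releases/$stamp"
    echo "$stamp" > "$target/releases/$stamp/VERSION"
    activate_release "$target" "$stamp"
done
cat "$target/current/VERSION"    # the newest release is live

# Rolling back is just pointing the link at an older release again.
activate_release "$target" 20161020-170100
cat "$target/current/VERSION"
```

Old releases stay on disk until they are removed explicitly, so a cleanup step that keeps only the last few directories would be a natural addition.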

Conclusion

While it may feel too "enterprisey" for small projects, it's not that complex. If anything, it eliminates mistakes and opens the door for true refactoring. By automating tasks, you keep a sort of "manual" for your future self on how things work, and of course version management helps a great deal with that too.

See examples of typical files used in the project and build config:
• composer.json
• build.xml
• build.properties
• nginx.conf

Future goals

Some things not addressed here but certainly worth a look:
• Managing multiple PHP versions with phpenv
• Using Docker or Vagrant to make the project truly portable
• Continuous integration and continuous deployment, for example with GitLab

More immediate goals for my own projects:
• Adding high-level automated tests
• Restructuring code using PSR-4
• Abstracting database access to more easily swap implementations
• Virtually everything else mentioned in mlaphp
• Making the code open source