Recently I have had to don the somewhat ill-fitting hat of 'Build Manager' on a significant .Net-based software development project. It has been a couple of years since I last worked directly with build scripts. My last few such experiences have been of the Ant and CruiseControl variety, building both Java web applications and sets of web service-enabled components. Some aspects of those builds have been a little unusual including, for example, automated transformations from UML models to WSDL and XSD files using OMG's QVT language. Anyway, this is my very first time automating builds using Microsoft's Team Foundation Server (TFS), Visual Studio (VS), MSBuild, MSTest and Team Build tooling (all 2008 vintage, 2010 being only available as beta at the time). So adorned with my new 'Build Manager' hat, I looked forward to learning more about this important area of software development, and set off on my .Net build adventures.
The project that my build and tools team needs to build is split into the usual architectural or logical layers. The project also uses NHibernate, and this removes the need for an explicit Data Management (DM) layer. Therefore, the team is left with three layers to build:
- the Problem Domain (PD) layer
- the System Integration (SI) layer
- the User Interface (UI) layer
As usual, both the SI and UI layers have compile-time dependencies on the PD layer, and, as usual, the layers have no other compile-time dependencies between them.
As a developer, I have come to expect a build to do something roughly equivalent to the following sequence of steps:
- Determine the build number for this build (initialize)
- Build each layer:
  - PD first (component-build),
  - then the dependent SI layer (component-build),
  - then the dependent UI layer (component-build)
- Package the results into one or more deployable artifacts (package)
- Deploy to a test environment (deploy)
- Run integration-level tests (integration-test)
- Generate, package and publish various build results and reports to the team's Knowledge Management System (KMS) (publish)
- Announce the availability of yet another successful build ready for deployment to some appropriate environment (notify)
When building each of the layers (component-build), I expect a build to do something roughly equivalent to the following steps:
- Prepare the build environment (clean)
- Retrieve the appropriate source code and supporting files from the version control system (get)
- Label the retrieved files with the build number (label)
- Compile the source code (compile)
- Generate database schema and data population scripts if needed
- Execute unit tests (test)
- Run static code analysis (check)
- Generate API documentation (api-gen)
- Derive a Bill of Materials (bom) listing new features, fixed defects, outstanding bugs, etc.
Even if the architecture is layered differently or instead split into a number of vertical components instead of layers, the sequences of steps remain essentially the same.
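As a rough illustration only, the two sequences above can be sketched in Python as an outer build that invokes the same component build once per layer. Nothing here performs real work; each step is simply recorded. The step names follow the lists above, except "label" and "db-scripts", which are assumed names for the two steps the lists do not name.

```python
from typing import List

# Layers in dependency order: SI and UI both depend on PD.
LAYERS = ["PD", "SI", "UI"]

# Step names follow the lists above; "label" and "db-scripts" are assumed
# names, since the lists give no short name for those two steps.
COMPONENT_STEPS = ["clean", "get", "label", "compile", "db-scripts",
                   "test", "check", "api-gen", "bom"]


def component_build(layer: str, log: List[str]) -> None:
    """Run the per-layer steps in order (each step is only recorded here)."""
    for step in COMPONENT_STEPS:
        log.append(f"{layer}:{step}")


def full_build(layers: List[str]) -> List[str]:
    """Run the outer sequence, invoking the component build once per layer."""
    log: List[str] = ["initialize"]
    for layer in layers:
        component_build(layer, log)
    log.extend(["package", "deploy", "integration-test", "publish", "notify"])
    return log


if __name__ == "__main__":
    for entry in full_build(LAYERS):
        print(entry)
```

Splitting the architecture into vertical components instead of layers would change only the contents of `LAYERS`; the two sequences stay the same.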
For a build system like this there are a number of things I want parameterized. In addition to building all the layers or components, I want to be able to build each layer or component separately if needed. I also want the ability to build together different labeled versions of the layers or components. For example, I might want to build an older labeled version of UI and SI against a new version of PD to check that dependencies have not been broken by a set of changes within the PD layer. Then there is the desire to be able to do a very fast incremental build that starts with the results of a previous build and incorporates any changes made since. This is contrasted with the need to do a thorough build 'from scratch'. Finally there are the developers who want a 'desktop build' that essentially repeats steps 4, 5 and 6 from the second list for one or more layers or components so that they can check that their work passes all unit tests before they officially include it 'in the build'.
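One way to picture these parameters is as a small settings object from which the steps to run are derived. The Python sketch below is purely hypothetical: the field names, the "latest" default label, and the choice to model an incremental build as one that skips the clean step are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Per-layer steps from the second list; "label" and "db-scripts" are assumed names.
COMPONENT_STEPS = ["clean", "get", "label", "compile", "db-scripts",
                   "test", "check", "api-gen", "bom"]


@dataclass
class BuildSettings:
    """Hypothetical parameters covering the scenarios described above."""
    layers: List[str] = field(default_factory=lambda: ["PD", "SI", "UI"])
    labels: Dict[str, str] = field(default_factory=dict)  # layer -> version label
    incremental: bool = False  # reuse the results of a previous build
    desktop: bool = False      # developer build before check-in


def label_for(settings: BuildSettings, layer: str) -> str:
    """Return the label to retrieve for a layer; unlabeled layers use 'latest'."""
    return settings.labels.get(layer, "latest")


def steps_for(settings: BuildSettings) -> List[str]:
    """Derive the per-layer steps for the requested kind of build."""
    if settings.desktop:
        # Steps 4, 5 and 6 of the second list.
        return ["compile", "db-scripts", "test"]
    steps = list(COMPONENT_STEPS)
    if settings.incremental:
        # Assumption: an incremental build keeps the previous outputs.
        steps.remove("clean")
    return steps


if __name__ == "__main__":
    old_ui = BuildSettings(labels={"UI": "UI_1.2", "SI": "SI_1.2"})
    print([(layer, label_for(old_ui, layer)) for layer in old_ui.layers])
    print(steps_for(BuildSettings(desktop=True)))
```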
Build Script Architecture
Now if I was using Ant, I would be thinking along the lines of two Ant build scripts, one for each of the lists above with the first repeatedly invoking the second for each layer or component. Specifics such as directory roots, version control system folders, compiler settings, and so on would be listed in one or more separate property files passed as arguments to the two Ant scripts. The different property settings provide for most of the different build scenarios desired. Then I would look to CruiseControl or equivalent to schedule and manage frequent and regular build runs. For my new project I expected something similar but with the Microsoft Visual Studio Team System (VSTS) tools in the place of Ant and CruiseControl.
Even better, the project had been running for some time and there were build scripts already in place doing most of the pieces. Clearly all I needed to do was fill in the holes, rearrange some of the pieces, a tweak here, a tweak there and done! In the immortal words of Aleksandr Orlov, "Simples!" ... or, at least, that's what I thought! Sometimes I can be so naive!
A Document Markup Language, not a Scripting Language but ...
"If you only have a hammer, you tend to see every problem as a nail." Abraham Maslow
Reading the above, someone might mistake me for a big fan of Ant and the use of XML as a build scripting language. I most certainly am not. So, a short digression on the topic is in order.
The W3C explains that XML was "Originally designed to meet the challenges of large-scale electronic publishing...". In other words, XML was intended to be a document markup language. Unfortunately, because XML has an open, cross-platform, text-based format, people have bent it to fit many other tasks. Some of those tasks have required considerably more bending to fit than others. Defining build scripts in XML is arguably one such task.
Ant is the original XML-based build tool. It was devised by James Duncan Davidson, the original author of Tomcat. Presumably inspired by the success of Ant and its .Net cousin Nant, Microsoft's designers also chose to base MSBuild on XML. Of course, they also chose not to make MSBuild compatible with Ant or Nant.
I do not need to be convinced to look for a better alternative to Make and Perl. My experiences with build systems created with these are not fantastic; more often than not the memories are ones of frustration trying to make sense of incomprehensible build scripts when their author is on vacation. However, I find it hard to understand why the alternative must be XML-based. While Make and Perl have notoriously terse syntax, XML-based build scripts take us to the opposite end of the spectrum. The need for each statement, and each term within a statement, to be formed from either pairs of names inside angled brackets or explicitly named attributes means that the syntax of any language defined in XML is inevitably, horribly verbose. For example, the common targets file supplied with Microsoft Team Build is over 1500 lines long, and this is before any settings, overrides and custom extensions are considered.
Build Script Concepts: Tasks, Targets, and Properties
Despite my strong reservations, XML-based build tools prevail at the moment, and to stand a chance in the current build universe, it is important to understand the core concepts that these tools are built upon.
In essence, these XML-based build-tools are similar. They use XML documents to link together the execution of a number of targets and tasks.
The build-engine, agent, application, or run-time (whatever you want to call MSBuild.exe or Nant.exe) is passed an XML build file that contains a number of targets, or incorporates a number of targets from other referenced XML build files. In addition, the build engine is given the name of a target to start at. With this information, a build engine can perform a build.
The targets in the XML build file are wrappers for tasks. A target lists zero or more tasks for the build-engine to perform. The tasks define the actual work of the build. A task takes some input (often a set of files) and performs some function on that input. The detail of the function performed is determined by a set of configuration settings that are also passed as arguments to the task. Most tasks produce some well-defined output (often another set of files). For example, a compile task generally takes a set of source code files and compiles the code according to a supplied set of compiler settings to produce a set of files. A good task will also produce regular progress information indicating how far through its work it is. Finally, in sophisticated build scenarios, a task may take an execution location as an input argument specifying on which build server the task is to be executed.
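Seen from a programmer's point of view, a task is little more than a function: input files and settings go in, output files and progress messages come out. The following Python sketch illustrates that shape only. It is not how any real Ant or MSBuild task is implemented, and the `OutputDir` setting and `.obj` extension are made up for the example.

```python
from pathlib import PurePosixPath
from typing import Callable, Dict, List


def compile_task(sources: List[str],
                 settings: Dict[str, str],
                 report: Callable[[str], None] = print) -> List[str]:
    """Pretend to compile each source file and return the output file names."""
    out_dir = settings.get("OutputDir", "bin")
    outputs: List[str] = []
    for done, source in enumerate(sources, start=1):
        # A real task would invoke a compiler here; this sketch only maps names.
        outputs.append(f"{out_dir}/{PurePosixPath(source).stem}.obj")
        # Regular progress information, as a good task should provide.
        report(f"compile: {done} of {len(sources)} files done")
    return outputs


if __name__ == "__main__":
    print(compile_task(["src/Order.cs", "src/Customer.cs"], {"OutputDir": "out"}))
```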
Each build tool comes with a library of typical build tasks; tasks like compile, test and label, for example. The tasks are not described in XML. They are coded in some other language (Java in the case of Ant and usually C# in the case of MSBuild). This can make the actual function of the tasks a little opaque, especially in the case of proprietary products like MSBuild where the source code for the tasks is not made available. Each tool also provides a means to create and add new tasks to the build's repertoire. This is because each development team almost inevitably wants the build to do something special for which no suitable task is provided in the library; create a schema for a particular product, for example. After openness and cross-platform support, this ability to add new tasks easily makes extensibility the next most popular argument given for preferring one of these XML-based tools over more traditional tools such as Make and Perl.
Targets do more than group a number of tasks together. Each target defines a list of other targets that it is dependent upon. In turn, those dependent targets may declare that they are dependent upon other targets and so on. The dependencies from all the targets in a build file form a dependency graph from which, given a starting target, the build engine derives the list and order of target execution. For example, a test target needs compiled code to execute tests against and, therefore, depends on a compile target. In turn, the compile target needs source code to compile and possibly a clean set of directories in which to place the compiled code (.Net assemblies, *.class files, etc). Therefore, the compile target is dependent on targets that retrieve from source control (get) and delete directory contents (clean).
To summarize, the build engine starts with a specified starting target, determines the list and order of targets from the graph of target dependencies, executes the tasks within each of those targets in turn, and finally executes the tasks specified in the starting target itself if any.
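The ordering logic can be sketched in a few lines of Python as a depth-first walk of the dependency graph. This is a minimal illustration under stated assumptions, not the algorithm of any particular build engine: the target names mirror the example above, and the sketch assumes each target runs at most once and that circular dependencies are an error.

```python
from typing import Dict, List, Set

# Hypothetical dependency declarations mirroring the example in the text:
# test depends on compile, which depends on clean and get.
TARGETS: Dict[str, List[str]] = {
    "clean": [],
    "get": [],
    "compile": ["clean", "get"],
    "test": ["compile"],
}


def execution_order(start: str, targets: Dict[str, List[str]]) -> List[str]:
    """Return the targets to run, dependencies first, ending with `start`."""
    order: List[str] = []
    in_progress: Set[str] = set()

    def visit(name: str) -> None:
        if name in order:
            return  # already scheduled; each target runs at most once
        if name in in_progress:
            raise ValueError(f"circular dependency involving {name!r}")
        in_progress.add(name)
        for dependency in targets[name]:
            visit(dependency)
        in_progress.discard(name)
        order.append(name)

    visit(start)
    return order


if __name__ == "__main__":
    print(execution_order("test", TARGETS))  # ['clean', 'get', 'compile', 'test']
```

Starting at `compile` instead of `test` yields the same list without the final entry, which is why the choice of starting target matters so much.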
Branching and Looping with Conditions and Properties
Targets and tasks provide a means of executing a non-trivial set of tasks in a well understood order. This is all we need to execute a build for a specific situation. Where we need to execute builds for a set of very similar but not identical scenarios, or we want to write generic, reusable build files, targets and tasks alone are not enough.
Therefore, in both MSBuild and Ant/Nant, targets and tasks may also define or be enclosed within a condition that causes their execution to be skipped if that condition does not evaluate to true. The evaluation of a target's optional condition element or attribute typically involves inspecting the values of properties. Properties are simply character-string-based name-value pairs declared, set and reset throughout the build file. Properties may also be passed to the build engine as initial (command-line) arguments. Properties typically store:
- the names of files and folders; the root source code directory names for a compile task, for example
- task configuration settings; the compiler settings for a compile task, for example
- flags to control the execution of targets and tasks; a flag to skip the running of the unit tests target, for example.
The combination of properties and conditions adds the conditional branching capability needed to execute different sets of targets and tasks to fit different situations, or to perform different kinds of build.
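To make the mechanism concrete, here is a small Python sketch in which each target carries a condition evaluated against string-valued properties. The property names `SkipTests` and `BuildType` are invented for the example, and letting command-line values override the build file's defaults is an assumption of the sketch.

```python
from typing import Callable, Dict, List, Tuple

Properties = Dict[str, str]
Condition = Callable[[Properties], bool]

# Each target is paired with the condition that must hold for it to run.
# The property names "SkipTests" and "BuildType" are hypothetical.
TARGETS: List[Tuple[str, Condition]] = [
    ("compile", lambda p: True),
    ("test", lambda p: p.get("SkipTests", "false") != "true"),
    ("check", lambda p: p.get("BuildType", "full") != "desktop"),
]


def targets_to_run(defaults: Properties, command_line: Properties) -> List[str]:
    """Merge the properties, then keep only targets whose condition is true."""
    # Assumption of this sketch: command-line values win over file defaults.
    properties = {**defaults, **command_line}
    return [name for name, condition in TARGETS if condition(properties)]


if __name__ == "__main__":
    file_defaults = {"SkipTests": "false", "BuildType": "full"}
    print(targets_to_run(file_defaults, {}))
    print(targets_to_run(file_defaults, {"SkipTests": "true"}))
```

Because properties are plain strings, every condition ends up being a string comparison, which is part of what makes larger scripts hard to read.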
In addition to support for conditional execution of targets and tasks, all XML-based build tools also provide for the repeated execution of a task for all the entries in a collection. In Ant/Nant these collections are defined using ordinary properties. MSBuild, however, has a special construct for this purpose called an item collection. Regardless of the precise mechanics, the tools all support looping within build scripts.
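The looping idea, reduced to its essentials, is one task applied to every entry of a collection. The Python sketch below models the collection as an ordinary string property holding a delimited list; the semicolon separator and the `Layers` property name are assumptions made for the example, not the syntax of any particular tool.

```python
from typing import Callable, Dict, List


def split_list_property(value: str, separator: str = ";") -> List[str]:
    """Turn a delimited string property into a list of entries."""
    return [entry.strip() for entry in value.split(separator) if entry.strip()]


def for_each(items: List[str], task: Callable[[str], str]) -> List[str]:
    """Execute the same task once per entry, in order, collecting the results."""
    return [task(item) for item in items]


if __name__ == "__main__":
    # Hypothetical property: the layers to build, in dependency order.
    properties: Dict[str, str] = {"Layers": "PD; SI; UI"}
    layers = split_list_property(properties["Layers"])
    for line in for_each(layers, lambda layer: f"component-build {layer}"):
        print(line)
```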
Providing both conditional branching and looping bends XML into a scripting language. Instead of being a simple gluing of build tasks into a specific order, the build scripts can, quite quickly, become complex programs written in one of the world's most verbose programming languages; so verbose that it is difficult to see the wood for the trees much of the time.
For many projects, including my new project, a simple scripting language would be a more appropriate choice than an XML-based build script. However, with considerable investment already made in using MSBuild and Team Build plus the lure of seamless integration with Visual Studio, we decided to stick with what we had.
After all, how difficult could it really be? Did I mention that sometimes I can be more than a little naive?
Next, read Introduction to Microsoft Team Build