Saturday, July 4, 2015

Software Builds at EA: Sourcing Code

In my previous post I gave a general overview of EA's build framework, the one with the imaginative name Framework. Relegated to one of the last paragraphs was a mention of the package server and how it enables code sharing across teams.

Since that post was written we've added a new mechanism to Framework for sharing code: specifying its location via a URI in the masterconfig file.

The URI sourcer currently has two back-ends: one for fetching code from Perforce, and another for fetching from NuGet.

Perforce URIs encode the server, the depot path, and either a specific changelist or an explicit head-revision specifier for the software to pull down. Pinning the revision is meant to improve build reproducibility; anyone with a particular master configuration file will have the same code from which to build, local changes notwithstanding.
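To make the idea concrete, here is a sketch of what a URI-pinned masterconfig entry might look like. The real masterconfig schema is internal to EA, so the element names, the server address, and the exact URI format below are all assumptions for illustration only:

```xml
<!-- Hypothetical sketch only; not the real masterconfig schema. -->
<masterversions>
  <!-- A Perforce-sourced package pinned to a specific changelist:
       server, depot path, and changelist are all encoded in the URI. -->
  <package name="EASTL" version="dev"
           uri="p4://p4server.example.com:1666/depot/packages/EASTL/dev?cl=123456" />
</masterversions>
```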

Because EA is a very distributed company, with studios located across several continents, and because these packages can be large (multi-gigabyte), Framework keeps an internal mapping of central Perforce servers to studio-specific proxy and edge servers, allowing users at these studios to avoid saturating slow trans-Atlantic and trans-Pacific network connections.

As Microsoft moves to distribute more components via NuGet, we wanted Framework to interoperate with it. Given a NuGet package name, version, and URI, Framework fetches the code from the NuGet mirrors, installs the package, and wraps any contained DLLs in the Framework package scaffolding. This step is not strictly necessary; we could add the package to the MSVC packages.config and be done with it. But the wrapping adds a convenience layer that lets any package in the build hierarchy express dependencies on the NuGet library as if it were a normal Framework package.

This is but the first step in enabling better distributed development workflows. Being able to pull down source code from someone else's server is handy; being able to fork a sub-component package, make a change, share it with your team and/or send a pull request back to the maintainer, all while fitting into the same explicit versioning model that Framework relies on and using backing technologies that handle repositories in the tens or hundreds of gigabytes with no problem (trololol), is going to be a lot of fun. Or awesome. I'm hoping for awesome.

Friday, January 2, 2015

Software Builds at EA: The 5000' View

A couple of days ago this tweet by John Carmack popped up. In case the link ever goes away, he says, "Dealing with all the implicit state in the filesys and environment is probably going to be key to eventually making build systems not suck."

At EA we have spent a lot of time developing build systems that don't suck. They're not perfect, but as we develop applications and games targeting platforms from Android to Xbox and everything in between, they work incredibly well for us.

Framework

The cornerstone of our build infrastructure is a version of NAnt that was forked eons ago and has since undergone essentially a complete rewrite. Although the base functionality is still there, and still used, much of the effort has gone into establishing it as a base for higher-level components.

NAnt is just one piece of a system called Framework. In addition to NAnt, Framework includes features for managing build configurations, SDKs, and software distribution. There are a few core concepts in Framework that I think have made it much easier to manage the build morass: modules, packages, dependencies, masterconfigs, build configuration management, SDK packages, and the package server.

Module

A module is Framework's representation of an artifact-producing build process. At a minimum a module will have a type (static/dynamic library, executable, C# assembly, etc.), a name, and a set of input files. Modules also have a number of optional components, such as custom compile options, custom preprocessor definitions, custom warning suppression, sets of resource files/include files/object files/libraries, and dependencies.

Package

Packages are a collection of modules. Every package in the Framework world has a version, even if that version is "unversioned". A package contains a description of how to build its modules, its public exports (such as public include directories, build artifacts, etc.), as well as a manifest which allows the package author to include structured metadata that can be inspected by the package server, continuous build/test systems, etc.

Dependencies

Dependency handling is one of Framework's killer features. Dependencies can be declared on other modules in the same package, on other packages, or on specific modules in other packages. For example, I can specify that my module depends on EASTL, and the build system will ensure I am using the correct version, build it if it is not already built, and set up the necessary include and library paths.

Masterconfig

Versioned packages lose their utility if there is no way to enforce the versions being used. The masterconfig file is the bill of materials for the application and specifies the version of each dependency, each SDK, and each configuration component.

One of NAnt's script concepts is the property. Essentially global variables, properties are used for tweaking the state of the build system. Some are set in build files as crude variables; others, like the particular configuration to build, are set on the command line. A third way to set properties is to encode them in the masterconfig. This ensures that the whole development team is using the same Android SDK level, for example, as the SDK package may provide several.

Most build system state can be controlled in the masterconfig file(*), and checking the masterconfig file into source control can be an easy and effective way of ensuring the team is building with consistent settings.

(*) Provided that the build script writers do not start using environment variables on the side :-)

Build Configuration

Framework also provides global build configurations that describe how to perform a build. These build configurations take a high-level build goal (ie: an optimized 32-bit Windows build) and use it to determine the required compiler invocation command lines.

These build configurations are used as the base for building a project and all of its dependencies. Although a particular dependency may override its build settings for warning suppression, custom optimization settings, etc., for the most part these are slight modifications to the global configuration.

SDK Packages

SDK packages are not handled any differently by Framework than other packages, but Framework's versioning forces consistent use of SDK versions by all developers. Many SDKs are packaged in a ready-to-go form that the build system downloads automatically, skipping the hassle of having to find the right SDK installer and run it, along with its numerous components.

Package Server

Where versioned packages come into their own is as a basis for sharing technology between development groups and teams. To make this sharing easier we run a server where people can post their packages, not entirely unlike Freecode/Freshmeat. If Framework cannot find a package listed in the masterconfig locally, it will try to fetch it from the package server.

Less Make, More CMake

Owing to its NAnt heritage, Framework has the ability to perform builds itself, but with most developers demanding Visual Studio support it is increasingly used as just a front end for generating project files for Visual Studio or Xcode.

Thursday, August 28, 2014

French Immersion

About this time last year I was traveling from Vancouver to Montreal and eventually to London for work. It wasn't the first time I had done that particular route but it was the first time I drummed up the courage to dust off my awful western Canadian anglophone elective-course high school French and speak la belle langue wherever I went.

Getting into a cab at the airport I asked the driver to take me to my hotel at neuf-cent boulevard Rene Levesque. The driver asked me where I was from, where I was going to, what I was doing in Montreal. In French. I answered, haltingly at first, only slightly less haltingly as old words I had not used for years started to bubble to the surface, punctuating my sentences with apologies for my awful French. He told me his English was worse and encouraged me to practice. We mostly talked about hockey for the rest of the ride.

Searching out dinner I figured I'd keep the momentum I had been building and ordered in French. The waiter responded in English. Dammit.

Two days later I was on the red-eye from Montreal to London. A lady sat down next to me. I asked her in English if she was traveling beyond London. She shook her head apologetically. "No English".

Où voyages-tu, après Londres? ("Where are you traveling, after London?")

"Ah! Syrie!"

It could have been "A Syrie" but I could swear there was an exclamation in the middle.

This was well after the Syrian civil war reduced formerly great cities like Homs to dust and ghosts. She was traveling to see her family who, she assured me, were neither dust nor ghosts. We talked for the whole plane trip. In French. I think she knew more English than she let on, but she wasn't comfortable speaking it, and I knew no Arabic. We mostly talked about the past, present, and future of Syria.

Time has taken from me many of the details of our conversation. I do not know where she was traveling within Syria.

After arriving in London I wished her well on the next leg of her trip and thanked her for the opportunity to practice my French.

I hope she was able to get out with her family.

Wednesday, April 17, 2013

C and C++ need inline directives at the call site


I don't like the inline keyword; I think it's incredibly broken.

Inlining heuristics make reasonable assumptions about what should be inlined, but the user has essentially no control unless they use something like VC++'s __forceinline directive. Automatic inlining works well with trivial functions, the kinds that populate the cctype/ctype.h header, but fails once the desired inline function grows beyond a few statements.

The ability to inline code can be quite powerful for developers, especially those who had to develop for consoles like the Xbox 360 and PS3. But because the compiler has incomplete information about where the inlining really needs to go, and because there's no way in the language to give it better hints, there is often bloat in areas where you don't want it and no inline expansion in areas where you do.

I've worked around this in the past by abusing macro expansion, but it's not a very clean solution and it makes debugging difficult.
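For illustration, here is a minimal sketch of that macro-expansion workaround; the dot3/DOT3 names are my own invention, not from any real codebase. The macro guarantees expansion at the call site, at the price of poor debuggability and the usual multiple-evaluation hazards:

```c
/* Function version: subject to the compiler's inlining heuristics,
   with no guarantee either way. */
static float dot3(const float *a, const float *b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

/* Macro version: guaranteed expansion at the call site, at the cost of
   poor debuggability and double-evaluation of the arguments. */
#define DOT3(out, a, b) \
    do { (out) = (a)[0] * (b)[0] + (a)[1] * (b)[1] + (a)[2] * (b)[2]; } while (0)
```

An inline-at-the-call-site keyword would give the same expansion guarantee without losing type checking or single evaluation of the arguments.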

What I would like to see is the ability to use the inline declaration at the call site, so that I can pick and choose where inline expansion happens as needed. One example I can think of is multiplying two matrices in a video game. In some cases, say when creating the view/projection matrix, inline expansion of the multiply would be unnecessary code bloat because the operation typically happens only once per game frame:

Mat4x4 viewProjection = Mat4x4Multiply(view, projection);

While transforming joint hierarchies for animation there are typically a number of multiplications that happen in a tight loop where inlining would be beneficial:

for (int i = 0; i < numBonesInModel; ++i) {
    Mat4x4 transformedJoint = inline Mat4x4Multiply(jointMatrix[i], jointMatrix[parent[i]]);
}

I want more control over what gets inlined and what doesn't, without resorting to macros, and I think being able to specify at the call site what I want inlined would go a long way toward that goal.

Sunday, March 31, 2013

Scheme in 5000 lines of C part 5: Collecting garbage

I was hoping to get this post up sooner but the last week and a bit has been complicated by my appendix exploding. I'm on the upswing though and now am starting to think well enough to be able to dig through my Scheme todo list.

Last time this series left off I began working on the garbage collector by adding a test to exhaust the heap and then re-wrote the evaluation engine. So I didn't actually do any work on the garbage collector.

A few years ago I worked on a small team that built a .NET environment for video game consoles, which helped spark my interest in compilers and runtime environments. We tried a number of garbage collector designs while trying to hit our performance goals, from simple non-precise mark/sweep, to Boehm, to several different precise collectors. We had the most success with two designs: a half-space (Cheney) collector, and a final design based on an algorithm somewhat similar to Clinger's non-predictive generational collector.

At points I have experimented with generational collectors, but I found that they work best when you can expand the tenured generation as objects are copied into it. As this isn't really feasible on a video game console, where your memory budget is fixed and there's no pagefile to fall back on, they never made the cut.

What I found while profiling the games being built on this .NET runtime is that objects either died really quickly or persisted practically forever (ie: more than a few frames). By breaking the heap into many small (typically page-sized) buckets, a collection would often find enough buckets with no live objects left that the process could finish without having to move or compact the heap at all. This was a huge advantage over the half-space collector we had used previously, which copied every live object on every collection.
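A minimal sketch of the bucket-reclamation idea, assuming a hypothetical bucket header with a per-bucket live-object count maintained by the mark phase (none of these names come from the actual runtime):

```c
#include <stddef.h>

#define BUCKET_SIZE 4096  /* page-sized buckets, as described above */

/* Hypothetical bucket header; not the real layout. */
struct bucket {
    struct bucket *next;
    size_t num_live;               /* live objects found during marking */
    unsigned char data[BUCKET_SIZE];
};

/* After marking, any bucket with zero live objects is returned to the
   free list wholesale: no copying or compaction required. Buckets that
   still hold live objects are kept (the walk reverses their order). */
static struct bucket *
reclaim_empty_buckets(struct bucket *used, struct bucket **free_list)
{
    struct bucket *still_used = NULL;

    while (used) {
        struct bucket *next = used->next;

        if (used->num_live == 0) {
            used->next = *free_list;   /* whole bucket dies at once */
            *free_list = used;
        } else {
            used->next = still_used;   /* survivor; may be compacted later */
            still_used = used;
        }

        used = next;
    }

    return still_used;
}
```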

What does this have to do with Scheme in 5000 Lines of C? The collector used in this Scheme is going to use some similar ideas. For example, it will use small, fixed-size buckets, and it will compact the heap only if it has to. Luckily this garbage collector doesn't have to worry about the complications that a collector for .NET needs to handle, like finalizers and threading.

I've got the initial root-set scan and collection of empty buckets implemented. The object handles I mentioned last time are now quite important because they act as read barriers for code outside of the system, so that it sees an object's correct location if the object moves underneath it. It's a bit cumbersome to use from C for now, but this is a trade-off I'm making to keep the collector fast and to make it clear to the user (me, right now) that it's a bad idea to peek and poke at stuff in the scripting system willy-nilly.

One of the most hair-tearing-out-ly frustrating problems I've found when writing a precise garbage collector is making sure that you do not miss any roots. If you do, you'll eventually collect an object's space while it's still live, and then you're in a world of hurt. The read barrier is an easy way to make sure you do not miss any roots when you do not have the luxury of controlling code generation and generating stack frame maps.

The reader also got a lot of attention today. Sorry, not you :), but the core of the code called when the user calls (read ...). I've weaned it off of the CRT allocation functions and it now allocates all of its intermediate structures in the GC heap. This has been on my todo list for a while.

The source tree was also reworked so as to be a little more friendly for use as a library -- a public include folder was created separate from the library source code.

I've got a few things up next to take care of. I want to start working on closures soon, as well as heap compaction. Maybe before the end of the month I can get a few of the Computer Language Benchmarks Game benchmarks written and added to the test suite as well.

View the source on GitHub: https://github.com/maxburke/evilscheme

Sunday, March 17, 2013

Scheme in 5000 lines of C part 4: (Un)necessary refactorings

Last week I added the first test of the garbage collector, which simply allocated memory until the heap was exhausted. Since the garbage collector currently does nothing but allocate, it was expected that this test would run until the gc_collect function was called, where it would hit a breakpoint. The test was simple:


(begin
    (define gc-test (lambda (count param)
        (if (< count 1048576)
            (gc-test (+ 1 count) #(1 2 3 4 5))
            "done")))
    (gc-test 0 0))

It calls itself a million times, allocating a 5-element vector each time. Since the test harness only uses a 1 MB heap, it should exhaust pretty quickly. (The compiler is pretty stupid at the moment and doesn't do any sort of optimization that would eliminate this allocation.)

I ran into one issue immediately. Up until now, all my tests were testing compilation, so one test case would define a test function and the next would call it. Now I wanted to do both in a single test case, because I'm lazy and don't want to split every test into one case where definition happens and another where execution happens.

The system didn't have any ability to handle the (begin ...) syntax. This isn't the end of the world because (begin ...) is pretty easy to implement -- you loop over all the contained forms, evaluating each one, and the result is the value of the last form evaluated -- but it needed to be implemented in two places. Most primitive syntax features like define/set/car/cdr/etc. have an implementation for when they are encountered by the compiler, and an implementation in C for when they are evaluated at the top level. So,

(define foo 3)

and

((lambda () (define foo 3)))

would follow two separate implementations. This stemmed from the early roots of "let's see if this will actually work", before there was a compiler.
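The (begin ...) semantics described above, evaluate each contained form in order and keep the last value, are simple enough to sketch in C. The types below are trivial stand-ins, not the actual evilscheme structures; in the real interpreter each form would be evaluated recursively rather than just returning a stored value:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the interpreter's types. A "form" here
   simply evaluates to its stored value. */
struct form {
    int value;
    struct form *next;
};

static int eval_form(const struct form *f)
{
    return f->value;   /* stand-in for the real evaluator */
}

/* (begin e1 e2 ... en): evaluate every form in order; the result of the
   whole expression is the value of the last form evaluated. */
static int eval_begin(const struct form *forms)
{
    int result = 0;    /* (begin) with no forms: unspecified, 0 here */

    for (; forms != NULL; forms = forms->next)
        result = eval_form(forms);

    return result;
}
```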

But I was feeling lazy. I could have implemented begin for both the C code and the VM, but instead I decided to refactor the evaluation engine so that in the future I only need to implement functionality once. This was done by wrapping every top-level form in an anonymous lambda and ensuring it is run by the VM. The example of (define foo 3) above would be translated into ((lambda () (define foo 3))) and evaluated.

There were a few other changes that came out of this refactoring. First, procedure object storage is now broken into two parts: a record that stores meta-information for the function, the environment in which the function was created, and some function-local variable slots. These slots are not local variables in the sense of temporaries during execution, but rather storage for objects that the function itself might need to refer to, such as other procedure objects (for closures), raw objects from the parser (ie: from (quote ...)), etc. I think this will make closures much easier to tackle.

Second, I had to recognize the eventuality that I'd have to call C functions from the VM and that this new execution model would require it for things like (lambda ...).

Third, I've introduced an object handle. This is meant to be a container, tracked by the garbage collector, that keeps a reference valid if a collection is triggered between two successive allocations. Say, for example, you have this code:

struct object_t *foo = gc_alloc(...);
struct object_t *bar = gc_alloc(...);

And the second call to gc_alloc() triggers a collection because the heap was full. If the collector doesn't know about foo and the storage behind that object moves, we're going to spend weeks finding out what happened. The handle type is meant to make this more robust by essentially placing a read barrier on that reference.
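A minimal sketch of the handle idea, with hypothetical names rather than evilscheme's actual types: the handle wraps a slot that the collector knows about, so every access goes through the slot and always sees the object's current location:

```c
#include <stddef.h>

struct object_t {
    int tag;    /* stand-in for the real object representation */
};

/* Hypothetical handle type: the handle refers to a slot that the
   collector registers as a root and updates when the object moves,
   so user code never caches a stale pointer. */
struct object_handle_t {
    struct object_t **slot;
};

/* The read barrier: every dereference goes through the slot, so it
   always yields the object's current location. */
static struct object_t *handle_deref(struct object_handle_t handle)
{
    return *handle.slot;
}
```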

Next up is garbage collection.

(The project is now officially over 6000 lines of content, though only 4200 lines of that is code. That still counts, right?)

Follow on to part 5: Scheme in 5000 lines of C part 5: Collecting garbage
View the source on GitHub: https://github.com/maxburke/evilscheme

Monday, March 11, 2013

Scheme in 5000 lines of C part 3: Testing, random thoughts on closures.

I haven't made as much progress on the Scheme recently as I would have liked. Work's been busy, the house has needed some fixing, and I'm still chewing over how I want to implement closures.

Since I want to make *some* progress, I rewrote the testing framework. Before, I just had some sample strings embedded in main that I read-evaled-printed, looking at the output to see if my changes had the desired effect.

I felt some (a lot of) shame, because (semi-)robust testing is supposed to be my day job. So I moved most of that into a test harness that's pretty stupid/simple but actually performs validation. Funny that.

There's a directory, tests/, that contains the test cases, one per file. Each file has the input that is fed to read-eval-print; the output is captured and compared to the expected value.

I'm still not quite sure how I'm going to handle closures, though. Variable capture, I think, is going to be fairly easy. What I'm less sure about is how to store references to inner functions. For example, this function:


(define closure-test
  (lambda (foo)
    (lambda ()
      (set! foo (+ 1 foo))
      foo)))

The outer function (closure-test) will need to hold some sort of managed reference to the inner function so that when closure-test is called it can create a new function object and a new environment. I'm not quite sure whether I want to bake the reference into the bytecode or do something else. Maybe rewrite function objects to include a table of contained functions? That might work.

Follow on to part 4: Scheme in 5000 lines of C part 4: (Un)necessary refactorings
View the source on GitHub: https://github.com/maxburke/evilscheme