I Hate Makefiles (and other short stories)

I Hate Makfiles.... due to long boring bad history (from 1995) supporting a giant nested 18-layer-deep rats’ nest; where a simple ‘make’ that does nothing took 3 minutes so that the minimum edit/compile/build/debug cycle starts with a 3-minute delay – and the mess is so incomprehensible that it takes a full-time engineer to make any changes.  This on a 750KLOC project with maybe ~400 files.  I figure checking time stamps on 800 files (400 .c and 400 more .o) should take maybe a second on cold caches, and should be not much harder to read as a list of 400 lines of files.  I finally got fed-up with the mess and wrote my own flavor of make.


This ‘make replacement’ was a great success (from my point of view) – it was instantly quick to do an empty make, it was all in 1 file (and that file was some boiler-plate then a simple list of dependencies), it supported parallel builds (without interleaving output from the different build steps), it was cross-platform, it built cross-platform, it sang, it danced and it fit on the head of a pin. To be fair, it was well received by the project community (and especially the engineer tasked with maintaining the old stuff in his spare time) and eventually everybody moved over to the new setup.  So when high-scale-lib needed a build system I decided to ignore Ant and all other project-build systems and repeat this success in Java (yes I’m being tongue-in-cheek here with some NIH thrown in).


Here’s what I did…

  • It’s All One File – no hiding tedious-but-required details in files buried deep in obtuse include-file paths.  Called build.java here
  • It’s A Common Programming Language – (was C, this time Java); no exciting new syntax to learn (e.g., all whitespace is ok, except for tabs which are very special – OR – piles-o-XML-brackets harkening (or is it horkening?) back to the LISP days).  The project builder itself thus comes with a complete set of debugging & performance tools.  If I need a special kind of build-step, I can hack it in.  Example: some silly tool I’m using doesn’t properly set the OS status, so I have to ‘grep’ the output for ‘error’ to tell if the build-step failed
  • It’s In the Obvious Place (root of the project)
  • It Stands Alone (except for a handful of well chosen tools: was ‘cc’, this time a JDK).  Things like ‘rm’, ‘touch’, ‘cp’, ‘cmp’, ‘grep’, etc are all built-in and need no to configuration
  • Up-to-date is Based on File Timestamps – and nothing else.  I can tell the project is good with nothing more than my eyeball and ‘ls’.  Sometimes this invariant is annoying and may cause some small dummy files to be needed… but it’s worth it in the long run because the definition of ‘up-to-date’ is easily understood by all
  • Dependencies Are Obvious – They are written as a data-structure.  I’m trading off some verbosity for explicit clarity here.  Searching in an editor is instantly quick (no pain for lines of nearly-replicated stuff), so finding things is fast and it’s always immediately clear what they mean.  Here’s the definition of a Java source file org.cliffc.high-scale-lib. NonBlockingHashMap.java and the build-step to produce the class file:


  static final Q _nbhm_j   = new Q (HSL+"/NonBlockingHashMap.java");
  static final Q _nbhm_cls = new QS(HSL+"/NonBlockingHashMap.class", javac, _nbhm_j );


The dependencies have some boilerplate (e.g. “static final Q”), but it’s all obvious stuff including the boilerplate.  No new syntax.  Full editor support for writing dependencies, etc.


Makefiles change (or at least they do if you can understand what the heck is going on) – and when they do the set of rules which defined the ‘goodness’ of your build gets whacked.  So my uber-make has to build itself:

  • It Builds Itself – a quick check of build.java vs build.class timestamps, then apply javac to build.java, afterwards fork ‘java build $*’ to do the build in the New World Order.


That’s the Big Picture.  After that I add features as I need ’em.  Here’s what I got so far:

  • -n – list build steps without building
  • -k – keep going after errors
  • -v – verbosely list what’s going on. 
  • -clean – Nuke buildable files
  • Shortcuts to build with javac, build javadocs, build jars, and build, run & verify with JUnit tests.  In short, short-cuts are easy to add.  Anytime I see I’m repeating the same kind of build-step over and over again, I make a shortcut.
  • Limit output of noisy build-steps.  Default is 1 line of output per build-step on success (just an echo of the line), and full spewage on failure.  This makes log files tidy and easy to read.  I’ll probably put this on a flag someday for people who like to see a zillion lines of successful build log, but I like the 1-line-per-success as the default.
  • Sanity checks that build steps actually build what they claim to build, and do not muck with other files (common error in hacking makefiles is to get the build-step messed up and have it produce the wrong file in the wrong place, or worse whack a source-file by mistake).


Things I’ve had in the past and will probably get around to adding eventually:

  • Use ‘gcc -M’ to automatically track and gather include-file dependencies.  No user interaction required, other than adding or removing ‘#include’ directives.
  • parallel build, caching of .classes, using a javac bean server (with auto-launching on the first compile)
  • Build slowest files first (means you don’t end a 200-file parallel ‘make’ on an 18-machine build farm by compiling the largest file last – guaranteeing another 5 minutes of build with 17 machines idle)
  • Build recently failing files first – these are mostly like to fail again instantly (assuming you did’t get all the syntax errors out in the 1st pass)


Something I’d like to add this go around:

  • Verify the build-step doesn’t access any files other than it claims it needs.  This is another sanity check to cover another big source of makefile errors: forgetting a dependency.  I probably need to get ‘last access time’ for files to do this, and I don’t know if this is portably available.


Look for build.java in high-scale-lib and tell me what you think!



Adding Transactions to the Non-BlockingHashMap

I went to the winter meeting of DaCapo, a small group of senior academics and industrial scientists mostly focussed on GC & Memory-related issues (but with fairly wide interests).  Much of the work is not yet published; work-in-progress stuff (lack of rigorous experiments, or badly broken implementations, or Big-Ideas-No-Code state, etc).  So I won’t repeat what I saw here directly, contact the authors directly if you’re interested in something.  The DaCapo group has decided to come out with a Java Server benchmark, similar to their popular DaCapo Benchmark Suite.  This will be something that’s much much easier to run than SpecJAppServer (that won’t be hard! SJAS is a total nightmare to setup) and much more realistic than SpecJBB2005.


Of the DaCapo material that’s already published, I like Tracking Bad Apples the best.  I’m going to float this one around Azul and see if we want implement it.  I also stumbled over this gem while perusing Emery Burger’s page.


Adding Transactions to the Non-BlockingHashMap


One of the other things that came out was the result of a quick conversation between Rick Hudson, Eliot Moss and myself: adding software transactions to the NonBlockingHashMap.  A transaction would mean that a single thread could do a series of reads & modifications to the HashMap and either abort them all, or commit them all atomically.  If committed, no other thread would see any partial updates and this thread is assured that no other thread modified any K/V pair read during the transaction.  Transactional support would go a long ways towards making a (fast, scalable, non-blocking) in-memory database (A C I but no D).  One of the reasons this idea looks so tractable, is that all the accesses to the transacted memory are already wrapped in function calls – there’s no requirement for a “below the JVM” or “inside the C compiler” implementation.  It can all be done in pure Java.


The basic idea here is to add another State on the Value slot, representing a “transaction in progress”.  Non-transacting threads seeing this state will be required to do more work: at least 1 more layer of indirection to get the actual value, and at least 1 flag check to decide if they need to report the before- or after- transaction-committed value.  The State would be implemented as a variant of box’ing like I do now: an instanceof test reports that the Value is really a Box, and the reader of the Box now reads through the Box to get the actual value.


I believe I can implement this in the State Machine that drives the NBHM, or at least I have a back-of-the-envelope FSM that handles 2 conflicting transactions.  Transactions would be non-blocking  (so no live-lock – always some thread gets to complete a transaction) – but not fair (I couldn’t find a Wikipedia article on the notion of fairness, although there are plenty of articles with the word fairness in their title).  It’s possible for a thread with long transaction to be continuously aborted by other threads with doing repeated short conflicting transactions.  In DB’s, I suspect fairness is a big deal.  In non-blocking algorithms it’s tough to implement: the endless winners need to stop winning at some point and help the endless loser to win every now and then.  The problem is, the winners are different threads than the loser and can’t actually execute the loser’s code directly (other than the little bits inside the NBHM calls).  If I block the winners to let the loser get through, then the algorithm becomes … well…, blocking.  I can ‘pause’ the winners for a fixed period of time and retain the non-blocking title, although this seems messy.  Does somebody have a better idea here? 


Another, API-related question: should the transactional calls be under a completely separate API, or should I just have “begin_transaction”, “commit”, and “abort” calls, and assume all the calls in the middle are being transacted over?  What happens if a thread starts a transaction, then does a non-transactional conflicting write (abort the transaction?)?  Do I allow more than one thread to work on the same transaction?  i.e., make a “transaction object” and allow it to be passed around, and require it in transactional calls?