This is the 3rd and last part of this blog! (thank heavens!), where I wax poetic about Lattices and Type Theory, and their applications to Compilers and in particular Java JITs.

Part 1: Too Much Theory

Part 2: Too Much Theory, Part 2

## A Quick Recap

We’re building a lattice, to allow us to do exciting optimizations on Java programs. Our lattice has distinguished * top* and

*types, and is Commutative, Associative, and Symmetric. We understand pointers can be*

**bottom****null**,

**not_null**, or unknown (e.g.

**). Because of symmetry across the lattice centerline, we also have the dual of**

*bottom***not_null**which we’ll call

**any_null**. For this recap, we’ll also have Java Classes (e.g. String), but we’re going to ignore subclassing for the moment. Our lattice looks something like this:

Wow, that’s getting fancy… lets revisit some of the elements in this lattice real quick:

: All possible values, including**bottom****null**, as computed by running the program. No constants.**String:bot**: All possible Strings, including**null**, as computed by running the program. No constants, no XMLNodes. For brevity I will sometimes refer to this as**String**(no trailing**:bot**notation).**XMLNode:bot**or plain**XMLNode**: All possible XMLNodes, including**null**, as computed by running the program. No constants, no Strings.**bottom****:****not_null**: All possible values, as computed by running the program. No constants, no**null**.**String:not_null**: All possible Strings, as computed by running the program. No constants, no XMLNodes, no**null**.**String:hello_world**: The constant String “hello world”, and nothing else.**null**: The constant**null**.**String:any_null**: All possible String constants, all at once. No XMLNodes, nor**null**. An impossible situation in a real program, but a useful mathematical notion when running Constant Propagation.**String:top**: All possible String constants and the**null**constant, all at once. No XMLNodes.: All possible constants, including all Strings and all XMLNodes and**top****null**, and all them all at once.

Notice the symmetry: ever Node below the centerline of constants also appears in dual form above the centerline. The edges are also symmetric: e.g. * top* has 4 edges out and

*has 4 edges in,*

**bottom****String:top**has 1 in, 2 out, and

**String:bot**has 2 in, 1 out, etc. This lattice will let us find the constants in code like this example from last time:

final class LinkedListThing { LinkedListThing _next; // instance field int hashCode() { int hash = 0; Object x = this; // x is LinkListThing:not_null while( x != null ) { hash += x.hashCode(); // x is LinkListThing:not_null if( x instanceof LinkedListThing ) x = ((LinkedListThing)x)._next; // x is LinkedListThing else // with Conditional Constant Propagation... x = "fail"; // ...this path is dead // x is LinkedListThing:bottom and not a String } return hash; } }

## And Now Gentle Readers, Let Us Ponder Subtypes

Java, of course, has subtypes. So does C++ and many other languages. Lets take a look at what it would take to add subtypes to our lattice. This tiny program snippet makes exactly a Hashtable:

Hashtable x = new Hashtable(); y = x.get(); // Calls Hashtable.get

And this snippet makes a known subclass of Hashtable, but treats it like a Hashtable:

Hashtable x = new Provider(); y = x.get(); // Calls Provider.get

And this snippet mentions an UNknown subclass of Hashtable:

void foo( Hashtable x ) { y = x.get(); // Calls ???

When does ‘Hashtable x’ refer to exactly a Hashtable, and when does it refer to some subclass of Hashtable? Knowing if some value is **exactly** a Hashtable versus a subclass is very useful: we can optimize calls made to exactly known classes. Example: What function is called when we call “x.get()”? Well, if x is exactly a Hashtable, we get Hashtable.get() … and we can inline “get()”. If x is a Hashtable-or-a-subclass, then “x.get()” might be Provider.get() or Hashtable.get() or some other user-specified derived version of “get()”, and we cannot inline. The job of figuring out if ‘x’ is exactly of class Hashtable, or some subclass of Hashtable falls to Constant Propagation – and that requires we represent this notion of ‘exact class’ in the lattice.

Lets add the notion ‘**Hashtable:exact**‘ to mean **exactly** a Hashtable, and NO subclasses, while plain ‘**Hashtable**‘ allows subclasses. This is an independent axis from allowing **null**, so we can still have e.g. **Hashtable:exact:not_null**. Note that final classes like String cannot subclass and so are always **String:exact**; constants are always the exact class that they are, e.g. **Hashtable:exact:0x1234**.

I have another “Lattice” for you to ponder, with Hashtables instead of XMLNodes. I’m still using * bottom* in places but I might as well use

**Object:bot**or just plain

**Object**. Also, I ‘twisted’ the picture slightly: lattice elements in the middle all exclude

**null**, and those on the outer edges all include

**null**. I had to duplicate the

**null**element to make the graph lay flat visually, but really they are the same element. I could have laid the graph ‘flat’ on a cylinder, but the web’s not up to 3-D visualization yet.

## And Now The Punch Line

And now for the Trick Question: what happens when I end up mixing together (with Phi functions on loop backedges) **Hashtable:top** with **String:exact:not_null**? (A different question is **how** I come to such a situation that I attempting to mix these types… but trust me I’ve seen this happen in QA from suitably complex programs!) So back to my Question: I am look for the **most precise** answer possible, anything less and I’ll be losing type information for no good reason. So for example * bottom* is a correct answer, but very conservative – we can do better than that!

How about **bottom:not_null**? Since for **Hashtable:top** the compiler can pick any Hashtable it wants – we want it to pick a Hashtable, ANY Hashtable. The result of mixing that with String:exact:not_null is … some random not-null Object: **bottom:not_null**. This might let me remove a null-pointer check at compile time.

But we could also take just the **null** from **Hashtable:top**, since the compiler is allowed to pick that instead! Remember: the **top** label means the compiler “gets to choose” during the course of Constant Propagation – and picking a more precise answer makes it more likely we’ll find a constant and have our choice remain correct. So mixing **null** and **String:exact:not_null** yields **String:exact**. This might, e.g., let me convert an unknown x.hashCode() into a known String:hashCode() call.

So which one do I pick? **bottom:not_null** or **String:exact**? Typically I have no idea during the course of CP which one will pan out better – or actually if either will lead to a better final answer. The Real Trick here is: *I should not be required to pick*. The Constant Propagation algorithm relies on having a lattice – and lattices are defined as having a unique lower-bound (and unique upper-bound) for any two elements. This structure does NOT have a unique lower-bound, and so *is no longer a lattice*! For some particular Java program and some particularly bad luck with our CP algorithm – the CP algorithm can *hang indefinitely*, or miss some constants on some runs – but find those constants on other runs *for the same Java program*. Garbage-In Garbage-Out. CP is technically wedged.

In practice HotSpot’s CP does not hang, although I suspect I can carefully arrange a Java program for which it would. Instead, I end up triggering asserts in QA about how my lattice is lacking the proper symmetry (and hence losing constants for no other good reason, but only in weird programs). I *did* construct 2 simple programs for which choosing **bottom:not_null** versus **String:exact** would find a constant (and enable an optimization) in one program and not the other… and vice-verse for reversing choices.

Cliff

## Postlude

As you might have guessed by finding this blog, I’m off to a new adventure – this time **using** a JVM instead of **building** one (well maybe: I’m very pragmatic; may the best language win!). And while I’ll probably still be tinkering in HotSpot from time to time, I think my future blogs will mostly be about my new adventure!

— till next time, Cliff