Beyond Constructors: Object Readiness
 
One key problem addressed by Java is object construction. In Java, objects are created with the new operator, which invokes one or more special methods called “constructors”. Once an object’s constructors have run (along with any superclass constructors), the object is ready for use. The root trouble with constructors is that during this transitional period, an object passes through a series of states in which it is not yet fully usable. This forces the language to impose a number of rules around object construction, which can feel quite awkward and restrictive at times. Although it may be a moot point for Java, my question is this: in future languages, do we really need constructors? And if not, what might replace them?
It seems to me that constructors actually address a very narrow subset of a much larger set of (generally unaddressed) problems regarding object state. At the same time, the present new operator and its semantics are broader and more complex than they need to be.
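To make the "not yet fully usable" period concrete, here is a small sketch in present-day Java. The class names (Base, Derived, ConstructorOrderDemo) are made up for illustration. A superclass constructor calls an overridable method, and the subclass method runs before the subclass constructor has had a chance to set its field:

```java
// Run with: java ConstructorOrderDemo.java
class ConstructorOrderDemo {
    public static void main(String[] args) {
        Derived d = new Derived();
        System.out.println("during construction: " + d.seenDuringConstruction); // name=null
        System.out.println("after construction:  " + d.describe());             // name=ready
    }
}

class Base {
    // Records what describe() returned while Base() was still running.
    final String seenDuringConstruction;

    Base() {
        // Dispatches to Derived.describe() before Derived's fields are assigned.
        seenDuringConstruction = describe();
    }

    String describe() {
        return "base";
    }
}

class Derived extends Base {
    private final String name;

    Derived() {
        super();             // Base() runs first; name still holds its default value, null
        this.name = "ready"; // only now is the object fully usable
    }

    @Override
    String describe() {
        return "name=" + name;
    }
}
```

Even a final field is observed as null here, which is exactly the kind of half-built state the construction rules try (and sometimes fail) to fence off.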
Suppose all methods were created exactly equal (and the static keyword were abolished and all members defaulted to private and all references to final... but I digress...). A Class object might have a new method (overridable, of course) which creates new instances of that class:
class Class {
  public new() { 
    return new this;
  }
}
The new operator in this scenario is reduced to simple allocation and the object returned is in a state of readiness (or lack of it) called ALLOCATED. Methods may be invoked on the object at any time once it is allocated. However, some methods may require initialization to occur because they depend on fields that are not (or may not be) initialized. To ensure proper initialization for a given method, rules could be specified:
class MyClass {
  Integer value;

  public void setValue(Integer value) { 
    final this.value = value; 
  }

  public Integer square() { 
    return value * value; 
  }
}
In this case, the setValue() method makes the value field final, ensuring that setValue() can only be called once. The square() method, since it uses value, implicitly requires that setValue() be called first. The compiler can deduce that the object is not ready for square() until setValue() has been called. It would seem that, with proper analysis, only fields which escape static analysis would actually need to be zeroed out. This would speed up object allocation without compromising robustness or security.
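Java already performs a small-scale version of this analysis for blank final local variables (its "definite assignment" rules). The sketch below is only an analogy, not the proposed language feature: the local variable stands in for the field, and the assignment stands in for setValue(). The class and method names are made up, and the quoted compiler messages are approximately what javac reports.

```java
// Run with: java DefiniteAssignmentDemo.java
class DefiniteAssignmentDemo {
    static int squareOfNine() {
        final int value;          // declared but unassigned, roughly the ALLOCATED state

        // return value * value;  // rejected: "variable value might not have been initialized"

        value = 9;                // plays the role of setValue(9)

        // value = 10;            // rejected: "variable value might already have been assigned"

        return value * value;     // accepted: value is definitely assigned here
    }

    public static void main(String[] args) {
        System.out.println(squareOfNine()); // 81
    }
}
```

The idea in this post amounts to stretching that same reasoning across method calls on an object instead of stopping at the edge of a single method body.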
So what does the use case look like?
MyClass my;
my.setValue(9);
my.square();
Right. You never use the new operator. You simply start using objects and the compiler tells you if you try to do something the object isn’t ready for.
But what about methods where it’s impossible to tell what got initialized?
class MyClass {
  Integer value;

  public void setValue(Integer value) { 
    if ( ... ) {
      this.value = value; 
      final value; 
    }
  }

  public Integer square() { 
    return value * value;
  }
}
Well, value would have to be zeroed out by the compiler. Then a runtime check would be added to verify that value has been initialized before it is used. If it hasn’t been initialized and square() is called, the object would throw an exception. The compiler could also warn about this inefficiency with a message like “uncertain object readiness in square()”.
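For comparison, here is roughly what those compiler-inserted checks might look like if written by hand in today's Java. This is a sketch under my own assumptions: the class name CheckedMyClass is invented, null stands in for the zeroed-out field, and IllegalStateException is simply my pick for the exception type.

```java
// Run with: java ReadinessDemo.java
class ReadinessDemo {
    public static void main(String[] args) {
        CheckedMyClass my = new CheckedMyClass();
        try {
            my.square();                 // value was never set
        } catch (IllegalStateException e) {
            System.out.println("not ready: " + e.getMessage());
        }
        my.setValue(9);
        System.out.println(my.square()); // 81
    }
}

class CheckedMyClass {
    // null plays the role of the zeroed-out, not-yet-initialized field.
    private Integer value;

    void setValue(Integer value) {
        if (value == null) {
            throw new IllegalArgumentException("value must not be null");
        }
        // Emulates "final value": a second assignment is refused at run time.
        if (this.value != null) {
            throw new IllegalStateException("value is already final");
        }
        this.value = value;
    }

    Integer square() {
        // The readiness check the compiler would otherwise generate.
        if (value == null) {
            throw new IllegalStateException("uncertain object readiness in square()");
        }
        return value * value;
    }
}
```

Every method that touches value needs its own copy of the check, which is the tedium the compiler would be taking over.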
With a first-class property syntax, the need for constructors pretty much disappears. What’s more, object initialization becomes self-documenting as the initializing properties all have names.
Besides the implicitly determined readiness of fields to participate in method operations, objects may also have logical readiness. This goes beyond construction into the realm of program correctness. For example, it may only be valid to call download() if a call to connect() succeeded. If a call to connect() succeeds, the object is in a logical state CONNECTED. If the call fails, it remains in the logical state DISCONNECTED. Rather than rely on documentation to ensure people use the API correctly, the implicit state machine involved in assessing method readiness here can be made explicit:
class MyClass {
  state DISCONNECTED;

  public boolean connect() {
    ...
    if (succeeded) {
      state CONNECTED;
    }
    return succeeded;
  }

  public download() {
    require state CONNECTED;
    ...
  }
}
Now, if the compiler sees you do this:
  MyClass my;
  my.connect();
  my.download();
as opposed to this:
  MyClass my;
  if (my.connect()) {
    my.download();
  }
it can see the potential problem and complain “Object may not be ready for download()”. Of course static analysis will not always succeed, but the automatic run-time check will pretty much always be better than hand-coding the check (or worse, not coding it).
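The hand-coded equivalent in today's Java might look something like the sketch below. The names StatefulClient and reachable are hypothetical; reachable is just a stand-in for the elided connection logic, and the returned string is a stand-in for whatever download() would really produce.

```java
// Run with: java StateDemo.java
class StateDemo {
    public static void main(String[] args) {
        StatefulClient failing = new StatefulClient(false);
        failing.connect();               // result ignored, as in the first usage above
        try {
            failing.download();
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }

        StatefulClient working = new StatefulClient(true);
        if (working.connect()) {         // result checked, as in the second usage above
            System.out.println(working.download());
        }
    }
}

class StatefulClient {
    enum State { DISCONNECTED, CONNECTED }

    private State state = State.DISCONNECTED;
    private final boolean reachable;     // hypothetical stand-in for real connection logic

    StatefulClient(boolean reachable) {
        this.reachable = reachable;
    }

    boolean connect() {
        boolean succeeded = reachable;
        if (succeeded) {
            state = State.CONNECTED;     // corresponds to "state CONNECTED;"
        }
        return succeeded;
    }

    String download() {
        // Corresponds to "require state CONNECTED;"
        if (state != State.CONNECTED) {
            throw new IllegalStateException("Object may not be ready for download()");
        }
        return "payload";
    }
}
```

Nothing stops the first usage from compiling in Java; the mistake only surfaces at run time, and only because someone remembered to write the check.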
Saturday, March 28, 2009