Thursday, 11 January 2007

Fortress

I spent some time today reading the Fortress manual. Fortress is a new parallel programming language for scientific computing being developed as part of DARPA's high-performance computing initiative (think supercomputers simulating weapons). It's very interesting work, particularly so to me since I did some work on X10, IBM's language contribution to the same project. (Cray has one too, called Chapel). It has been a couple of years since I last looked at X10 and no doubt it's changed considerably during that time.

It's very impressive --- no surprise since Guy Steele is deeply involved. There are a several important ideas:


  • Major syntax and Unicode work to make it possible to write code in mathematical notation
  • Traits-based object system
  • A units type system for scalars (e.g., "x := 10 kg/s")
  • Very powerful generics, with static value parameters as well as type parameters, and very free usage of "where" constraints
  • Explicit component programming mode
  • Various kinds of built-in parallelism, in particular, "for" loop iterations run in parallel by default
  • Inter-thread communication via "atomic" blocks, not locks, reduction operators, and the ability to use explicitly spawned threads as futures

I think some of these ideas work out better than others. I think the component programming features, where components are added, removed and upgraded in "fortresses", may prove to be a mistake. It can be very difficult to integrate that sort of thing into IDEs, source code control systems, development processes, build systems, etc.

The generics system has some rather unusual properties compared to similar systems for other "real" languages. For example you can write "trait C[T] extends C[E] where E extends T" --- i.e., declare that C is contravariant. But that's just the beginning, it seems you can create any network of constraints you like, with free type variables (e.g. E) not bound in the signature of the declaration. You can also write scary stuff like "object C extends D[T]", i.e., C has every method declared in D for all T, i.e., C has a possibly-infinite number of methods. It'll be very interesting to see what happens when they try to prove decidability of typechecking. It'll be even more interesting to see what kind of error messages the compiler produces.

I think the drive for ubiquitous implicit parallelism plus the fact that by default actions on shared data are not atomic is ... bold. It may work out OK for the scientific programming audience. For wider use I think we really need to restrict thread interleavings and eliminate implicit races. For example, Fortress's "for" loop can execute iterations in parallel and in any order, but for a general purpose language (like oh say Javascript), I would have the iterations appear to execute in any serial order --- i.e., run them in any order but make each iteration atomic.

Their approach to reductions seems really good. E.g. you can write (not real Fortress syntax):

x := 0;
for (i in 0..99) {
x += a[i];
}

They promise to notice that x can be computed efficiently via reduction --- i.e., that you can run the loop iterations in parallel and compute x in parallel too, using a tree-structured computation of logarithmic depth. This actually works by exploiting the type system, which declares via traits that addition over numbers is a monoid (or something). In general the standard libraries encode the rich mathematical structure of the fundamental types via the trait system; very cool.

The drive for mathematical syntax is an interesting experiment. I really wonder how it will work out. I think much depends on their ability to create a good IDE --- not just an editor, but also a debugger, not to mention pretty-printing compiler errors. And then we have to see whether people prefer writing programs that way. They argue that mathematical syntax has evolved over thousands of years, which is true enough, but it evolved under different constraints than we're dealing with here. It will make for awesome demos at least.

There's a nod towards exposing to the programmer the distribution of computation over nodes, via memory "regions" and explicit array distributions, but they don't take it as far as X10, where data location was part of the type system and all non-local references were statically exposed and explicit. It's hard to say right now what the right direction is.

Altogether lots of fun. It'll be great to watch how things play out.



6 comments:

  1. Your reduced addition loop is nondeterministic in IEEE floating point. That's a pretty strange design decision.

    ReplyDelete
  2. *It can be very difficult to integrate that sort of thing into IDEs, source code control systems, development processes, build systems, etc*
    True; see the completely flawed Firefox-extension mechanism.

    ReplyDelete
  3. Ben - interestingly enough, I think they deal with this in the mathematical traits - there's Distributive, and ApproximatelyDistributive. Pretty much all the math traits have approximate forms. I'm sure they did this for the very reason you suggest.
    I've been reading the fortress manual too - very interesting stuff. I'm not sure what you mean by constraints, and i didn't read the traits section, however, properties aren't enforced, only used for 3rd party tools/unspecified compiler checking. From their useage they appear to be very powerful indeed. Perhaps, as you point out, too powerful.
    BTW, nice layout, and nice title (tolkien reference)

    ReplyDelete
  4. It seems the generic system is just a stripped version of the one in C++.
    It's kind of funny, you code in C++ but you don't even know that language.
    Actually all those features are available in C++, most as libraries, showing the power of expressivity of the language.
    Of course, the mathematical syntax you can get in is probably quite limited by C++ syntax. Things like units aren't really a problem though.
    From what I have seen, the Unicode support is fairly limited too, like the one in Java.
    Only "atomic" blocks can't be done. Well they could be done through locks, but you can't just implement a transactional system like that.

    ReplyDelete
  5. Robert O'Callahan1 February 2007 21:58

    C++ templates don't support modular typechecking, so there is no type constraint system to even compare with Fortress's.

    ReplyDelete
  6. They do.
    Be it with C++09 concepts or other alternative techniques, available today in boost.

    ReplyDelete