Sometimes, cleaning up your code makes it slower even when it shouldn't. For example, you might extract code from a function and make it a separate function. Maybe the compiler should inline your code, but it doesn't, or it does but the result still isn't the same as the original single function. This is an abstraction penalty: you use the programming language to structure your code with additional abstractions, and you pay a performance penalty.
Grouping variables into structs or objects is a common source of abstraction penalties. For example, converting local variables nscoord x, y; to nsPoint pt; may prevent the coordinates from being placed in registers, in some compilers. Or the coordinates may need to be placed in memory if you pass an nsPoint to another function instead of two nscoords.
That aside, there's a much worse abstraction penalty associated with objects in common "safe" languages such as Java or C#: they require heap allocation and garbage collection. (I'm ignoring C# structs for now for reasons that will hopefully become obvious.) And despite what people will tell you, garbage collection is still slower than stack allocation, often by a lot. When I was doing serious Java programming a few years ago, the way to make Java code fast was still to avoid using objects or manually recycle them to avoid allocation and GC costs. And the problem is that a whole lot of language features are tied up with objects --- interfaces, casting, information hiding, code reuse. I suspect one of the biggest performance issues switching from C++ to Java or C# may be that in C++ you can stack-allocate objects, but in Java and C# you can't. Just look at all the places where Gecko uses nsAutoString or nsAutoTArray to good effect.
The problem is of course that in a safe language, you can't just hand out pointers to stack-allocated objects like candy, because someone might store one away in a dark place and then later, long after the method has returned, start using it as a dangling pointer. This cannot be allowed.
What the world needs is a type system that allows references to stack objects to be given out, but restricts their usage so that no-one can use them after the objects are destroyed. Fortunately many such systems exist. A whole family of "region calculi" have been developed, for example. However, I prefer "ownership types" as a more intuitive and generally useful framework; in particular I like Boyapati-style "explicitly parametric ownership types".
These work by adding an "ownership parameter" to each class type. These parameters are a lot like generic type parameters, but they denote the "owner" of the object. The basic rule is that an object cannot outlive its owner. This rule is enforced by the typechecker. Here's an example of how this can work:
class Vector<O@P> {
(O@P[])@this array;
int size;
Vector() { array = new (O@P[100])@this; }
void add(O@P o) { array[size++] = o; }
};
class Iterator<O@P, Q> where Q <= P {
Vector<O@P>@Q inner;
int i;
Iterator(Vector<O@P>@Q inner) { this.inner = inner; }
O@P next() { return inner.array[i++]; }
};
void f() {
C@method c1 = new C@method(), c2 = new C@method();
C@method c3 = g(c1, c2);
}
C@R g<R>(C@R c1, C@R c2) {
Vector@method v = new Vector@method<C@R>();
v.add(c1);
v.add(c2);
h<R>(v);
return new C@R();
}
void h<R,S>(Vector<C@R>@S v) where S <= R {
Iterator@method iter = new Iterator@method<C@R, S>(v);
while (C@R c = iter.next()) { ... }
}
This code creates two objects, puts them into a standard growable vector class, then passes the vector to another function which uses a standard iterator to iterate over the vector. What's cool is that the compiler knows a lot about the lifetimes of these objects. In particular the objects which are
@method, i.e., owned by "the current method", cannot outlive the method --- even though we pass these objects to other methods and even store references to them in other objects.
These properties are enforced in two ways. The main constraint is that code can only refer to objects whose owners are in scope. For example, you cannot declare C@R glob; and set glob = c1; in g to "leak" a reference to c1 --- the correct R is not in scope at the global level.
Similarly, an object like Vector that takes ownership parameters can only take parameters that "outlive" the object itself. This guarantees that fields of the object cannot become dangling references. In some cases, such as Iterator, we need to add "outlives" constraints to its ownership parameters; Q <= P would be read "P outlives Q". The semantics of ownership mean that "P outlives method" for all in-scope ownership parameters P.
Given the code above, the compiler could, using only local reasoning,
- Stack allocate c1, c2, v and iter
- Allocate space for g's result in f's stack
- Inline the array object directly into Vector --- the compiler knows that the array object cannot outlive its Vector, and because there is only one assignment, it cannot die before Vector either.
Therefore the above code requires no heap allocations at all. ("Inlining" objects into their containing objects is another important optimization, much like stack allocation, that common safe OO languages do not support.)
It's important to note that although a really good compiler could discover these optimizations in Java, for example, using escape analysis, there are major problems leaving it up to the compiler. The primary problem is that escape analysis is expensive and fragile and compilers don't do it very well. A secondary problem is that if you change your program so that escape analysis no longer works, the only way to tell is that your performance gets mysteriously worse. With explicit type constraints, you get a compile error instead. That's especially useful here because quite often when you create objects in a method you expect them to be method-local, and it would be great for the compiler to check that for you and give you an error if you're wrong (instead of generating code that's slow in Java or crashy in C++).
Another note is that although the annotation burden appears large, the code can be slimmed down considerably using sensible default ownership parameters and other shortcuts such as local type inference:
class Vector<O> { // O contains @P
O[] array; // fields default to @this
int size;
Vector() { array = new O[100]@this; }
void add(O o) { array[size++] = o; }
};
class Iterator<O, Q> where Q <= O@ {
Vector<O>@Q inner;
int i;
Iterator(Vector<O>@Q inner) { this.inner = inner; }
O next() { return inner.array[i++]; }
};
void f() {
// local objects default to @method
var c1 = new C(), c2 = new C();
var c3 = g(c1, c2);
}
C@R g(C@R c1, C@R c2) {
Vector v = new Vector<C@R>();
v.add(c1);
v.add(c2);
h(v);
return new C@R();
}
// implicit type and ownership parameters
void h(Vector v) {
var iter = new Iterator(v); // infer type parameters
while (var c = iter.next()) { ... }
}
There are even more tricks we can pull! For example, we know a @method object can't be accessed by other threads, so we can easily remove locking from it. In a transactional memory system we would know not to log accesses to it. When a method containing a @method object returns, we know the object must die, so we can deterministically run a destructor at that point, giving us the handy C++ pattern of using destructors for general cleanup on method exit. And when an object is about to go out of scope, we can often easily prove that the method's reference to the object is the only outstanding reference ("uniqueness"). In that case we are allowed to safely change the owner of the object to something else (which is never normally allowed)! This may require copying, depending on what optimizations have been performed. For example:
C@R f<R>() {
var c1 = new C@method(), c2 = new C@method();
var c = ... ? c1 : c2;
// may copy the object from our stack to elsewhere
return (C@R)c;
}
Tragically, this stuff doesn't really exist yet in a usable language, nor has anyone proven it to work programming at large scales. Sigh...