Tuesday 2 November 2010
For a while I've been thinking about exploiting fast browser JITs and JS "eval()" to build really fast emulators and other language runtimes. Tonight I was feeling jumpy so I went ahead and hacked something up.
I started by defining a really simple imaginary CPU instruction set, writing a simple program for that instruction set, and implementing a trivial interpreter to execute it. It executes 100M instructions in about 7.5 seconds (Firefox opt trunk on my laptop), which is already pretty good! But we can do a lot better.
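To make the idea concrete, here is a minimal sketch of what an interpreter like this might look like. The opcodes, the three-slot instruction encoding, and all of the names below are hypothetical, made up for illustration; this is not the actual instruction set or program from this experiment.

```javascript
// Hypothetical toy instruction set (an assumption for this sketch).
const OP_LOADI = 0; // regs[a] = b
const OP_ADD = 1;   // regs[a] = regs[a] + regs[b]
const OP_SUBI = 2;  // regs[a] = regs[a] - b
const OP_JNZ = 3;   // if (regs[a] !== 0) pc = b
const OP_HALT = 4;  // stop

// Each instruction occupies three consecutive array slots: opcode, a, b.
// Returns the number of instructions executed.
function interpret(program, regs) {
  let pc = 0;
  let executed = 0;
  while (true) {
    // Decode the instruction at pc, then advance to the next one.
    const op = program[pc], a = program[pc + 1], b = program[pc + 2];
    pc += 3;
    executed++;
    switch (op) {
      case OP_LOADI: regs[a] = b; break;
      case OP_ADD: regs[a] = regs[a] + regs[b]; break;
      case OP_SUBI: regs[a] = regs[a] - b; break;
      case OP_JNZ: if (regs[a] !== 0) pc = b; break;
      case OP_HALT: return executed;
      default: throw new Error("bad opcode " + op);
    }
  }
}

// Example program: sum 10 + 9 + ... + 1 into r0, using r1 as the counter.
const sumProgram = [
  OP_LOADI, 0, 0,  // pc 0:  r0 = 0
  OP_LOADI, 1, 10, // pc 3:  r1 = 10
  OP_ADD, 0, 1,    // pc 6:  r0 += r1   (loop start)
  OP_SUBI, 1, 1,   // pc 9:  r1 -= 1
  OP_JNZ, 1, 6,    // pc 12: if (r1 !== 0) goto 6
  OP_HALT, 0, 0,   // pc 15: stop
];
const sumRegs = [0, 0, 0, 0];
const executedCount = interpret(sumProgram, sumRegs);
```

Note that every instruction pays for a decode and a `switch` dispatch each time it runs, which is the overhead a compiler can remove.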
Of course, there isn't anything special about JS here; you could do the same thing with Java or .NET bytecodes, or in other languages with "eval". Still, since there are a lot of browser-based emulators out there, it would be cool to see people try this technique. It's hard to predict how well it would do in a real emulator, but I have hopes. Real CPU instruction sets involve more complex instruction decoding than my toy, and the trace compiler is able to hide that.
Update: I thought of a couple more simple optimizations overnight, plus Jesse suggested one, so I implemented them all in about 15 minutes. Developing compilers with an edit/reload cycle is fun!
- Jesse suggested using "new Function(code)" instead of eval(). Total time now 1.05 seconds.
- I wrapped the body of each function in "while (true)". Then, wherever the trace sets the new PC to the PC at the start of the trace (i.e., it's a simple loop), we can replace that with "continue;" to actually loop inside our trace function without returning to the trace dispatch loop. Total time now 0.8 seconds.
- I turned the "memory" and "regs" arrays into JS typed arrays (using feature detection to fall back for older browsers). Total time now 0.65 seconds.
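The three optimizations above can be sketched together roughly as follows. This is an illustrative sketch under stated assumptions, not the code from this experiment: the opcodes, the three-slot encoding, and the helper names (`makeRegs`, `compileTrace`, `runTraces`) are all hypothetical. It generates JS source for a straight-line run of instructions, wraps it in "while (true)", emits "continue;" when a branch targets the start of its own trace, builds the function with "new Function", and uses a typed array for the registers when one is available.

```javascript
// Hypothetical toy instruction set (an assumption for this sketch).
const OP_LOADI = 0, OP_ADD = 1, OP_SUBI = 2, OP_JNZ = 3, OP_HALT = 4;

// Feature detection: use a typed array if the engine has one,
// otherwise fall back to a plain zero-filled array.
function makeRegs(n) {
  if (typeof Int32Array !== "undefined") return new Int32Array(n);
  const plain = [];
  for (let i = 0; i < n; i++) plain.push(0);
  return plain;
}

// Compile the instructions starting at startPc, up to and including the
// first branch or halt, into a JS function that returns the next PC
// (or -1 to stop). Operands are baked into the source as constants, so
// the generated code does no instruction decoding at run time.
function compileTrace(program, startPc) {
  const lines = [];
  let pc = startPc;
  let done = false;
  while (!done) {
    const op = program[pc], a = program[pc + 1], b = program[pc + 2];
    pc += 3;
    switch (op) {
      case OP_LOADI: lines.push(`regs[${a}] = ${b};`); break;
      case OP_ADD: lines.push(`regs[${a}] = regs[${a}] + regs[${b}];`); break;
      case OP_SUBI: lines.push(`regs[${a}] = regs[${a}] - ${b};`); break;
      case OP_JNZ:
        if (b === startPc) {
          // Simple loop: jump back to the top of this same function.
          lines.push(`if (regs[${a}] !== 0) continue;`);
          lines.push(`return ${pc};`);
        } else {
          lines.push(`return regs[${a}] !== 0 ? ${b} : ${pc};`);
        }
        done = true;
        break;
      case OP_HALT:
        lines.push("return -1;");
        done = true;
        break;
      default: throw new Error("bad opcode " + op);
    }
  }
  const body = "while (true) {\n" + lines.join("\n") + "\n}";
  return new Function("regs", body);
}

// Trace dispatch loop: compile each start PC once, cache it, and run it.
// Returns the number of distinct traces that were compiled.
function runTraces(program, regs) {
  const traces = {};
  let pc = 0;
  while (pc !== -1) {
    let trace = traces[pc];
    if (!trace) trace = traces[pc] = compileTrace(program, pc);
    pc = trace(regs);
  }
  return Object.keys(traces).length;
}

// Example program: sum 10 + 9 + ... + 1 into r0, using r1 as the counter.
const loopProgram = [
  OP_LOADI, 0, 0,  // pc 0:  r0 = 0
  OP_LOADI, 1, 10, // pc 3:  r1 = 10
  OP_ADD, 0, 1,    // pc 6:  r0 += r1   (loop start)
  OP_SUBI, 1, 1,   // pc 9:  r1 -= 1
  OP_JNZ, 1, 6,    // pc 12: if (r1 !== 0) goto 6
  OP_HALT, 0, 0,   // pc 15: stop
];
const loopRegs = makeRegs(4);
const traceCount = runTraces(loopProgram, loopRegs);
```

In this sketch the trace starting at pc 6 is the hot loop: once entered, it keeps iterating inside one generated function and only returns to the dispatch loop when the counter reaches zero. One caveat of the typed-array path is that an `Int32Array` wraps values to 32 bits, whereas the plain-array fallback does not, so a real emulator would need to make the two paths agree.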
So performance about doubled; now we're at about 150 MIPS. That's enough for one day!
(For perspective, the maximum clock of my laptop CPU is about 3 GHz, so we're executing one emulated instruction every 20 native processor cycles. That's amazing!)
Oh, and just for kicks, here are some totally unscientific benchmarks of other browsers. IE9 beta 1: interpreter 7.3 seconds, trace compiler 2.8 seconds. Chrome 7.0.517.41: interpreter 6.7 seconds, trace compiler 1.6 seconds. Neither of those browsers supports typed arrays. I should note that typed arrays sped up the interpreter a lot in Firefox; it now takes about 3.3 seconds.