Sam Pullara was wondering if Java’s modern JITing, and the resulting performance improvements over Objective-C dynamic dispatch, would give Android an edge over iPhone. I contend that the performance question won’t be answered in the high-level language, but rather in the API designs. Let me explain…

I’ve worked on many software systems that implement their parts in different languages and runtimes for various reasons, but there is a general pattern in all of them that I’m going to call the Traffic Cop Pattern: the higher level languages/runtimes have APIs exposed to them that are implemented in a lower level language, or perhaps the functionality is even implemented directly in silicon. The higher level language/runtime ends up playing traffic cop to large amounts of data that actually get crunched by faster code and/or dedicated silicon. The traffic cop orchestrates the flow; it doesn’t do the actual driving of every vehicle, much less the loading and unloading of the passengers. This conducting, driving, and passenger management is increasingly happening in parallel, possibly not even on the same box. What enables this to work well is the APIs between the layers, not the implementation choices of the highest level language.
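To make the pattern concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: `checksum_chunks` is a hypothetical helper, and `zlib.crc32` stands in for whatever natively implemented routine does the real crunching. The high level code never touches individual bytes; it just routes buffers to the fast code and collects the results.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def checksum_chunks(chunks, workers=4):
    """Play traffic cop: hand each buffer to natively implemented code.

    checksum_chunks is a hypothetical helper for illustration. The
    per-byte work happens inside zlib.crc32, not in Python.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with chunks.
        return list(pool.map(zlib.crc32, chunks))


# Eight 1 MB buffers; the Python layer only directs them.
chunks = [bytes([i]) * 1_000_000 for i in range(8)]
checksums = checksum_chunks(chunks)
```

Whether the chunks are really processed in parallel depends on the runtime, but the shape is the point: the API takes whole buffers, so the slow layer stays out of the inner loop.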

Example: a web browser. JavaScript orchestrates the native implementations of the DOM and CSS, and it orchestrates the resource loading system, which in turn orchestrates the network stack and various levels of caching. The markup, an even higher level “language” than JS if you will, directs the construction of the DOM and CSS structures and the instantiation of the JS contexts. In turn, the parsers and the runtime instantiations of the DOM and CSS engine are implemented in faster languages. Increasingly, those systems in turn call APIs for hardware acceleration of animations, image decoding, compositing, video decoding, etc. Each level of the system orchestrates the layers below it so that they work together to get higher performance, all the way down to hardware acceleration.

Given that, the real performance wins will come from APIs that let the programmer coordinate all these systems from a higher level language and get their (almost certainly) slower high level code out of the way. You don’t want to write another XML parser, do you? There is probably a faster version already. If you’re lucky, it works well with the resource loading system (network, disk, whatever) to stream the data as it arrives. Maybe, like XmlPullParser on Android, it avoids allocating memory so that it doesn’t trigger a GC. You get the idea.
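As a rough sketch of what “stream the data as it arrives” looks like from the high level side, here is a Python example using the standard library’s `xml.etree.ElementTree.XMLPullParser`, which is backed by the expat C library in CPython. Note this is Python’s pull parser, not Android’s XmlPullParser; `count_items`, the `item` tag, and the sample chunks are made up for illustration.

```python
import xml.etree.ElementTree as ET


def count_items(chunks, tag="item"):
    """Feed XML to the parser chunk by chunk and count matching elements.

    count_items and the default tag are hypothetical; chunks can be any
    iterable of text pieces, e.g. reads from a socket or a file.
    """
    parser = ET.XMLPullParser(events=("end",))
    count = 0
    for chunk in chunks:
        parser.feed(chunk)  # the native parser does the tokenizing
        for _event, elem in parser.read_events():
            if elem.tag == tag:
                count += 1
                elem.clear()  # drop the element's text and children
    parser.close()
    return count


# Chunk boundaries fall mid-tag on purpose, as they would on a network.
incoming = ["<feed><item>a</item><it", "em>b</item><item>c", "</item></feed>"]
total = count_items(incoming)
```

The high level code never sees characters or angle brackets. It hands over whatever bytes have shown up and reacts to finished elements, which is about as far out of the way as it can get.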

At this point I feel compelled to point out that Apple hasn’t yet exposed the H.264 hardware decoder to other developers on iPhone. Only in the last month did they expose it on Mac OS X, and Adobe immediately revved Flash to use it. The Flash alpha on Android isn’t yet using hardware acceleration, and it shows. If you control the OS, frameworks, sandbox, and hardware, you can seriously put anyone else on your platform at a disadvantage at will.

The compiled vs. interpreted language question is interesting, but it is less relevant now that the V8 team has proven that you can use JIT techniques successfully even on a language as dynamic as JavaScript and still get good results. That works because the code that actually executes is almost always reducible to a set of states, call graphs, and types that are finite and knowable. If you’re tracing, as Dalvik and TraceMonkey do, you don’t even have to worry about anything other than what is actually executing. After that it’s just a question of how many clever optimizations you can pull off, and whether one language gives you enough more of them than another that doing the work up front (static language compilation) justifies the language choice. Yes, I know the Obj-C runtime hasn’t gone here yet, but I think one of the points Sam was making is that it could…
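The claim that running code settles into a small, knowable set of types can be illustrated with a toy per-call-site cache. To be clear, this is not how V8, Dalvik, or TraceMonkey are implemented; `CallSite` and its fields are hypothetical names, and the sketch only shows the underlying observation that the expensive dynamic lookup is needed once per type a call site actually sees.

```python
class CallSite:
    """Toy call site that remembers the method resolved per receiver type.

    Hypothetical illustration only, not any real runtime's design.
    """

    def __init__(self, method_name):
        self.method_name = method_name
        self.cache = {}   # receiver type -> resolved function
        self.misses = 0   # how often the slow path ran

    def call(self, receiver, *args):
        kind = type(receiver)
        target = self.cache.get(kind)
        if target is None:
            # Slow path: full dynamic lookup, once per type seen here.
            self.misses += 1
            target = getattr(kind, self.method_name)
            self.cache[kind] = target
        return target(receiver, *args)


site = CallSite("upper")
results = [site.call(word) for word in ["a", "b", "c"]]
```

After three calls with strings the slow path has run once. A real JIT goes much further, but it is betting on the same thing: most call sites only ever see a handful of types.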

Lest you think this Traffic Cop pattern is new, think again:
The entire architecture of the IBM Cell processor

I’d like to call out Microsoft as perhaps having the best understanding of this pattern and applying it to the brave new multicore world.