It seems like this is the summer of multicore panic. I can’t open a technical magazine without some pundit declaring that our current programming paradigms are useless when running on new multicore CPU architectures. Venerable analyst John Dvorak, who usually understands what he is writing about, coughed up this mess (with minor excisions for brevity):
“Nobody wants to face the fact that Linux, Mac OS X, Microsoft Windows XP, and Vista are based on OS designs that are as old as the hills …
“Meanwhile, none of these operating systems has the power to make multicore chips work as advertised. In the end, these chips are little more than novelties. Intel actually has the gall to claim that its extra cores can save power by automatically shutting down when they’re not in use. This just means they’ll be shut down most of the time. If the software could take advantage of these extra cores, there wouldn’t be any need to shut them down….
“Why can’t operating systems simply dedicate extra cores to housekeeping chores and cool background tasks? It’s simple enough to work. Here are a few choice uses for core dedication. There are six of them, since we may see a six-core chip on the road ahead.”
According to this piece in Wired, things are even more dire than that:
“The problem is that many software applications weren’t written for chips with multiple cores, and the hardware is advancing so fast that the software runs the risk of being left behind.
“‘You can imagine a scenario where people stop buying laptops and PCs because we can’t figure this out,’ said David Patterson, a computer-architecture expert and computer science professor at the University of California, Berkeley.”
People are going to stop buying computers? Panic indeed!
This chorus of impending disaster reminds me of the build-up to Operation Iraqi Freedom (OIF). We’re hearing that there is a huge threat out there, and if we don’t do something quickly, life is going to get real bad. Companies like Intel and Microsoft are investing in new compiler technology, and other people are suggesting we need to completely change our programming paradigms. DDJ has set Herb Sutter loose with a concurrency column. Are the End of Days upon us?
But perhaps predictably, just like with OIF, the hard evidence just isn’t there. My new dual core system seems to run quite a bit faster than the machine it replaced. A similar quad core system runs even faster. Although John Dvorak seems to think those extra cores are doing nothing but enriching the electric utilities, something good must be happening. If there is trouble headed this way, it doesn’t seem to be here yet. Is it as close as some think?
The First Dose of Reality
First let’s look at the short-term predictions of doom. Dvorak says, “Meanwhile, none of these operating systems has the power to make multicore chips work as advertised.” In fact, John is off base here. Linux, OS X, and Windows have all had good support for Symmetric Multiprocessing (SMP) for some time, and the new multicore chips are designed to work in this environment. Each core looks like a tightly-coupled CPU, and all of these modern operating systems are adept at using them.
Just as an example, using the spiffy Sysinternals Process Explorer, I see that my Windows XP system has 48 processes with 446 threads. Windows is happily farming those 446 threads out to both cores on my system as time becomes available. If I had four cores, it could still keep all of them busy. If I had eight cores, my threads would still be distributed among all of them.
Modern programs tend to be moderately multithreaded, with individual threads dedicated to the GUI, to user I/O, to socket I/O, and often to computation. Multicore CPUs take advantage of this quite well. And we don’t need any new technology to make sure multithreaded programs are well-behaved – these techniques are pretty well understood, and in use in most software you use today. Modern languages like Java support threads and address various concurrency issues right out of the box. C++ requires non-standard libraries, but all modern C++ environments worth their salt deal with multithreading in a fairly sane way.
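To make the thread-per-job structure concrete, here is a minimal Python sketch of a program that hands its computation to a worker thread while the main thread stays free for other work. The function name `compute`, the workload, and the polling loop (a stand-in for a GUI event loop) are all invented for illustration. It shows only the structure: whether the two threads actually land on different cores is up to the runtime and the O/S, and CPython in particular does not run CPU-bound threads in parallel.

```python
import queue
import threading

def compute(n, results):
    """CPU-bound work done off the main thread; the result goes back via a queue."""
    total = sum(i * i for i in range(n))
    results.put(total)

results = queue.Queue()  # thread-safe channel between worker and main thread
worker = threading.Thread(target=compute, args=(100_000, results))
worker.start()

# The main thread remains available while the worker runs.
ticks = 0
while worker.is_alive():
    ticks += 1                 # stand-in for handling one UI event
    worker.join(timeout=0.01)  # wait briefly, then go back to "UI" work

answer = results.get()         # the worker has finished, so this returns at once
print(answer)
```

The queue is the only shared object, which is one of the well-understood ways to keep a multithreaded program well-behaved without hand-written locking.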
What Multicore Does Right
In fact, going from one core to two or four is not only not a disaster, it does a good job of addressing one big problem with many programs: the CPU-bound thread. Good examples of threads of this variety include:
- Unzipping a 100 MB file
- Performing some complex operation on a 10 Megapixel photo
- The rendering thread in a 3D game engine
- Spreadsheet recalculation
In the single-core days, a program that was executing a task like this would really bring a system to its knees. The O/S is generally smart enough to make sure that the CPU-bound task doesn’t get all the processor time, but the inordinate load does make it really hard for things like the UI to keep up. You’ve seen it – you have an hourglass cursor, the screen is redrawing at a 0.1 fps rate, and you can’t pull up a system menu to save your life.
On a multicore system, the situation gets a lot better. The task doing heavy computation might be tying up one core, but the O/S can continue running UI and other tasks on other cores, and this really helps with overall responsiveness. At the same time, the computationally intensive thread is getting fewer context switches, and hopefully getting its job done faster.
The Worst Case Future of Multicore
So two cores are better than one – nobody should be arguing that point, except possibly John Dvorak. Let’s suppose that Moore’s Law continues to apply to transistor size, and that Intel and AMD do nothing with that extra space but lay down new cores. If that’s the case, we should see a core quadrupling every three or four years.
In this future view, by 2010 we should have the first eight-core systems. In 2014, we’re up to 32 cores. By 2017, we’ve reached an incredible 128 core CPU on a desktop machine.
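The arithmetic behind that timeline is easy to check. The short Python sketch below uses only the milestones stated above (8 cores in 2010, 32 in 2014, 128 in 2017); the function names are made up for the example. It also works out the average doubling period those milestones imply.

```python
import math

# Milestones as stated in the text: (year, cores).
milestones = [(2010, 8), (2014, 32), (2017, 128)]

def project(start_cores, steps, factor=4):
    """Core counts after repeatedly multiplying by `factor`."""
    return [start_cores * factor ** i for i in range(steps)]

def doubling_period(first, last):
    """Average years per doubling of core count between two milestones."""
    (year0, cores0), (year1, cores1) = first, last
    doublings = math.log2(cores1 / cores0)
    return (year1 - year0) / doublings

print(project(8, 3))                                   # [8, 32, 128]
print(doubling_period(milestones[0], milestones[-1]))  # 1.75
```

Going from 8 to 128 cores is four doublings in seven years, or one doubling every 1.75 years, which is the kind of pace people usually have in mind when they invoke Moore’s Law.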
So what are things going to be like in 2010 with my eight-core system? First of all, it seems only natural that I’m going to have more processes with more threads running. Furthermore, with the eye-candy arms race between the three competing operating systems in full swing, you can bet there are going to be more threads that need massive amounts of CPU time. I’ll have more monitor real-estate, more pixels, and will probably have more media access. All of these things are going to tie up those eight cores just fine, using today’s programming paradigms with little or no change.
It’s a little harder to see this system scaling well to 32 cores, and at 128 cores it definitely starts to fall apart. We just aren’t going to have enough active threads on a system – many of those cores will be in Dvorak’s hypothetical idle state. So this may be when we start needing a new paradigm.
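A back-of-the-envelope Python sketch shows why the idle cores appear. It assumes, purely for illustration, that a desktop has about 12 threads that actually want CPU time at any given moment; that number is hypothetical, not a measurement. The model is deliberately crude: a core is busy only if there is a runnable thread to put on it.

```python
def busy_fraction(runnable_threads, cores):
    """Fraction of cores with work to do, given the threads ready to run."""
    return min(runnable_threads, cores) / cores

RUNNABLE = 12  # hypothetical count of threads wanting CPU time at once

for cores in (2, 8, 32, 128):
    print(f"{cores:>3} cores: {busy_fraction(RUNNABLE, cores):.1%} busy")
```

Under that assumption two or eight cores stay fully occupied, 32 cores are 37.5% busy, and 128 cores are under 10% busy. The total thread count on a system matters much less than how many of those threads are ready to run at the same time.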
Of course, it’s pretty hard to predict the real path of both hardware and software architectures ten years out. Things like the conversion to a worldwide networked infrastructure, the universal adoption of GUI desktops, and a fairly quick changeover to true multitasking operating systems were all changes that seem obvious in retrospect but weren’t so obvious a few years ahead of time.
Likewise, we speculate that silicon fabrication technologies will progress at a steady rate, but the overall architecture of desktop PCs is influenced by many unpredictable factors. And remember one thing about Moore’s law: “past performance is no indication of future returns.” It has held up extraordinarily well for virtually the lifetime of IC fabrication, but it was an educated guess to begin with, not a Hari Seldon-like map of future technology.
So making big changes to our software development systems based on a 10-year horizon is pretty risky.
It’s Worse Than That
Assuming we have 128-core machines in ten years, the doomsayers feel that we need a new, massively parallel approach to software. As much work as possible needs to be spread out among multiple CPUs.
Further, our current techniques for communicating between threads are judged to be impractical. We’re told that we need lock-free transaction systems that allow these highly distributed algorithms to communicate freely and efficiently.
Okay, if we concede that, then we have a problem.
Let me take you back in time to my tender undergraduate days in the late 1970s at the University of Virginia School of Engineering. In a class on computer architectures, we were having a long discussion about parallel algorithms, which back then were being used on the current generation of supercomputers, such as the Cray-1. I speculated to my young and optimistic professor, Alfred Weaver, that perhaps the human mind just wasn’t cut out for designing parallel algorithms – maybe we’re just built linear. Dr. Weaver wasn’t dogmatic about this, but speculated that we didn’t really have much experience with that frame of mind, so it was too early to rule it out. He also pointed out that even if we didn’t have the ability to parallelize linear algorithms, it may well be that advanced compilers could do the job for us.
Thirty years later, I’d say programmers aren’t thinking in terms of parallelism any more than they were back then. We have separate threads of execution, but they might as well be separate tasks – not parallel distributions of one algorithm.
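The distinction is easier to see in code. The Python sketch below takes a single algorithm (a sum of squares) and distributes it across a pool of workers, which is a different thing from running several unrelated threads side by side. The function names and the choice of four workers are invented for the example. One loud caveat: it uses a thread pool to stay short and portable, and CPython does not run CPU-bound threads in parallel, so this shows the shape of the decomposition, not a speedup. Swapping in `ProcessPoolExecutor` is the usual way to get real multicore execution in Python.

```python
from concurrent.futures import ThreadPoolExecutor

def sum_of_squares(chunk):
    """The piece of the algorithm that each worker runs on its own slice."""
    return sum(x * x for x in chunk)

def split(data, parts):
    """Split data into `parts` contiguous chunks of nearly equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks

def parallel_sum_of_squares(data, workers=4):
    """Fan the chunks out to the pool, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum_of_squares, split(data, workers)))

data = list(range(1000))
total = parallel_sum_of_squares(data)
print(total == sum(x * x for x in data))  # True
```

Even in this toy case the programmer has to decide how to cut up the data and how to recombine the partial results. That is exactly the design work most of us never do.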
And of course, thirty years ago computer-generated parallelism was just one of many technologies that were just around the corner. Another was artificial intelligence. And of course, continuous natural speech recognition. Ah, we were dreamers back then.
Well, artificial intelligence is still just around the corner, and some people think it may be one of those problems that we might have to concede we aren’t going to solve. Could generation of code for a massively multicore environment be just as fanciful?
Microsoft says in a recent EE Times article that we’re at least ten years out from having proven compilers and computer architectures for massively parallel computing. (This article gives a nice detailed roadmap of what people think we really need to do in order to take advantage of the multicore future. It’s not pretty.)
If Microsoft says something is ten years out, we know that it either:
- Is never going to happen
- Will take at least ten years
- Will be done by a competitor next year
I’m inclined to believe something between the first two options. This is just a really hard problem, and if it takes ten years to get the tools ready, it’s going to be thirty years before the application base is converted. I mean, are you ready to start writing all your programs in a functional programming language? Probably not. I just learned to spell Haskell.
What, Me Worry?
So far I’ve said that we’re okay for the near term, but the far term is going to be gloomy if we have huge numbers of cores. But now I’m going to tell you that we shouldn’t be worried about the far term either. Yes, everything is going to be okay.
First and most important, Intel and AMD don’t operate in a vacuum. They are producing dual- and quad-core machines today because they give nice performance gains. Intel knows darned well that if they produced a 128-core machine today (which they could probably do, for a price) they wouldn’t sell any, because we can’t take advantage of that technology.
So Intel, AMD, Microsoft, and the other big players will be engaging in an interesting dance. Some steps will be development of new hardware, but other steps will be investments intended to pay off with new compilers, languages, and paradigms designed to work with that new hardware.
And if those investments don’t work, and the software doesn’t come through, will the world come to an end? No. Instead of jumping from 16 to 32 cores, Intel will accept smaller performance gains by cranking up clock speeds, boosting the instruction set, making bigger caches, or whatever it takes.
Our industry press thrives on a good crisis. The switch to multicore processors has presented the brain trust with the opportunity to drum up a convincing one, and they haven’t let us down. Just try to take it with a grain of salt. The crises we’ve had in the past have mostly been resolved with boring, step-wise evolution, and this one will be no different. Maybe 15 or 20 years from now we’ll be writing code in some new transaction-based language that spreads a program effortlessly across hundreds of cores. Or, more likely, we’ll still be writing code in C++, Java, and .NET, and we’ll have clever tools that accomplish the same result.
Either way, the beat will go on.