When Apple first revealed the new Mac Pro, I criticized the design in posts and in my regional newsletter column, saying Apple could have done a better job in three ways. The GPUs need to be upgradable, there should be more RAM slots, and certainly more internal SSDs.
I would also add that Apple needs to provide better support for OpenCL and OpenGL. So, that's four complaints, and it turns out that Apple is responsible for two of the four, while Intel is to blame for the others.
But, it's important to know who will be using this Mac Pro. Apple has been touting this machine for 4K video editing. In my previous blog post, "Who Was the New Mac Pro Designed For?" I talk about the need for a machine that does serious number crunching.
Now is the time to discuss the limitations of the Mac Pro.
The first complaint is the lack of upgradable GPUs. The architecture of the Mac Pro includes two GPUs called the AMD FirePro. These GPUs have very fast RAM, and a good amount. The FirePros come in three flavors called the D300, D500 and D700.
These GPUs have the power to run three 4K monitors and high end video editing applications. They can also be used to supply lots of number crunching capacity to programs that crunch numbers for hours and hours. As I discussed in the previous blog, this is my area of interest.
The FirePro GPUs Apple is using are among the most powerful GPUs you can purchase today. The primary difference from those you might purchase for a Windows box is that they are clocked a little slower to reduce power and thermal requirements, and Apple's FirePros do not contain ECC RAM.
But one thing we all know is that next year, there will be more powerful GPUs, and the Mac Pro you buy today will be out of date no later than two years after purchase. Is there a technical difficulty in making the GPUs upgradable, or does Apple think they will sell more Mac Pros this way? The fact is they will sell fewer, because those engineers and scientists who might buy a new Mac Pro will realize they cannot justify the short life of the machine to their bosses.
My second complaint is the limit on RAM slots. The new Mac Pro has four RAM slots, which with today's available RAM will hold a maximum of 64 GB of RAM. To someone using a Mac to get on the Internet, do some email, create presentations, and so on, this is a lot of RAM. To people doing high end 3D work, performing complex simulations and calculations, it might be OK.
To the folks at www.realearthmodels.com who create detailed 3D models of the real world, 64 GB is just a start. They would prefer 128 GB of RAM, more if possible. But, there are reasons Apple limited the Mac Pro to four slots, which will be discussed a little later.
My third complaint is the single internal SSD. SSDs are faster than hard drives and seem to be more reliable (ask the folks at ProSoftEngr about the people who come to them with problems). But it's difficult to find an SSD of one terabyte capacity, and there is nothing larger. Using Real Earth Models as an example, a 1 TB SSD *might* be able to hold system files, applications, and a single model of the complexity of Crazy Horse.
People will argue that external Thunderbolt 2 hard drives should be used. Beyond the problem that there are few true Thunderbolt drives available, the Mac Pro suffers from some I/O issues that could slow the external drives (discussed later). And, I have yet to see any good reviews of Thunderbolt that delve into the question of latency. Anyone who argues that Thunderbolt is really fast and that latency isn't an issue has never dealt with computations that take hours or days. These computations will access storage millions of times, and even an extra nanosecond of latency added by external storage vs internal storage will add up to significant time.
My fourth complaint is the lack of support from Apple for OpenCL and OpenGL. Support for these graphics and computing programming interfaces is extremely important to those who need the Mac Pro. Apple invented OpenCL, yet is not fully supporting it. OpenGL is so poorly supported in OS X that speed tests have shown you are better off running the older generation Mac Pro in Windows than in OS X for OpenGL work where performance matters. There is no evidence that Mavericks on the new Mac Pro handles OpenGL any better.
So, there are four complaints. Apple is responsible for the lack of upgradable GPUs and OpenGL and OpenCL support. Intel is responsible for the limited RAM and single internal SSD. How? Read on for the technical nitty gritty.
Based on Anand's work of deriving the Mac Pro's architecture, it is possible to say that Intel has limited the I/O of the Xeon chips used in the Mac Pro and other workstation class machines. I have no fear of basing my conclusions on Anand's efforts; like Mr. Spock, his guesses are better than most people's facts. See Anand's article here:
www.anandtech.com/show/7603/mac-pro-review-late-2013 .
I/O, or input/output, is the term used for the part of a computer system's architecture that gets data into and out of the CPU in sufficient quantities, and fast enough, to support the capabilities of the CPU. Fast I/O has been a central issue since the days of big mainframe computers, and remains an issue with today's workstation computers.
The new Mac Pro is powered by an Intel Xeon E5 processor with number crunching assistance from two FirePro GPUs. Together, these processors can perform teraflops (that's trillions of math operations per second) if they aren't choked by slow I/O. So, let's use Anand's efforts to examine the I/O in the Mac Pro.
Start with the Xeon E5 CPU, which has 40 lanes of PCIe 3.0 I/O, along with four RAM channels, and a 2 GB/s interconnect (or Direct Media Interface in Intel parlance) to other chips. This is shown in the following diagram:
Four RAM channels? The memory controller is now built into the E5, rather than a separate chip. So, Apple is limited to four RAM slots, unless they want to multiplex the memory channels, which will impact memory access speed. This is a limitation Intel built into their chips.
Next are the 40 lanes of PCIe 3.0, used for the GPUs and Thunderbolt I/O. PCIe 3.0 runs at 985 megabytes per second per lane, an impressive number. Those super fast GPUs each need 16 lanes, using a total of 32 of the 40 lanes.
This leaves 8 lanes for the Thunderbolt 2 ports. There are six Thunderbolt 2 ports, each needing 2 lanes of PCIe 3.0. Anand explains how the six ports are controlled from three TB controllers in pairs; this is important to those plugging peripherals into the Mac Pro (read Anand's article). For us, it is important that there are six ports requiring 12 lanes of PCIe 3.0 I/O, and only 8 lanes are available. Oops.
This means the Thunderbolt ports could be choked if you have very many peripherals plugged in, possibly limiting the performance of external drives you rely on in lieu of the single internal SSD. And, who knows if this architecture somehow limits the performance of an external TB drive, even if it's the only thing plugged in? Only future performance tests will give some indication of this.
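The lane arithmetic above can be written out as a quick check. This is only a sketch using the lane counts quoted in this post; the constant and function names are made up for illustration, and the real allocation inside the machine may be arranged differently.

```python
# Lane counts as quoted in this post (derived from Anand's review).
# All names here are illustrative, not taken from any Apple or Intel tool.
CPU_PCIE3_LANES = 40      # PCIe 3.0 lanes on the Xeon E5
LANES_PER_GPU = 16
GPU_COUNT = 2
THUNDERBOLT_PORTS = 6
LANES_PER_TB_PORT = 2     # per-port figure used in this post


def lane_budget():
    """Return how the CPU's PCIe 3.0 lanes are spent and what is missing."""
    gpu_lanes = LANES_PER_GPU * GPU_COUNT
    left_for_thunderbolt = CPU_PCIE3_LANES - gpu_lanes
    thunderbolt_demand = THUNDERBOLT_PORTS * LANES_PER_TB_PORT
    return {
        "gpu_lanes": gpu_lanes,
        "left_for_thunderbolt": left_for_thunderbolt,
        "thunderbolt_demand": thunderbolt_demand,
        "shortfall": max(0, thunderbolt_demand - left_for_thunderbolt),
    }


if __name__ == "__main__":
    for name, lanes in lane_budget().items():
        print(f"{name}: {lanes} lanes")
```

With these numbers the Thunderbolt ports come up four lanes short, which is the "Oops" above.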
Another strike against Intel, and I haven't yet told you why the Mac Pro is limited to the single internal SSD.
We need more I/O for the SSD, ethernet connections and USB. This is where that Xeon interconnect comes in. It is connected to a PCH or Platform Controller Hub, which is a PCIe hub. The one used by Apple in the Mac Pro adds 8 lanes of PCIe I/O. Not PCIe 3.0, but PCIe 2.0! Intel doesn't bother to make a hub for these workstations that uses PCIe 3.0. Intel, you aren't winning any friends for yourself!
This little PCH has 8 lanes for the remaining I/O we must have. Two lanes are used for the two Gigabit ethernet ports. One lane is used for the WiFi controller, and one lane is used for the four USB 3.0 ports.
Hmmm. One lane for four USB ports? PCIe 2.0 runs at 500 MB/s per lane. USB 3 is claimed to run at 500 to 640 MB/s, but experience tells us USB peripherals will typically top out at half that. So, we'll be conservative and say we need 250 MB/s per port. Four ports then need 1000 MB/s, if all four are used, and they are hooked into a single PCIe lane providing 500 MB/s. This does not compute! Another bottleneck in Intel's architecture.
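Here is the same USB estimate as a few lines of Python, so you can try other port counts. It is a rough sketch: the 250 MB/s per-port figure is my conservative guess from above, not a measurement, and the names are invented for this example.

```python
PCIE2_LANE_MB_S = 500        # PCIe 2.0 per-lane figure used in this post
USB3_REALISTIC_MB_S = 250    # assumed real-world throughput per busy port
USB_PORTS = 4


def usb_oversubscription(active_ports=USB_PORTS):
    """Ratio of USB demand to the single PCIe 2.0 lane feeding the ports.

    A value above 1.0 means the lane cannot keep up with the ports.
    """
    demand_mb_s = active_ports * USB3_REALISTIC_MB_S
    return demand_mb_s / PCIE2_LANE_MB_S


if __name__ == "__main__":
    for ports in range(1, USB_PORTS + 1):
        print(f"{ports} busy port(s): {usb_oversubscription(ports):.1f}x the lane")
```

Under these assumptions, two busy ports already fill the lane and four ask for twice what it can deliver.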
That leaves us with 4 PCIe 2.0 lanes for the SSD, which needs … 4 lanes or a total of 2 GB/s. Do you remember that the interconnect from the PCH back to the Xeon E5 runs at 2 GB/s? So, the SSD can theoretically run at full speed, if you aren't running *any* I/O for the ethernet, WiFi or USB ports.
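To see how little headroom that leaves, here is a small sketch that adds up the hub's lanes against the 2 GB/s link back to the CPU. The lane assignments are the ones listed in this post; the names are illustrative, and the "peak" figure is raw lane capacity, not what the devices would really draw (Gigabit ethernet, for one, cannot fill a whole lane).

```python
PCIE2_LANE_MB_S = 500     # PCIe 2.0 per-lane figure used in this post
UPLINK_MB_S = 2000        # 2 GB/s interconnect from the hub back to the Xeon

# Lane assignments on the hub as described in this post.
PCH_LANES = {"ethernet": 2, "wifi": 1, "usb3": 1, "ssd": 4}


def pch_summary():
    """Return (total lanes, raw lane capacity, SSD peak, uplink left over)."""
    total_lanes = sum(PCH_LANES.values())
    lane_capacity_mb_s = total_lanes * PCIE2_LANE_MB_S
    ssd_peak_mb_s = PCH_LANES["ssd"] * PCIE2_LANE_MB_S
    uplink_left_mb_s = UPLINK_MB_S - ssd_peak_mb_s
    return total_lanes, lane_capacity_mb_s, ssd_peak_mb_s, uplink_left_mb_s


if __name__ == "__main__":
    lanes, capacity, ssd, left = pch_summary()
    print(f"lanes in use: {lanes}")
    print(f"raw lane capacity: {capacity} MB/s vs uplink: {UPLINK_MB_S} MB/s")
    print(f"SSD peak: {ssd} MB/s, uplink left for everything else: {left} MB/s")
```

All eight lanes are spoken for, and the SSD by itself can claim the entire uplink.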
And, there is no room on the PCH for a second SSD. So, the lack of a second SSD is Intel's fault, not Apple's. Intel is shipping CPUs with PCIe 3.0, and no I/O support at those bandwidths.
What could Apple do to solve these I/O problems? Some have suggested the Mac Pro should have two CPUs, along with the two GPUs. It looks like the cost of two Xeons with 4 cores each wouldn't be much more than that of a single Xeon with 8 cores, and so on. This could really help the I/O problem if the system software could keep performance numbers up. But, there is still that nasty problem of the interconnect being limited to 2 GB/s. That would slow down communication between two CPUs using PCIe 3.0.
The bottom line is that Intel needs to do a much better job. And, don't get me started on the lack of Haswell support in the Xeon chips months after the low end chips have it!