Tuesday, 29 April 2008

Boot Camp vs Parallels vs VMware Fusion

Update: (2009/06/01) This post is quite old now, and both companies have released new versions. As a very brief summary, I'm actually finding the performance pretty similar these days (they've both obviously put a lot of work into the areas they lagged behind the other) so I'd say it's now much more an issue of cost, eye candy and utility features (rather than performance) to decide on one or the other. Personally, I'm using Parallels v4... for now (;

I did some benchmarks a while back (August 2007) that I had always intended to post online somewhere, and since I've finally gone and made this blog I'm posting them, albeit a little late.

Now firstly, there have been updates to Parallels and Fusion since August '07, so these benchmarks are out of date, but I don't think that makes them totally worthless. Having updated Parallels and Fusion in April this year, and tested both with Visual Studio 2005, my (subjective) feeling is that the performance (at least for compiling a C++ project in Visual Studio in Windows) has changed very little since August last year. Still, I'm supplying these benchmarks with a few grains of salt...

I ran these benchmarks using SiSoft Sandra, which I've always liked for benchmarking (you get each measure individually, and it doesn't try to combine them all into a single weighted average that has no real-world measurable value - but then again, as an Engineer and Physicist I may be a little biased against calculations that fail on dimensional analysis... sorry, I digress...).

My platform is a MacBook Pro 15", 2.2 GHz Core 2 Duo (4MB), 800 MHz Bus, 2 GiB 667 MHz RAM (2x1 GiB DIMMs), GeForce 8600M GT 128MB 16xPCIe (and so on, as per the stock MacBook Pro from August '07 with the 160 GB HDD upgrade).

The benchmarks were run in SiSoft Sandra as mentioned above, in a Windows XP machine on my MacBook Pro. It was actually the same Windows XP install for all three setups: I installed Windows in Boot Camp and was able to run the system either by launching one of the virtual machines (Parallels Desktop or VMware Fusion) or by booting in via Boot Camp. The upside of this was consistency (and, for me, ease of switching between VMs and Boot Camp for my actual work). The downside is that any hardware detection that takes place only at install time may cause lower-than-optimal performance on one of the VMs (I mention this only because I recall something about needing to re-install Windows XP if you upgrade to a multi-core processor, or to one that supports certain new CPU instruction sets, if you want to make use of the extra cores/instructions, so it might be relevant... I'm not sure). Sharing the Boot Camp install would also affect the disk performance of the VMs, since they're directly accessing the physical disk rather than a virtual disk.

The software versions at the time of benchmarking were
  • Parallels Desktop 3.0 build 4560
  • VMware Fusion 1.0 (either RC1 or build 51348 - can't remember which)
  • Boot Camp (beta 1.3 or 1.4 - can't remember which)
For the record, current versions are Parallels 3.0 build 5584, Fusion 1.1.2, Boot Camp 2.1.

Where possible the machines were set up identically; however, Parallels still only supports a single processor for virtualisation. This obviously caused some major differences in the benchmarks, but is not necessarily a Bad Thing™ when you're using a VM. A single core is fine if the application(s) you use in the VM are, on the whole, limited to a single thread.

So down to the benchmarks... In each of the images below, the systems are represented as follows
  • Parallels - Red bars/lines, labelled "Current X" (e.g. Current Processor) - as the screenshots were taken from Parallels.
  • VMware Fusion - Orange bars/lines, labelled "VMWare Fusion 1.0"
  • Boot Camp - Green, labelled "BootCamp"
Apologies to the colour-blind; some of the graphs make it very difficult to differentiate between the data, and it didn't really help that I left a couple of other similar-performing devices in the graphs for comparison... ignore the blue and magenta scores (;

For a good (professional) explanation of these benchmarks (by the people who made them) you should check out SiSoft's Benchmarking 101 Pages.

First up, Processor Multi-Media - Fusion runs at almost native speed, while Parallels falls way behind at about 25% of native. Given that Parallels is only running with one processor, one would expect it to manage half the native speed, so it seems as if its one core is running at about half the speed (when it comes to Multi-Media instructions) of the cores when running natively or in Fusion.

Processor Arithmetic - Again Fusion is close to native speed (92.2% Dhrystone, 93.0% Whetstone), and Parallels, as expected, is sitting at about half native speed due to the single processor virtualisation (47.6% Dhrystone, 54.0% Whetstone). Actually, 'per-core' Parallels outperforms Fusion here by a few percentage points.
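
The 'per-core' comparison used in the last two paragraphs can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not something from the original benchmark runs: the function name is made up for the example, and it assumes that Sandra's scores scale linearly with the number of cores, which is a simplification.

```python
# Percent-of-native scores quoted in the post. "cores" is the number of
# virtual CPUs each VM exposed to Windows (Fusion: 2, Parallels: 1).
SCORES = {
    "Fusion": {"cores": 2, "Dhrystone": 92.2, "Whetstone": 93.0},
    "Parallels": {"cores": 1, "Dhrystone": 47.6, "Whetstone": 54.0},
}
NATIVE_CORES = 2  # the Core 2 Duo as seen under Boot Camp


def per_core_percent(total_percent, vm_cores, native_cores=NATIVE_CORES):
    """Speed of one VM core as a percentage of one native core.

    Hypothetical helper: assumes the benchmark scales linearly with the
    number of cores, so the whole-machine score is simply divided up.
    """
    return total_percent * native_cores / vm_cores


if __name__ == "__main__":
    for vm, data in SCORES.items():
        for bench in ("Dhrystone", "Whetstone"):
            pc = per_core_percent(data[bench], data["cores"])
            print(f"{vm:9s} {bench}: {pc:.1f}% of one native core")
    # Multi-Media: Parallels at roughly 25% of native on a single core
    print(f"Parallels Multi-Media: {per_core_percent(25.0, 1):.1f}%")
```

Under that linear-scaling assumption, a 25% whole-machine Multi-Media score on one core works out to about 50% of one native core, and Parallels' arithmetic scores work out higher per core than Fusion's.
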

Multi-Core Efficiency - Parallels doesn't show up in this graph, but again you can see Fusion falls just short of the native Boot Camp scores (looks to be around 90% of native values).

Cache and Memory - This benchmark shows memory bandwidth as a function of the block size, and indicates the speeds of the different levels of cache (note the significant steps around 32-64 kiB and 4 MiB). The two VMs perform at pretty much native speed above 128 kiB but things are very different below 32 kiB. Parallels is about 60-70% of native speed while Fusion seems to be about 40-50% faster than native speed in this range. My guess is it's actually something odd going on with the timing of this benchmark within the VM: I can't imagine it would be possible to generate memory bandwidths beyond that of the native L1 cache! I suspect the Fusion graph basically follows the native one in the 'real world' for cache speeds.

Memory Bandwidth - This one's quite surprising, as both VMs exceed the native memory bandwidth for both integer and floating point benchmarks. Perhaps (at least in the Boot Camp beta) Apple's drivers for Windows weren't quite up to date. For the integer benchmark, Parallels beats Fusion by 38%, and for the floating point benchmark, Fusion beats Parallels by 13.5%. Unless you really know what your applications are doing in terms of memory access, these results don't really give a clear winner...

Memory Latency (Linear) - Another benchmark that clearly indicates the different cache levels (note the jumps between 16 and 64 kiB, and again either side of 4 MiB). Parallels tends to sit one (or so) clock cycle faster than Fusion for the whole range, and seems to get further ahead as the test range size increases. Parallels only really deviates from native speed at/around the 4 MiB mark (i.e. the size of the L2 cache). From memory the graph is the same (or very similar) for the second processor in Boot Camp and Fusion (not worth graphing twice).

Memory Latency (Random) - Again, Parallels is slightly faster than Fusion for all the test range sizes, and the native Boot Camp benchmark jumps ahead significantly at/around the 4 MiB L2 cache limit.

File Systems - Fairly notable differences between the three platforms here, with Fusion about 20% faster than Parallels, and native Boot Camp disk access about 20% faster than Fusion.

Summary - Certainly back in August 2007 (drivers may have been updated since...) there was no clear winner between Boot Camp, Parallels Desktop and VMware Fusion for all-round performance. Boot Camp came top of everything except memory bandwidth, where it lost by quite a large margin to both VMs. Parallels tended to perform better than Fusion in memory and cache access, while Fusion's dual-core virtualisation put it a long way ahead of Parallels for raw CPU performance.

From my personal experience (here comes the subjective bit...) running Windows XP in Parallels and Fusion, even though Parallels only supports a single processor for virtualisation, Fusion seems to use a lot of its second core for processing graphics. I'm not 100% sure of this, but Parallels seems to do a better job of passing the work off to the graphics card (this might be due to its OpenGL support, since Fusion lacks OpenGL support... clutching at straws though), while Fusion does more of the rendering in CPU-land. I've seen Fusion using something like 80% of one processor while Windows is doing nothing but displaying a few windows; Parallels was down at 10-20% of the one processor (though the GPU does produce plenty of heat while it's running).

A more objective test, and this is what has me sold (for now...), is that the C++ library I spend a lot of time using at work takes about 3-4 minutes to compile in Visual Studio 2005 under Parallels, and about 12 minutes under VMware Fusion. This is a test I ran a few weeks back (April 2008) with the latest versions of both Parallels and Fusion, and I came to the same conclusion (that compiling a large C++ project in VS2005 was faster in Parallels than Fusion) in August 2007 as well. As for Boot Camp, I've given up on that completely - I hate rebooting (and the fact that Windows thinks the system clock should be in local time for some really stupid reasons, despite good arguments to the contrary).

I guess this suggests that compiling (at least the combination of C++, VS2005, Windows XP, and our code-base) is more memory-bandwidth/latency-bound than processor/disk-speed-bound.

Fortunately, I've ported this library and some of our applications to Mac OS X (thanks to Leopard's POSIX compliance) so I don't have to spend much time in the VMs any more (;

On a final note, I've noticed that a lot of people compare them by their features, interface, 'feel' and such, saying that one or the other feels more like a Mac program, or feels more robust, etc. I've avoided any mention of these because I didn't find them different enough in these areas to be able to suggest everyone (or even a significant majority) would feel the same way on these subjective matters.

Apart from graphics support and number of CPUs virtualised, they're very very similar applications on the surface, though performance can differ quite a bit for applications that are bound to the performance of specific components of the system.

1 comment:

Ali said...

great post, thanks for the tips! i'm actually trying to do the visual studio thing on my mac, so that last part really helps me decide to give parallels a try.