2009-09-23

Nehalem Processors Are Great



A client came to me a few months ago requesting help with a problem they had.

They have a system that does some (I guess you can call it) grid computing. They were using 20 desktops running XP to perform some calculations, and the process was taking 10 hours, utilizing ~100% CPU on each machine throughout.

Because of a change that was made in the algorithm, the process would now take 22 hours to complete for the same amount of calculations, and that was not an acceptable result.

We wanted to test whether it would be possible to cram a large number of virtual machines onto a few servers to do the work, and we came up with the following solution:

  1. 3 IBM x3550 Dual E5430 Processors
  2. 8 GB RAM
  3. ESX3i on each of them
  4. 8 Windows Server 2003 VM's on each server (total of 24 VM's)

We saw that when the Virtual machines were busy doing the calculation process, they were utilizing 100% of the vCPU, bringing the host to very close to 100% utilization during the calculations. This was the hard limit of 8 VM's on each host.

The amazing thing was that the same run that would have taken 22 hours on the physical desktops now took 8 hours on these three servers (2.75 times as fast, which is a 175% increase in performance). The client was thrilled!
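The speedup is easy to check by hand. Here is a minimal Python sketch (the function names are mine, purely for illustration) that turns the 22-hour and 8-hour run times into a speedup ratio and a percentage increase. Note that a ratio of 2.75 means the run is 2.75 times as fast, which is a 175% increase over the baseline.

```python
def speedup(old_hours: float, new_hours: float) -> float:
    """How many times faster the new run is compared to the old one."""
    return old_hours / new_hours


def percent_increase(old_hours: float, new_hours: float) -> float:
    """Performance increase over the old run, as a percentage."""
    return (speedup(old_hours, new_hours) - 1.0) * 100.0


# Run times from the tests: 22 hours on the desktops, 8 hours on the three servers.
ratio = speedup(22, 8)
gain = percent_increase(22, 8)
print(f"{ratio:.2f}x as fast, a {gain:.0f}% increase")
```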

We tested different configurations with more VM's and lower CPU limits, but since the utilization was 100% regardless of the speed of the vCPU, the optimal configuration remained 8 VM's per host.

The cost for the whole design (including a test server and a management server) was

prices1

Fast forward 6 months. Budget issues, etc. etc.

The client came back to say that they were ready to go forward with the project. But there was one slight problem. During that period IBM announced that they were moving to a new series of servers (x3550M2) and that the old ones were no longer available for purchase.

And also during that period ESX4 was released.

New servers were tested with the same data as before, this time on an x3550M2 with Dual E5530 Processors.

We fired up the tests with the same 8 VM's as before. The results were pretty much the same, except for one small thing. We were seeing that the CPU was only 50% utilized (or more correctly, only 50% of the cores were being used). Huh??

Where did I get another 8 cores from? The answer: Hyper-Threading!
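The core count can be sketched as a small calculation. This assumes both CPU models are quad-core, that the E5430 exposes one hardware thread per core while the E5530 exposes two with Hyper-Threading enabled, and that each VM has a single vCPU. The function name is hypothetical and only there for illustration.

```python
def logical_cpus(sockets: int, cores_per_socket: int, threads_per_core: int) -> int:
    """Number of logical CPUs the hypervisor sees on one host."""
    return sockets * cores_per_socket * threads_per_core


# Assumption: dual-socket, quad-core hosts in both generations.
old_host = logical_cpus(sockets=2, cores_per_socket=4, threads_per_core=1)  # E5430, no Hyper-Threading
new_host = logical_cpus(sockets=2, cores_per_socket=4, threads_per_core=2)  # E5530, Hyper-Threading on

# Assumption: 8 single-vCPU VMs, each keeping one logical CPU fully busy.
busy_vms = 8
busy_fraction = busy_vms / new_host

print(f"old host: {old_host} logical CPUs, new host: {new_host} logical CPUs")
print(f"8 busy VMs use {busy_fraction:.0%} of the new host")
```

Under those assumptions, the same 8 VM's that saturated the old host only occupy half of the logical CPUs on the new one.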

With Hyper-Threading enabled, the machine recognized 16 (logical) cores. So we deployed another 8 VM's (16 in total on one host). And of course there was no problem of RAM, because since all the machines are exactly the same,

I am not saying that this will work in all use cases, but in this one it did. We ran the tests, and instead of the 8 hours we had with the previous hardware, the job was now complete in 5 (a 60% increase in performance). With this metric, we could now reduce the number of physical hosts from 3 down to 2. Also, the VMware software in the original configuration (Foundation Accelerator Kit) was now replaced with VMware Essentials at a lower price (40%). In the same configuration we now set up the system with 2 ESX hosts and 32 VM's.
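The VM counts for the two designs follow directly from hosts times VM's per host. A minimal sketch (the `Design` class is a hypothetical helper, not something from the actual project):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Design:
    """One hardware design: how many hosts, and how many VMs each host runs."""
    hosts: int
    vms_per_host: int

    @property
    def total_vms(self) -> int:
        return self.hosts * self.vms_per_host


original = Design(hosts=3, vms_per_host=8)   # x3550 with E5430
revised = Design(hosts=2, vms_per_host=16)   # x3550M2 with E5530

print(f"original: {original.hosts} hosts, {original.total_vms} VMs")
print(f"revised:  {revised.hosts} hosts, {revised.total_vms} VMs")
```

So the revised design runs more VM's (32 instead of 24) on one host fewer.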

The new price for the project:

prices2

So yes, the Nehalem processors are a good thing, and in this specific case they managed to lower the costs and boost the performance.

Hope you enjoyed the ride!

6 comments:

iguy said...

Any ability to test with AMD's Istanbul or better?

michael said...

where does decrease in microsoft costs come from? the number of vms increased from 24 to 32.

Maish said...

iguy - I am sorry, but I do not have access to Istanbul processors at the moment, so I cannot give you results regarding these processors.

Maish said...

Thanks for your question Michael.

The licenses that were purchased for these ESX servers were Datacenter licenses, which are licensed per physical socket.

Because of the reduction in hardware, from 3 to 2 servers, I could reduce the number of Windows Server Datacenter licenses that the client needed for the project.

And of course the beauty of it all is that the Windows Server Datacenter license allows an unlimited number of Windows Server OS instances to run on that physical host, so the increase in the number of virtual machines here is irrelevant.
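To put numbers on that, here is a rough sketch. It assumes one license per physical socket and two sockets per host (both server models were dual-processor), and it ignores pricing and any minimum-purchase rules. The function name is only for illustration.

```python
def datacenter_licenses(hosts: int, sockets_per_host: int) -> int:
    """Per-socket licensing: the VM count does not enter the calculation."""
    return hosts * sockets_per_host


before = datacenter_licenses(hosts=3, sockets_per_host=2)  # original design, 24 VMs
after = datacenter_licenses(hosts=2, sockets_per_host=2)   # new design, 32 VMs

print(f"licenses before: {before}, after: {after}, saved: {before - after}")
```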

michael said...

thanks

PiroNet said...

Nice post, and following Moore's law, in 18 months you will be able to do it in less time. Hardware wise, eventually a single server will suffice... I love technology :)