The Problem With Vendor Sponsored Testing

I suppose this post has been a long time coming.

It was spurred into reality by an exchange with @bladeguy, who pointed out that Cisco, too, sponsors tests of their equipment – just like HP does with the Tolly reports.  At first, I’d intended to do a comparison of the Tolly reports and the Principled Technologies reports, looking for obvious (or not so obvious) bias.   Once I started down that path, however, I realized it really wasn’t necessary.   Sponsored tests (from any organization) will always be biased, and therefore unreliable from a technical perspective.   There are always tuning parameters that the “loser” will insist were wrong and skewed the results, and there are always different ways to architect the test that would have given the “loser” an edge.   That’s why they’re sponsored tests.

I commented briefly before on Tolly’s first HP vs. Cisco UCS report, so I won’t repeat it here.  Suffice it to say, the bloggers and such did a pretty good job of chopping it up.

My issue with the Tolly reports I’ve seen thus far, the bandwidth debacle and the airflow test, is that they simply don’t pass the smell test.   Yes, they’re repeatable.   Yes, the testing results are defensible.   But the conclusions and declarations of a “winner”?  Not so much.   I’m not faulting Tolly here.   They’re doing their job, producing the product that their customer has asked them for.  These tests aren’t sponsored by the HP engineering groups (for whom I have a lot of respect) looking to validate their technological prowess – they’re sponsored by the marketing departments to provide ammunition for the sales process.   As such, do you really think they’re going into this looking for a fair fight?   Of course not.   They’re going to stack the deck in their favor as much as they think they can get away with (and knowing marketing departments, more than they can get away with).   That’s what marketing groups (from every vendor) do.

@bladeguy pointed out that Cisco has engaged Principled Technologies to do some testing of UCS equipment versus both legacy and current HP equipment.  At first glance, I didn’t detect any significant bias – especially in the tests comparing legacy equipment to current UCS gear.   I’m not sure how any bias could be construed, since really they’re just showing the advantages and consolidation ratios achievable when moving from old gear to new gear.   Obviously you can’t compare against Cisco’s legacy servers (since there aren’t any), and HP servers are the logical choice since they have a huge server market share.   I would suspect that similar results would have been achieved when comparing legacy HP equipment against current HP equipment as well.   HP can perform similar tests (and I’m sure has) if they’d like to demonstrate that.

The more troublesome tests are those comparing current generations of equipment from two manufacturers.   The sponsor of the test will always win, or that report will never see the light of day.  That’s simply how it works.   Companies like Tolly, Principled Technologies, etc aren’t going to bite the hand that feeds them.   As such, they’re very careful to construct the tests such that the sponsor will prevail.   This is no secret in the industry.   It’s been discussed many times before.

Even the Principled Technologies tests that compared current generations of hardware looked like pretty fair fights to me.  If you look closely at the specifications of the tested systems, though, they really tend to reveal the benefits of more memory, or other such considerations, as opposed to the hardware itself.   @bladeguy pointed out several items in the Principled Technologies tests that, in his opinion, skewed the results towards Cisco.   I’m not in any position to refute his claims – but the items he mentioned really come down to tuning.   So essentially he’s saying that the HP equipment in the test wasn’t tuned properly, and I’m certainly not going to argue that point.   It’s a sponsored test, so the sponsor will be victorious.

And therein lies the problem.   Sponsored tests are meaningless, from any vendor.   I simply don’t believe that sponsored tests provide value to the technical community.  But that’s ok – they’re not targeted at the technical community.   They’re marketing tools, used by sales and marketing teams to sway the opinions of management decision makers with lots of “independent” results.    If I want to know which server platform is better for my environment, I’m going to do my own research, and if necessary invite the vendors in for a bake-off.   Prove it to me, with your tuning as necessary, and I’ll have the other vendors do the same.

My real problem with these tests, understanding that they’re not aimed at the technical community, is that many in the technical community use them to “prove” that their platform is superior to whomever they’re competing against at the moment.   Like politics, these kinds of arguments just make me tired.   Anyone coming into the argument already has their side picked – no amount of discussion is going to change their mind.   My point in blogging about UCS is not to sell it – I don’t sell gear.   It’s because I believe in the platform and enjoy educating people about it.

I happen to prefer Cisco UCS, yes.   If you’ve ever been in one of my UCS classes, you’ll have also heard me say that HP and IBM – Cisco’s chief rivals in this space – also make excellent equipment with some very compelling technologies.   The eventual “best” solution simply comes down to what’s right for your organization.   I understand why these sponsored tests exist, but to me, they actually lessen the position of the sponsor.   They make me wonder, “if your product is so good, why stack the deck in the test?”   The answer to that, of course, is that the engineers aren’t the ones requesting or sponsoring the test.

As came up in my UCS class today, for the vast majority of data center workloads, the small differences in performance that you might be able to get out of Vendor X or Vendor Y’s server are completely meaningless.   When I used to sell/install storage, I would get asked which company’s storage to buy if the customer wanted maximum performance.  My answer, every single time, was “HP, or IBM, or HDS, or EMC, or…”   Why?  Because technology companies are always leapfrogging each other with IOPS and MB/s and any other metric you can think of.   What’s the fastest today in a particular set of circumstances will get replaced tomorrow by someone else.

So what’s the solution?  Well, true independent testing, of course.   How do you do true independent testing?   You get a mediator (Tolly, Principled Technologies, etc. are fine), have representatives from both manufacturers agree on the testing criteria, and allow each manufacturer to submit their own architecture and tuning to meet the testing criteria.   The mediator then performs the tests.    The results are published with the opportunity for both manufacturers to respond to the results.   Think any marketing organization from any company would ever allow this to happen?   The standard line in the testing industry is “Vendor X was invited to participate, but declined.”  Of course they declined.   They’ve already lost before the first test was run.   I wouldn’t expect Cisco to participate in an HP-sponsored Tolly test any more than I’d expect HP to participate in a Cisco-sponsored Principled Technologies test.

Don’t chase that 1% performance delta.   You’ll just waste time and money.   Find the solution that meets your organizational needs, provides you the best management and support options, and buy it.   Let some other chump chase that magical “fastest” unicorn.   It doesn’t exist.  As in all things IT, “it depends.”

All comments are welcome, including those from testing organizations!

Private Isolated VSANs?

Ok, so this isn’t really UCS related.   Just a random thought I had today while working on a lab project… why don’t we have Private VSANs?   As in, the same type of technology as Private VLANs (PVLANs)?

First, some background.   Standard SAN best practice for access control is to use single-initiator/single-target zoning.   This means that there’s one zone for each combination of host port and storage, tape, or virtualization platform port.    Some administrators think this is overkill and create just a few zones with many initiators and a single target, but this is generally a bad idea.   The purpose of this post is not to argue for single-initiator zoning, since it’s accepted recommended practice.

Private VLANs provide a method for simplifying access control within an L2 Ethernet domain, restricting access between nodes.   Community PVLANs allow communication only between members of the same community and the promiscuous port(s).   This is actually fairly close to the idea of a fibre channel zone, with the distinction that fibre channel doesn’t have promiscuous ports.   Isolated PVLANs allow communication only between each individual node and the promiscuous port(s).   In a way, you could compare this to having a lot of initiator nodes zoned only to a single target node in fibre channel – but without the administrative overhead of zoning.

So, why not combine these approaches?   Having the concept of an Isolated Private VSAN would simplify some types of fibre channel deployments, by enforcing recommended practices around access control without the administrative overhead.  In a smaller environment, you could simply create an Isolated Private VSAN to contain the ports for a given fabric – set the storage ports as promiscuous, and all node ports would be restricted to connecting only to the storage ports – and prevented from communicating with each other.   In fact, I’d imagine that this would be enforced with standard FC zoning (since that’s what the hosts are expecting when they query the name server anyway) – really we’d just be automating the creation of the zones.   Cisco already does something similar by automatically creating zones when doing Inter-VSAN Routing (IVR).

For slightly larger environments, I could even see adding in the idea of Community Private VSANs – whereby you group initiators and designate specific target (promiscuous) ports per community – without having to add additional VSANs.

Now that I’m thinking out loud, why not have isolated zones instead?   Mark a zone as “isolated”, tag any necessary WWNs/ports/etc as promiscuous, and enforce the traditional zoning behind the scenes.
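
To make the idea a bit more concrete, here’s a rough Python sketch of what either flavor of the idea might automate behind the scenes.  This is purely hypothetical – no such feature or API exists that I know of, and the function name, WWNs, and VSAN number are all made up.  The point is just that an Isolated Private VSAN (or an “isolated zone”) collapses down to ordinary single-initiator/single-target zones, generated for you:

```python
# Hypothetical sketch only: what an "Isolated Private VSAN" (or an "isolated
# zone") might generate behind the scenes.  Promiscuous ports are the storage
# targets; isolated ports are the host initiators.  The output is plain old
# single-initiator/single-target zoning -- just automated.
from itertools import product


def build_isolated_pvsan_zones(vsan_id, promiscuous_pwwns, isolated_pwwns):
    """Return one zone per (initiator, target) pair."""
    zones = []
    for init, tgt in product(isolated_pwwns, promiscuous_pwwns):
        name = f"auto_v{vsan_id}_{init[-5:]}_{tgt[-5:]}".replace(":", "")
        # No zone is ever created between two isolated (initiator) ports,
        # which is exactly the PVLAN-style isolation behavior.
        zones.append({"name": name, "vsan": vsan_id, "members": [init, tgt]})
    return zones


if __name__ == "__main__":
    storage = ["50:06:01:60:3b:20:11:22"]                 # promiscuous ports
    hosts = ["20:00:00:25:b5:00:00:0a",
             "20:00:00:25:b5:00:00:0b"]                   # isolated ports
    for zone in build_isolated_pvsan_zones(10, storage, hosts):
        print(zone["name"], zone["members"])
```

In other words, the feature would just be sugar on top of the zoning we already do – which is exactly why it seems feasible.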

True, this approach wouldn’t accomplish anything that traditional VSANs and zoning do not.  The implementation would likely have to use traditional zoning behind the scenes.   Just as PVLANs aren’t used in every situation, PVSANs wouldn’t be either, but I could definitely see some use cases here.  So what do you think?   Am I completely insane?   Thoughts, comments, rebukes are all welcome.  🙂

UCSM 1.3(1c) Released!

Cisco has released UCS Manager version 1.3(1c).   This is the first public release in the 1.3 line, also known as “Aptos+”.

Release notes are here: http://www.cisco.com/en/US/docs/unified_computing/ucs/release/notes/ucs_22863.html

Haven’t gotten a chance to play with the new version yet, but there are some significant enhancements.    Among them…

  • 1 GE support on UCS6120 and UCS6140 Fabric Interconnects
    • On the 6120, you can now use 1GE transceivers in the first 8 physical ports.
    • On the 6140, you can now use 1GE transceivers in the first 16 physical ports.
    • Watch for a post soon on why I think this is a bad idea.  🙂
  • Support for the new, 2nd generation mezzanine cards
    • Both Emulex and Qlogic have produced a 2nd generation mezzanine card, using a single-chip design which should lower power consumption
      • Be warned that these new mezzanine cards won’t support the “Fabric Failover” feature as supported by the first generation CNAs, or by the VIC (Palo) adapter
      • These aren’t shipping quite yet, but will be soon
    • A Broadcom BCM57711 mezzanine adapter
      • This will compete with the Intel-based 10GE mezzanine adapters that UCS has had until now
      • The Broadcom card supports TOE (TCP Offload Engine) and iSCSI Offload, but not iSCSI boot
    • An updated Intel mezzanine adapter, based on the Niantic chipset
  • Support for the B440-M1 blade
    • The B440 blade will be available in a 2 or 4 processor configuration, using the Intel Xeon 7500 processors
    • Up to 4 SFF (small form factor) hard drives
    • 32 DIMM slots, for up to 256GB of memory
    • 2 Mezzanine slots
    • Full-width form factor
  • SSD hard drive support in B200-M2, B250-M2, and B440-M1 blades
    • First drive available is a Samsung 100GB SSD
  • Improved SNMP support
  • Ability to configure more BIOS options, such as virtualization options, through the service profiles
    • This is a big step towards making UCS blades honestly and truly stateless
    • Previously, I’d recommended that UCS customers configure each blade’s BIOS options to support virtualization when they received them, whether or not they were going to use ESX/etc on all of the blades.  This way they didn’t have to worry about setting them again when moving service profiles
  • Support for heterogeneous mezzanine adapters in full-width blades
  • Increased the supported limit of chassis to 14.
  • Increased the limit of VLANs in UCSM to 512
    • There’s been some discussion around this lately, particularly in the service provider space.   Many service providers need many more VLANs than this for their architectures.
    • I’ve seen reference to a workaround using ESX, Nexus 1000V, private VLANs, and a promiscuous VLAN through the Fabric Interconnect into an upstream switch, but I’m still trying to get my head around that one.  🙂
  • Ability to cap power levels per blade
    • Will have to wait until I get a chance to test out the code level to see what kinds of options are available here

Looking forward to seeing customer reaction to the new features.

Correction to L2 Forwarding Rules post

I posted here about the L2 forwarding rules when UCS is in EHV mode.   Several readers have pointed out a flaw in the logic I posted, which was taken from Cisco’s DCUCI course.   In Cisco’s defense, I did write that course.   🙂

At issue is how UCS deals with unknown unicast frames.   The other post incorrectly states that an unknown unicast frame received from a server port would be flooded out all other server ports participating in the same VLAN.   This is not the case.

The logic behind EHV mode is that it is impossible to have an unknown unicast address “behind” or “south” of the Fabric Interconnect.   All adapter MAC addresses are known, either because they were assigned in the service profile or inventoried (if using “derived” values).    For MAC addresses that are generated within a host, say for a virtual machine, the assumption is that at creation (or arrival through vMotion, etc) the MAC address will be announced using gratuitous ARP or other traffic generation techniques and the Fabric Interconnect can learn the address through normal L2 methods.

So to clarify, an unknown unicast frame received from a server port will be flooded out ONLY that interface’s pinned uplink.   Otherwise, all traffic destined for MAC addresses outside of UCS (such as the MAC address of a default gateway, for example) would also get flooded internally – which would not be a good thing.
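
For illustration, here’s a minimal Python sketch of the corrected rule.  This is just my mental model, not actual Fabric Interconnect code – the function name, MAC addresses, and port names are all made up:

```python
# Simplified mental model of the EHV-mode rule described above -- not real
# UCS code.  A unicast frame received on a server port either goes to the
# known destination, or (if the destination is unknown) out that server
# port's pinned uplink only.  It is never flooded to other server ports.

def forward_from_server_port(dst_mac, mac_table, pinned_uplink):
    """Return the egress port(s) for a unicast frame received on a server port."""
    if dst_mac in mac_table:
        # Known MAC: assigned via the service profile, inventoried, or learned
        # after a VM announced itself (gratuitous ARP on creation/vMotion).
        return [mac_table[dst_mac]]
    # Unknown unicast: it can't live "south" of the Fabric Interconnect,
    # so it must be northbound -- send it out the pinned uplink only.
    return [pinned_uplink]


mac_table = {"00:25:b5:00:00:0a": "server-port-1",
             "00:25:b5:00:00:0b": "server-port-2"}

# A frame for the default gateway's MAC (not a blade, so it's unknown here):
print(forward_from_server_port("00:0d:ec:aa:bb:cc", mac_table, "uplink-1"))
# -> ['uplink-1']   (not flooded to server-port-1 or server-port-2)
```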

Great UCS write-up by Joe Onisick

If you’re not currently following Joe’s blog over at definethecloud.wordpress.com, you should start.

He just posted another great article on why UCS is his server platform of choice.   Before you write him off as just another Cisco fan-boy, definitely take a look at his logic.   Even if you have another vendor preference, he presents some excellent points to consider.

Take a look : http://definethecloud.wordpress.com/2010/05/23/why-cisco-ucs-is-my-a-game-server-architecture/

UCS with disjointed L2 Domains

How do we deal with disjointed L2 domains in UCS?

To start, what’s a disjointed L2 domain?  This is where you have two Ethernet “clouds” that never connect, but must be accessed by the same UCS Fabric Interconnect.   Take, for example, a multi-tenant scenario where we have multiple customers’ servers within the same UCS cluster that must access different L2 domains.

How do we ensure that all traffic from Customer A’s blade only goes to their cloud, while Customer B’s blades only connect to their cloud?

The immediately obvious answer is to use UCS pin groups to tie each customer’s interfaces (through their vNIC configuration) to the uplinks that go to their cloud.   Unfortunately, this only solves half of the problem.

In the default operational mode of the Fabric Interconnects (called Ethernet Host Virtualizer, sometimes called End Host Virtualizer), only one uplink is used to receive multicast or broadcast traffic.   EHV mode assumes a single L2 fabric on the uplinks (VLAN considerations notwithstanding).  So in this example, only broadcasts or multicasts from one of the two fabrics would be accepted.   Obviously, this is a problem.

The only way to get around this is to put the Fabric Interconnects into Ethernet Switching mode.   This causes the Fabric Interconnect to behave as a standard L2 switch, including spanning tree considerations.  Now uplinks can receive broadcasts and multicasts regardless of the fabrics they are connected to.   This does, however, increase the administrative overhead of the Fabric Interconnects, and it reduces your flexibility in uplink configuration since all uplinks going into the same L2 domain must now be channeled together in order to use their combined bandwidth.
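
To make the broadcast half of the problem concrete, here’s a simplified Python sketch of how the two modes treat broadcasts arriving on the uplinks.  Again, this is just an illustration of the behavior described above, not how the Fabric Interconnect is actually implemented – the function, uplink names, and customer labels are invented for the example:

```python
# Illustration of why disjointed L2 clouds break in EHV mode: EHV picks a
# single uplink to receive broadcast/multicast traffic and drops broadcasts
# arriving on every other uplink, while Ethernet switching mode accepts them
# on all uplinks (at the cost of spanning tree, etc.).

def accept_broadcast(mode, ingress_uplink, designated_uplink):
    if mode == "ehv":
        return ingress_uplink == designated_uplink
    if mode == "switch":
        return True
    raise ValueError(f"unknown mode: {mode}")


uplinks = {"uplink-1": "Customer A cloud", "uplink-2": "Customer B cloud"}
designated = "uplink-1"   # the one broadcast/multicast receiver in EHV mode

for uplink, cloud in uplinks.items():
    ok = accept_broadcast("ehv", uplink, designated)
    print(f"{cloud} (via {uplink}): {'accepted' if ok else 'dropped'}")
# Customer A cloud (via uplink-1): accepted
# Customer B cloud (via uplink-2): dropped   <- pin groups can't fix this half
```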

To me, a more ideal situation would be to leave the Fabric Interconnects in EHV mode, and use another L2 switch to perform the split between fabrics, such as the following:

This configuration allows the Fabric Interconnect to remain in EHV mode and has the upstream L2 switches performing the split between the L2 domains.  ACLs can be configured on the L2 switches as necessary to isolate the networks, something that cannot be done on the Fabric Interconnect regardless of mode.

Both of these scenarios assume that the two customer L2 clouds are using different VLAN numbering, since there’s no capacity in UCS to distinguish between the same VLAN numbers on either Fabric.   There are certainly L3 and other translation tricks that you could use to accommodate this, but that’s an entirely different post.  🙂

UCS Bookstore

Since my previous announcements about various books on UCS and related topics have gotten rolled off the main page, I thought it would be useful to collect them into a bookstore.   I’ve added a link (see the navigation tabs at the top of the screen) to my UCS Bookstore.    Feel free to have a look, and make any suggestions on books you think should be included.

Defining VN-Link

The misunderstanding of Cisco’s enhanced network products for VMware environments has hit critical mass.  At this point very few people know what does what, how, and when to use it.  Hopefully this will demystify some of it.

VN-Link:

VN-Link is a product name for a family of products; it does not refer to any one specific product, so forget the idea of a hardware vs. software implementation.  Think of the Nexus family of switches: 1000v, 2000, 4000, 5000, 7000.  They’re all different products solving different design goals, but all are components of the Data Center 3.0 portfolio.  The separate products that fall under VN-Link are described below:

Nexus 1000v:

The Nexus 1000v is a Cisco software switch for VMware environments.  It is composed of two components: a Virtual Supervisor Module (VSM), which acts as the control plane, and a Virtual Ethernet Module (VEM), which acts as the data plane.  Two VSMs operate in an active/standby fashion for HA, and each VMware host gets a VEM.  This switch is managed through a Cisco NX-OS CLI and looks/smells/feels like a physical switch from a management perspective…that’s the whole point:

‘Network teams, here’s your network back, thanks for letting us borrow it.’  – The Server Team

The Nexus 1000v does not rely on any mystical magic such as VN-Tag (discussed shortly) to tag frames.  Standard Ethernet rules apply and MAC-based forwarding stays the same.  The software switch itself is proprietary (just like any hardware/software you buy from anyone) but the protocol used is standards-based Ethernet.

Hypervisor Bypass/Direct path I/O:

Hypervisor bypass is the ability for a VM to access PCIe adapter hardware directly in order to reduce the overhead on a physical VMware host’s CPU.  This functionality can be used with most any PCIe device via VMware’s Direct-Path I/O.  The advantage here is less host CPU/memory overhead for I/O virtualization.  The disadvantage is that there is currently no support for vMotion, and there are limits on the number of Direct-Path I/O devices per host.  This doesn’t require Cisco hardware or software to do, but Cisco does have a device that makes this more appealing in blade servers with limited PCIe devices (the VIC, discussed later.)

Pass Through Switching (PTS):

PTS is a capability of the Cisco UCS blade system.  It relies on management intelligence in the UCS Manager and switching intelligence on each host to pull management of the virtual network into the UCS Manager.  This allows a single point of management for the entire access layer, including the virtual switching environment – hooray, less management overhead and more doing something that matters!

PTS directly maps a Virtual Machine’s virtual NIC to an individual physical NIC port across a virtualized pass-through switch.  No internal switching is done in the VMware environment; instead, switching and policy enforcement are handled by the upstream Fabric Interconnect.  What makes this usable is the flexibility in the number of interfaces provided by the VIC, discussed next.

Virtual Interface Card (VIC), the card formerly known as Palo:

The Virtual Interface Card is a DCB- and FCoE-capable I/O card that is able to virtualize the PCIe bus to create multiple interfaces and present them to the operating system.  Theoretically the card can create a mix of 128 virtual Ethernet and Fibre Channel interfaces, but the real usable number is 58.  Don’t get upset about the numbers – your operating system can’t even support 58 PCIe devices today ;-).  Each virtual interface is known as a VIF and is presented to the operating system (any OS) as an individual PCIe device.  The operating system can then do anything it chooses and is capable of with the interfaces.  In the example of VMware, the VMware OS (yes, there is an actual OS installed there on the bare metal underneath the VMs) can assign those virtual interfaces (VIFs) to vSwitches, VMkernel ports, or Service Console ports, as it could with any other physical NIC.  It can also assign them to the 1000v, use them for Direct-Path I/O, or use them with Pass-Through Switching.  Even more important is the flexibility to use separate VIFs for each of these purposes on the same host (read: none of these is mutually exclusive.)   The VIC relies on VN-Tag for identification of individual VIFs; this is the only technology discussed in this post that uses VN-Tag (although there are others.)

VN-Tag:

VN-Tag is a frame tagging method that Cisco has proposed to the IEEE and is used in several Cisco hardware products.  VN-Tag serves two major purposes:

1) It provides individual identification for virtual interfaces (VIF.)

2) It allows a VN-Tag capable Ethernet switch to switch and forward frames for several VIFs sharing a set of uplinks.  For example, if VIFs 1 and 2 are both using port 1 as an uplink to a VN-Tag capable switch, the VN-Tag allows the switch to forward a frame back down the same link, because the destination VIF is different from the source VIF.
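
Here’s a tiny conceptual Python sketch of point 2 – not the actual VN-Tag frame format or switch implementation, just the forwarding idea the tag makes possible.  The class, port names, and VIF numbers are invented for the example:

```python
# Conceptual sketch of VN-Tag hairpin forwarding -- not the real frame format.
# Two VIFs share one physical uplink; because the tag identifies source and
# destination VIFs, the switch may forward a frame back out the very port it
# arrived on, something classic Ethernet forwarding would never do.
from dataclasses import dataclass


@dataclass
class TaggedFrame:
    src_vif: int
    dst_vif: int


# Both VIFs sit behind the same physical switch port in this example.
vif_to_port = {1: "port-1", 2: "port-1"}


def egress_port(frame, ingress_port):
    out = vif_to_port[frame.dst_vif]
    if out == ingress_port and frame.dst_vif == frame.src_vif:
        return None  # would just loop back to the sending VIF: drop it
    # Hairpin is allowed: the destination VIF differs from the source VIF,
    # even though the egress port equals the ingress port.
    return out


print(egress_port(TaggedFrame(src_vif=1, dst_vif=2), ingress_port="port-1"))
# -> 'port-1' (forwarded back down the same uplink, toward VIF 2)
```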

VN-Tag has been successfully used in production environments for over a year.  If you’re using a Nexus 2000, you’re already using VN-Tag.  VN-Tag is used by the Nexus 2000 Series Switches, the UCS I/O Module (IOM), and the Cisco Virtual Interface Card (VIC.)  The switching for these devices is handled by one of the two VN-Tag capable switches: the Nexus 5000 or the UCS 6100 Fabric Interconnect.  Currently all implementations of VN-Tag use hardware to write the tags.

– Joe Onisick (http://www.definethecloud.net)