fcoe – The Unified Computing Blog

Brocade’s Flawed FCoE “Study”

Disclosures
I do not work for Cisco, Brocade, or any of the companies mentioned here. I do work for a reseller that sells some of these products, but this post (as are all posts on this site) is my opinion only, and does not necessarily reflect the views of my employer or any of the manufacturers listed here. Evaluator Group Inc. did invite me to have a call with them to discuss the study via a tweet sent from the @evaluator_group Twitter account. Mr. Fellows also emailed me to offer a call. After my analysis, I deemed such a call unnecessary.

Ok, with that out of the way…

The “Study”

There were quite a few incredulous tweets floating around this week after Brocade publicized an “independent” study performed by Russ Fellows of Evaluator Group Inc. It was also reviewed by Chris Mellor of The Register, which is how I came to know about it. In the review, Mr. Mellor states that “Brocade’s money was well spent,” though I beg to differ.

As of this posting, the study is still available from the Evaluator Group Inc. website, though I would hope that after some measure of peer review, it will be removed given how deeply flawed it is. As I do not have permission to redistribute the study, I will instead suggest that you get a copy at the above link and follow along.

The stated purpose of the study was to compare traditional Fibre Channel (hereafter FC) against Fibre Channel over Ethernet (hereafter FCoE), specifically as a SCSI transport between blade servers and solid state storage. To reduce equipment requirements, only a single path was designed into the test, unlike a production environment that would have at a minimum two. The report further stated that an attempt would be made to keep the amount of bandwidth available to each scenario equal.

The Tech

The vendor of storage was not disclosed, though it should be fairly irrelevant (with one exception to be noted below). The storage was connected via two 16Gb FC links to a Brocade 6510 switch. The Brocade 6510 is a “top of rack” style traditional FC switch that is not capable of FCoE.

The chosen architecture for the FC test was an HP c7000 blade enclosure containing two blades and an embedded Brocade FC switch. The embedded Brocade switch is connected to the Brocade 6510 via a single 16Gb FC link.

The FCoE test was performed using a Cisco UCS architecture, consisting of a single Fabric Interconnect, connected via 4 10Gb converged Ethernet links to a single blade chassis containing two blades. The Fabric Interconnect is connected to the Brocade 6510 via two 8Gb FC links. As of this writing, the only FC connectivity supported by Cisco UCS is 10Gb FCoE or 1/2/4/8Gb FC.

So what’s the problem?

There are many, many fundamental flaws with the study. I eventually ran out of patience to catalog them individually, so I’m instead going to call out some of the most egregious transgressions.

To start, let’s consider testing methodology. The stated purpose of this test was to evaluate storage connectivity options, narrowed to FC and FCoE. It was not presented as a comparison of server vendors. As such, as many variables as possible should be eliminated to isolate the effects of the protocol and transport. This is the first place that this study breaks down. Why was Cisco UCS chosen? If the effects of protocol and transport are truly the goal of the test, why would the HP c7000 not also be the best choice? There are several ways to achieve FCoE in a c7000, both externally and internally.

The storage in use is connected via two 16Gb FC links. The stated reason for this is that the majority of storage deployments still use FC instead of FCoE, which is certainly true. The selection of the Brocade 6510 is interesting, however, in that Brocade has other switches that would have been capable of supporting FCoE and FC simultaneously. It’s clear that the choice of an FC only switch was designed to force the FCoE traffic to be de-encapsulated before going to the storage. Already we can see that we are not testing FC vs. FCoE, but rather FC natively end to end vs. one hop of FCoE. Even so, the latency and performance impact caused by the encapsulation of the FC protocol into Ethernet is negligible. The storage vendor was not disclosed, and as such, I do not know if it could have also supported FCoE, making for a true end-to-end FCoE test. Despite the study’s claim, end-to-end FCoE is not immature and has been successfully deployed by many customers.

In UCS architecture, all traffic is converged between the blade chassis and the Fabric Interconnect. All switching, management, configuration, etc, occurs within the Fabric Interconnect. The use of four 10Gb Ethernet links between the chassis and Fabric Interconnect is significant overkill given the stated goal of maintaining similar bandwidth between the tests. At worst, two links would have been required to provide each blade with a dedicated 10Gb of bandwidth. Presumably, the decision to go with four was so that the claim could be made that more bandwidth was made available per blade than was available to the 16Gb-capable blades in the HP solution. The study did not disclose the logical configuration of the UCS blades, but the performance data suggests a configuration of a single vHBA per blade. In this configuration, the vHBA would follow a single 10Gb path from the blade to the Fabric Interconnect (via the IO Module), and would in turn be pinned to a single 8Gb FC uplink. Already you can see that regardless of the number of links provided from chassis to Fabric Interconnect, the bottleneck will be the 8Gb FC uplink. The second blade’s vHBA would be pinned (automatically, mind you) to the second 8Gb FC uplink. Essentially in this configuration, each blade has 8Gb of FC bandwidth to the Brocade 6510 switch. The VIC 1240 converged network adapter (CNA) on the blade is capable of 20Gb of bandwidth to each fabric. Creating a second vHBA and allowing the operating system to load balance across the two would have provided more bandwidth. The study mentions the use of a software FCoE initiator as being part of the reason for increased CPU utilization.

We didn’t understand the technology, but…

In the “ease of use” comparison, it was noted that the HP environment was configured in three hours, whereas it took eight hours to configure UCS. The study makes it clear that they did not have the requisite skill to configure UCS and required the support of an outside VAR (who was not named) to complete the configuration. The study also states that the HP was configured without assistance. Clearly the engineering team involved here was skilled in HP and not UCS. How this reflects poorly on the product (and especially FC vs. FCoE – that’s the point, right?) is beyond me. I can personally configure (and have configured) a UCS environment like this in well under an hour. It would probably take me eight hours to perform similar configuration on an HP system, given my lack of hands-on experience in configuring them. This is not a flaw of the HP product, and I wouldn’t penalize it as such. (There are lots of reasons I like UCS over HP c7000, but that’s significantly beyond the scope of this post.)

Many of the “ease of use” characteristics cited reflected an all Brocade environment – similar efficiencies would have existed in an all Cisco environment as well, which the study neglected to test.

A software what?

The study observes a spike in CPU utilization with increased link utilization, which is (incorrectly) attributed to the use of a software FCoE initiator. This one point threw me (and others) off quite a bit, as it is extremely rare to use a software FCoE initiator, and non-existent when FCoE capable hardware is present (such as the VIC 1240 in use here). After a number of confusing tweets from the @evaluator_group Twitter account, it became clear that while they say they were using a software initiator, it was a misunderstanding of the Cisco VIC 1240 – again pointing to a lack of skill and experience with the product. My suspicion is that the spike in CPU utilization (and latency, and corresponding increase in response times) occurred not due to the FCoE protocol, but rather the queuing that was required when the two 8Gb FC links (13.6Gb/s of total bandwidth available, though not aggregated – each vHBA will be pinned to one uplink) became saturated. This is entirely consistent with observed application/storage performance when the links are saturated. This is entirely speculation, however, as the logical configuration of the UCS was not provided. Despite there being similar total bandwidth available, neither server would have been able to burst above 6.8Gb/s, leading to queuing (and the accompanying latency/response impact).
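To make the bottleneck argument concrete, here is a rough Python sketch of the two topologies. It assumes the logical configuration I guessed at above (one vHBA per blade, each pinned to its own 8Gb FC uplink), and it takes 16Gb FC to be a 14.025 Gb/s line rate with 64b/66b encoding. The function and variable names are mine for illustration; nothing here comes from the study or from UCS itself.

```python
# Back-of-the-envelope model of the two test topologies.
# Assumptions (not from the study): one vHBA per UCS blade, each pinned to
# its own 8Gb FC uplink; 16Gb FC runs at 14.025 Gb/s with 64b/66b encoding.

def fc_8b10b_usable_gbps(nominal_gb):
    """Usable Gb/s for 1/2/4/8Gb FC: 1.0625 Gb/s per nominal Gb, 8b/10b encoded."""
    return nominal_gb * 1.0625 * 8 / 10

FC16_USABLE_GBPS = 14.025 * 64 / 66        # about 13.6 Gb/s
FC8_USABLE_GBPS = fc_8b10b_usable_gbps(8)  # about 6.8 Gb/s
TEN_GE_GBPS = 10.0                         # available at the MAC layer

def path_ceiling(*hops_gbps):
    """A path can go no faster than its slowest hop."""
    return min(hops_gbps)

# UCS: blade -> 10Gb FCoE -> Fabric Interconnect -> one pinned 8Gb FC uplink
ucs_blade_burst = path_ceiling(TEN_GE_GBPS, FC8_USABLE_GBPS)
ucs_total = 2 * ucs_blade_burst  # two blades, two uplinks, not aggregated

# HP: blade -> 16Gb FC -> embedded switch -> single shared 16Gb FC link
hp_blade_burst = path_ceiling(FC16_USABLE_GBPS, FC16_USABLE_GBPS)
hp_total = hp_blade_burst  # both blades share the one inter-switch link

print(f"UCS: {ucs_blade_burst:.1f} Gb/s per-blade burst, {ucs_total:.1f} Gb/s total")
print(f"HP:  {hp_blade_burst:.1f} Gb/s per-blade burst, {hp_total:.1f} Gb/s total")
```

Under those assumptions the totals match, but a single busy HP blade can burst to roughly twice what a single UCS blade can, which is where I would expect the queuing to come from.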

Is that all?

I could go on and on with individual points that were wrong, misleading, or poorly designed, but I don’t actually think it’s necessary. Once the real purpose of the test (Brocade vs. Cisco) became clear, every conclusion reached in the FC vs. FCoE discussion (however incorrect) is moot.

If Brocade really wants to fund an FC vs. FCoE study that will stand up to scrutiny, it needs to use the same servers (no details were provided on specific CPUs in use – they could have been wildly different for all we know), the same chassis, and really isolate the protocol as they claimed to do. Here’s the really sad part – Brocade could have proven what they wanted to (that 16Gb FC is faster than 10Gb FCoE) in a fair fight. Take the same HP chassis used for the FC test, and put in an FCoE module (with CNAs on the servers) instead. Connect via FCoE to a Brocade FCoE capable switch, and use FCoE capable storage. Despite the test’s claim, there’s a lot of FCoE storage out there in production – just ask NetApp and EMC. At comparable cable counts, 16Gb FC will be faster than 10Gb FCoE. What a shock, huh? Instead, this extraordinarily flawed “study” has cost Brocade and unfortunately Evaluator Group Inc. a lot of credibility.

I’m not anti-Brocade (though I do prefer MDS for FC switching, which is not news to anyone who knows me), I’m not anti-FC (I still like it a lot, though I think pure FC networks’ days are numbered), I’m just really, really anti-FUD. Compete on tech, compete on features, compete on value, compete on price, compete on whatever it is that makes you different. Just don’t do it in a misleading, dishonest way. Respect your customers enough to know they’ll see through blatant misrepresentations, and respect your products enough to let them compete fairly.
—
Updated: Check out Tony Bourke’s great response here.

FCoE vs. iSCSI vs. NFS

The following was just a short note I wrote in an internal discussion about FCoE vs. iSCSI vs. NFS – and spurred by Tony Bourke’s discussion about methods for implementing FCoE.

This wasn’t intended to be a detailed analysis, just a couple of random musings. Comments as always are welcome.

—–

While NFS and iSCSI are completely different approaches to accessing
storage, they both “suffer” from the same ailment – TCP. Remember folks,
TCP was developed in the 70’s for the express purpose of connecting
disparate networks over long, latent, and likely unreliable links. The
overheads placed onto communication solely to address these criteria
simply aren’t appropriate in the datacenter. We’re talking about a
protocol written to support links slower than your Bluetooth headset. 🙂

iSCSI is a hack, plain and simple. It solves a cost problem, not a
technology one. Even its name is misleading – iSCSI. It isn’t SCSI over
IP – it’s SCSI over TCP over IP. So call it tSCSI or tiSCSI.

I’m not saying they’re not “good enough”, but why do “good enough” now
that “better” is getting much closer in price? On the array side, I
expect more and more vendors to go the NetApp route – all protocols in one
box – just turn on which ones you want to use (via appropriate licensing,
of course). 10G DCB makes this even easier and more attractive – one
port, you pick the protocol you’re comfortable with.

As one of my coworkers points out, FCoE is a bit of a cannon – and for many
customers, their storage challenges are more mosquito-scale.

Fibre Channel was developed with storage in mind as a datacenter protocol.
I haven’t seen one yet I like better for moving SCSI commands around *in
the datacenter*. I’m sure someone will develop a new protocol at some
point that utilizes DCB-specific architectures to replace iSCSI and
FCoE… but why? If you want a high performance, low latency,
made-for-storage protocol, run FC over whatever wire you feel like. If
you want a low-cost solution utilizing commodity
hardware/switching/routing, use iSCSI and/or NFS. I don’t know that
there’s a new problem to solve here.

For customers that already have and know FC, FCoE is a no-brainer.
Nothing new to learn about how to control access, you’re just replacing
the wires. iSCSI and NFS introduce whole new mechanisms and mindsets into
accessing storage if you’re not used to them.

I saw a quote the other day that said that Fibre Channel is like smoking –
if you’re not already doing it, there’s no reason to start now. I get
the sentiment, but I don’t agree. FC as a protocol is the right tool for
a lot of jobs – but it’s not the right tool for every job.

Update on the 8Gb FC vs. 10Gb FCoE Discussion

By far, the most popular post on this blog has been my discussion on the various protocol efficiencies between native 8 Gb/s Fibre Channel and 10 Gb/s Ethernet using Fibre Channel over Ethernet encapsulation. I wrote the original post as much as an exercise in the logic as it was an attempt to educate. I find that if I can’t explain a subject well, I don’t yet understand it. Well, as it’s been pointed out in the comments of that post, there were some things that I missed or just had simply wrong. That’s cool – so let’s take another stab at this. While I may have been wrong on a few points, the original premise still stands – on a per Gb/s basis, 10Gb FCoE is still more efficient than 8Gb FC. In fact, it’s even better than I’d originally contended.

One of the mistakes I made in my original post was to start throwing around numbers without setting any sort of baseline for comparison. Technology vendors have played sleight-of-hand games with units of measure and data rates for years – think of how hard drive manufacturers prefer to define a megabyte (1 million bytes) versus how the rest of the world define[d] a megabyte (2^20 bytes or 1,048,576 bytes).

It’s important that if we’re going to compare the speed of two different network technologies, we establish where we’re taking the measurement. Is it, as with 10GE, measured as bandwidth available at the MAC layer (in other words, after encoding overhead), or, as I perhaps erroneously did with FC, measured at the physical layer (in other words, before encoding overhead)? I also incorrectly stated, unequivocally, that 10GE used 64b/66b encoding, when in fact 10GE can use 8b/10b, 64b/66b, or other encoding mechanisms – what’s important is not what is used at the physical layer, but rather what is available at the MAC layer.

In the case of 10GE, 10Gb/s is available at the MAC layer, regardless of the encoding mechanism, transceivers, etc used at the physical layer.

The Fibre Channel physical layer, on the other hand, sets its targets in terms of MB/s available to the Fibre Channel protocol (FC-2 and above). This is the logical equivalent of Ethernet’s MAC layer – after any encoding overhead. 1Gb Fibre Channel (hereafter FC), as the story goes, was designed to provide a usable data rate of 100 MB/s.

If we’re truly going to take an objective look at the two protocols and how much bandwidth they provide at MAC (or equivalent) and above, we have to pick one method and stick with it. Since the subject is storage-focused (and frankly, most of the objections come from storage folks), let’s agree to use the storage method – measuring in MB/s available to the protocol. As long as we use that measurement, any differences in encoding mechanism become moot.

So back to 1Gb/s FC, with its usable data rate of 100 MB/s. The underlying physical layer of 1Gb/s FC uses a 1.0625 Gb/s data rate, along with 8b/10b encoding.

Now, this is where most of the confusion and debate seems to have crept into the conversation. I’ve been attacked by a number of folks (not on this site) for suggesting that 1Gb FC has a 20% encoding overhead, dismissing it as long-standing FUD – created by whom and for what purpose, I’ve yet to discover. No matter how you slice it, a 1.0625 Gb/s physical layer using 8b/10b encoding results in 0.85 Gb/s available to the next layer – in this case, FC-2. Conveniently enough, as there are 8 bits in a byte, 100MB/s can be achieved over a link providing approximately 800Mb/s, or 0.8Gb/s.

Now, who doesn’t like nice round numbers? Who cares what the underlying physical layer is doing, as long as it meets your needs/requirements/targets at the next layer up?

If the goal is 100MB/s, 1Gb/s FC absolutely meets it. Does 1Gb/s FC have a 20% encoding overhead? Yes. Is that FUD? No. Do we care? Not really.

As each generation of FC was released, the same physical layer was multiplied, without changing the encoding mechanism. So 8Gb/s FC is eight times as fast as 1Gb/s FC. The math is pretty simple : ( 1.0625 * 8 ) * 0.8 = 6.8 Gb/s available to the next layer. Before my storage folks (by the way – my background is storage, not Ethernet) cry foul, let’s look at what 6.8 Gb/s provides in terms of MB/s. A quick check of Google Calculator tells me that 6.8 Gb/s is 870 MB/s – well over the 800 MB/s we’d need if we were looking to maintain the same target of 100MB/s per 1 Gb/s of link. So again, who cares that there’s a 20% encoding overhead? If you’re meeting your target, it doesn’t matter. Normalized per Gb/s, that’s about 108 MB/s for every Gb/s of link speed.
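For anyone who wants to check the arithmetic, here is a small Python sketch of the conversion. One thing it makes visible is that the MB/s figure depends on which prefixes you use: 870 MB/s falls out of binary prefixes (which is my guess at what the calculator did), while decimal prefixes give 850 MB/s. The function names are purely illustrative.

```python
# Where "870 MB/s" comes from: it depends on which prefixes you use.
# Function names are illustrative, not from any library.

def fc_usable_gbps(nominal_gb):
    """1/2/4/8Gb FC: 1.0625 Gb/s line rate per nominal Gb, minus 8b/10b overhead."""
    return nominal_gb * 1.0625 * 8 / 10

def mb_per_s_decimal(gbps):
    """Gb = 10**9 bits, MB = 10**6 bytes."""
    return gbps * 10**9 / 8 / 10**6

def mb_per_s_binary(gbps):
    """Gb = 2**30 bits, MB = 2**20 bytes (assumed to be the calculator's convention)."""
    return gbps * 2**30 / 8 / 2**20

usable = fc_usable_gbps(8)
print(f"usable:  {usable:.2f} Gb/s")
print(f"decimal: {mb_per_s_decimal(usable):.0f} MB/s")
print(f"binary:  {mb_per_s_binary(usable):.0f} MB/s")
print(f"binary, per nominal Gb/s: {mb_per_s_binary(usable) / 8:.1f} MB/s")
```

Either way the result clears the 800 MB/s target, so the conclusion does not hinge on the choice of prefix.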

At this point, you’re probably thinking – if we don’t care, why are you writing this? Well, in a converged network, I don’t really care what the historical target was for a particular protocol or link speed. I care about what I can use.

Given my newly discovered understanding of 10Gb Ethernet, and how it provides 10 Gb/s to the MAC layer, you can already see the difference. At the MAC layer or equivalent, 10GE provides 10Gb/s, or 1,280MB/s. 8G FC provides 6.8Gb/s, or 870MB/s. For the Fibre Channel protocol, native FC requires no additional overhead, while FCoE does require that the native FC frame (2148 bytes, maximum) be encapsulated to traverse an Ethernet MAC layer. This creates a total frame size of 2188 bytes maximum, which is about a 2% overhead incurred by FCoE as compared to native FC. Assuming that the full bandwidth of a 10Gb Ethernet link was being used to carry Fibre Channel protocol, we’re looking at an effective bandwidth of (1280MB/s * .98) = 1254MB/s. Normalized per Gb/s, that’s about 125 MB/s for every Gb/s of link speed.
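The same comparison can be sketched in a few lines of Python. This uses the frame sizes and the 128 MB/s-per-Gb/s convention from the paragraph above; the constant and variable names are my own, and it is only a sanity check on the arithmetic, not a benchmark.

```python
# FCoE encapsulation overhead and per-Gb/s normalization, using the post's figures.
# Units follow the post: 1 Gb/s is taken as 128 MB/s (binary prefixes).

MB_PER_GB = 128  # 2**30 bits / 8 / 2**20 bytes

FC_FRAME_MAX = 2148    # bytes, native FC frame
FCOE_FRAME_MAX = 2188  # bytes, same frame after Ethernet encapsulation

fcoe_efficiency = FC_FRAME_MAX / FCOE_FRAME_MAX  # fraction of each frame that is FC

ten_ge_mb = 10 * MB_PER_GB                    # MB/s at the Ethernet MAC layer
fcoe_usable_mb = ten_ge_mb * fcoe_efficiency  # MB/s left for FC frames
fc8_usable_mb = 8 * 1.0625 * 0.8 * MB_PER_GB  # 8Gb FC after 8b/10b encoding

print(f"FCoE overhead: {(1 - fcoe_efficiency) * 100:.1f}%")
print(f"10Gb FCoE: {fcoe_usable_mb:.0f} MB/s, {fcoe_usable_mb / 10:.1f} MB/s per Gb/s")
print(f"8Gb FC:    {fc8_usable_mb:.0f} MB/s, {fc8_usable_mb / 8:.1f} MB/s per Gb/s")
```

Using the unrounded frame ratio rather than a flat 2% nudges the FCoE figure slightly above the 1254 MB/s quoted above, which only widens the gap.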

The whole idea of FCoE was not to replace traditional FC. It was to provide a single network that can carry any kind of traffic – storage, application, etc, without needing to have protocol-specific adapters, cabling, switching, etc.

Given that VERY few servers will ever utilize 8Gb/s of Fibre Channel bandwidth (regardless of how or where you measure it), why on earth would you invest in that much bandwidth and the cables, HBAs, and switches to support it? Why wouldn’t you look for a solution where you have burst capabilities that meet (or in this case, exceed) any possible expectation you have, while providing flexibility to handle other protocols?

I don’t see traditional FC disappearing any time soon – but I do think its days are numbered at the access layer. Sure, there are niche server cases that will need lots of dedicated storage bandwidth, but the vast majority of servers will be better served by a flexible topology that provides better efficiencies in moving data around the data center. Even at the storage arrays themselves, why wouldn’t I use 10GE FCoE (1254 MB/s usable) instead of 8Gb FC (870 MB/s usable)?

Now, when 16Gb FC hits the market, it will be using 64b/66b encoding. The odd thing, however, is that based on the data I’ve turned up from FCIA, it’s actually only going to be using a line-rate of 14.025 Gb/s, and after encoding overheads, etc, supplying 1600 MB/s usable (though my math shows it to be more like 1700 MB/s) – in keeping with the 1Gb/s = 100MB/s target that FC has maintained since inception.
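Here is the math behind that 1700 MB/s figure as a quick Python sketch, using only the line rate and encoding quoted above. It prints both decimal and binary MB/s since the two conventions give different answers; the variable names are illustrative.

```python
# 16Gb FC: line rate and 64b/66b encoding as quoted in the paragraph above.
LINE_RATE_GBPS = 14.025

usable_gbps = LINE_RATE_GBPS * 64 / 66  # 64 payload bits per 66 bits on the wire

decimal_mb = usable_gbps * 10**9 / 8 / 10**6  # Gb = 10**9 bits, MB = 10**6 bytes
binary_mb = usable_gbps * 2**30 / 8 / 2**20   # Gb = 2**30 bits, MB = 2**20 bytes

print(f"usable:  {usable_gbps:.2f} Gb/s")
print(f"decimal: {decimal_mb:.0f} MB/s")
print(f"binary:  {binary_mb:.0f} MB/s")
```

Both results sit above the 1600 MB/s target, so the "100 MB/s per 1 Gb/s" rule of thumb still holds with room to spare.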

Sometime after 16Gb FC is released will come 40GE, followed by 32Gb FC, and then 100GE. It’s clear that these technologies will continue to leapfrog each other for some time. My only question is, why would you continue to invest in a protocol-specific architecture, when you can instead have a flexible one? Even if you want the isolation of physically separate networks (and there’s still justification for that), why not use the one that’s demonstrably more efficient? FCoE hasn’t yet reached feature parity with FC – there’s no dispute there. It will, and when it does, I just can’t fathom keeping legacy FC around as a physical layer. The protocol is rock solid – I can’t see it disappearing in the foreseeable future. The biggest benefits to FCoE come at the access layer, and we have all the features we need there today.

If you’d like to post a comment, all I ask is that you keep it professional. If you want to challenge my numbers, please, by all means do so – but please provide your math, references for your numbers, and make sure you compare both sides. Simply stating that one side or the other has characteristic X doesn’t help the discussion, nor does it help me or my readers learn if I’m in error.

Finally, for those who have asked (or wondered in silence) – I don’t work for Cisco, or any hardware manufacturer for that matter. My company is a consulting and educational organization focused on data center technologies. I don’t have any particular axe to grind with regards to protocols, vendors, or specific technologies. I blog about the things I find interesting, for the benefit of my colleagues, customers, and ultimately myself. Have a better mousetrap? Excellent. That’s not going to hurt my feelings one bit. 🙂

UCSM 1.4 : Direct attach appliance/storage ports!

One of the most often requested features in the early days of UCS was the ability to directly attach 10GE storage devices (both Ethernet and FCoE based) to the UCS Fabric Interconnects.

Up until UCSM 1.4, only two types of Ethernet port configurations existed in UCS – Server Ports (those connected to IO Modules in the chassis) and Uplink Ports (those connected to the upstream Ethernet switches). As UCS treated all Uplink ports equally, you could not in a supported manner connect an end device such as a storage array or server to those ports. There were, of course, clever customers who found ways to do it – but it wasn’t the “right” or optimal way to do it.

Especially within the SMB market, many customers may not have existing 10G Ethernet infrastructures outside of UCS, or FC switches to connect storage to. For these customers, UCS could often provide a “data center in a box”, with the exception of storage connectivity. For Ethernet-based storage, all storage arrays had to be connected to some external Ethernet switch, while FC arrays had to be connected to a FC switch. Adding a 10G Ethernet or FC switch just for a few ports didn’t make a lot of financial sense, especially if those customers didn’t have any additional need for those devices beyond UCS.

With UCSM 1.4, all of that changes. Of course, the previous method of connecting to upstream Ethernet and FC switches still exists, and will still be the proper topology for many customers. Now, however, a new set of options has been opened.

Take a look at some of the new port types available in UCSM 1.4 :

New in 1.4 are the Appliance, FCoE Storage, Monitoring Ethernet, Monitoring FC, and Storage FC port types.

I’ll cover the Monitoring types in a later post.

On the Ethernet side of things, the Appliance and FCoE Storage allow for the direct connection of Ethernet storage devices to the Fabric Interconnects.

The Appliance port is intended for connecting Ethernet-based storage arrays (such as those serving iSCSI or NFS services) directly to the Fabric Interconnect. If you recall from previous posts, in the default deployment mode (Ethernet Host Virtualizer), UCS selected one Uplink port to accept all broadcast and multicast traffic from the upstream switches. By adding this Appliance port type, you can ensure that any port configured as an Appliance Port will not be selected to receive broadcast/multicast traffic from the Ethernet fabric, as well as providing the ability to configure VLAN support on the port independently of the other Uplink ports.

The FCoE Storage Port type provides similar functionality to the Appliance Port type, while extending FCoE protocol support beyond the Fabric Interconnect. Note that this is not intended for an FCoE connection to another FCF (FCoE Forwarder) such as a Nexus 5000. Only direct connection of FCoE storage devices (such as those produced by NetApp and EMC) is supported. When an Ethernet port is configured as an FCoE Storage Port, traffic is expected to arrive without a VLAN tag. The Ethernet headers will be stripped away and a VSAN tag will be added to the FC frame. Much as the previous FC port configuration was, only one VSAN is supported per FCoE Storage Port. Think of these ports like an Ethernet “access” port – the traffic is expected to arrive un-tagged, and the switching device (in this case, the Fabric Interconnect) will tag the frames with a VSAN to keep track of it internally. When the frames are eventually delivered to the destination (typically the CNA on the blade), the VSAN tag will be removed before delivery. Again, it’s very similar to traffic flowing through a traditional Ethernet switch, access port to access port. Even though both the sending and receiving devices are expecting un-tagged traffic, it’s still tagged internally within the switch while in transit.
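If the access-port analogy is easier to follow in code, here is a toy Python model of it. To be clear, this is only an illustration of the tag-on-ingress, strip-on-egress idea; the class names, the VSAN number, and the payload are all made up, and it says nothing about how the Fabric Interconnect is actually implemented.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Frame:
    payload: str
    vsan: Optional[int] = None  # None means un-tagged, as seen on the wire

class AccessStylePort:
    """Toy 'access'-style port: exactly one VSAN, un-tagged traffic outside."""

    def __init__(self, vsan: int):
        self.vsan = vsan

    def ingress(self, frame: Frame) -> Frame:
        # Frames must arrive un-tagged; the port stamps them with its VSAN.
        if frame.vsan is not None:
            raise ValueError("this port type expects un-tagged frames")
        return Frame(frame.payload, vsan=self.vsan)

    def egress(self, frame: Frame) -> Frame:
        # Only frames from the port's own VSAN leave, and the tag is stripped.
        if frame.vsan != self.vsan:
            raise ValueError("frame belongs to a different VSAN")
        return Frame(frame.payload, vsan=None)

# Hypothetical example: array-facing port and blade-facing side share VSAN 100.
array_port = AccessStylePort(vsan=100)
blade_side = AccessStylePort(vsan=100)

on_wire = Frame("SCSI READ")
inside_fabric = array_port.ingress(on_wire)
delivered = blade_side.egress(inside_fabric)
print(inside_fabric.vsan, delivered.vsan)
```

Neither end device ever sees the tag; it exists only while the frame is in transit inside the switch.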

The Storage FC Port type allows for the direct attachment of a FC storage device to one of the native FC ports on the Fabric Interconnect expansion modules. Like the FCoE Storage Port type, the FC frames arriving on these ports are expected to be un-tagged – so no connection to an MDS FC switch, etc. Each Storage FC Port is assigned a VSAN number to keep the traffic separated within the UCS Unified Fabric. When used in this way, the Fabric Interconnect is not providing any FC zoning configuration capabilities – all devices within a particular VSAN will be allowed, at least at the FC switching layer (FC2), to communicate with each other. The expectation is that the devices themselves, through techniques such as LUN Masking, etc, will provide the access control. This is acceptable for small implementations, but does not scale well for larger or more enterprise-like configurations. In those situations, an external FC switch should be used either for connectivity or to provide zoning information – the so-called “hybrid model”. I’ll cover the hybrid model in a later post.
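A toy Python sketch of what "no zoning" means in practice: at the fabric level the only question is whether two devices share a VSAN, and everything finer-grained is left to the array's LUN masking. Every device name, VSAN number, and LUN ID below is invented for illustration.

```python
# Toy model: without zoning, VSAN membership is the only fabric-level check;
# the array's LUN masking decides what an initiator may actually use.
# All device names, VSAN numbers, and LUN IDs below are made up.

vsan_membership = {
    "blade1-hba": 100,
    "blade2-hba": 100,
    "array-port": 100,
    "other-array": 200,
}

lun_masking = {  # configured on the array, not on the Fabric Interconnect
    "blade1-hba": {0, 1},
    "blade2-hba": {2},
}

def fabric_allows(a, b):
    """With no zoning, any two devices in the same VSAN can talk."""
    return vsan_membership[a] == vsan_membership[b]

def array_allows(initiator, lun):
    """LUN masking on the array is the real access control."""
    return lun in lun_masking.get(initiator, set())

print(fabric_allows("blade1-hba", "array-port"))   # same VSAN
print(fabric_allows("blade1-hba", "blade2-hba"))   # also same VSAN, nothing blocks it
print(fabric_allows("blade1-hba", "other-array"))  # different VSAN
print(array_allows("blade2-hba", 0))               # masked off by the array
```

With a handful of devices that is manageable; with hundreds, relying on every array to get its masking right is exactly why an external FC switch with zoning starts to look attractive.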

Why doesn’t Cisco…?

I get asked a lot why Cisco doesn’t have feature X, or support hardware Y in their UCS product line. A recent discussion with a coworker reminded me that lots of those questions are out there, so I might as well give my opinion on them.

Disclaimer : I don’t work for Cisco, I don’t speak for Cisco, these are just my random musings about the various questions I hear.

Why doesn’t Cisco have non-Intel blades, like AMD or RISC-type architectures? Are they going to in the future?

As of today, Intel processors (the Xeon 5500/5600, 6500/7500 families) represent the core (pun intended) of the x86 processor market. Sure, even Intel has other lines (Atom, for one), and AMD still makes competitive processors, but most benchmarks and analysts (except for those employed by other vendors) agree that Intel is the current king. AMD has leapfrogged Intel in the past, and may do so again in the future, but for right now – Intel is where it’s at.

If you look at this from a cost-to-engineer perspective, it starts to make sense. It will cost Cisco just as much to develop an AMD-based blade as it does for the more popular and common Intel processors. Cisco may be losing business to customers that prefer AMD, but until they’ve run out of customers on the Intel side of things, it just doesn’t make financial sense to attack the AMD space as well.

As for RISC/Unix type architectures (really, any non-x86 platform), whose chip would they use? HP? Not likely. IBM? Again, why support a competitor’s architecture – especially one as proprietary as IBM’s? (Side note – I’m a really big fan of IBM AIX systems, just not in the “blade” market.) Roll their own? Why bother? It’s still a question of return on investment. Even if Cisco could convince customers to abandon their existing proprietary architectures for a Cisco proprietary processor, how much business do you really think they’d do? Nowhere near enough to justify the development cost.

Why doesn’t Cisco have Infiniband adapters for their blades? What about the rack-mount servers?

One of the key concepts in UCS is the unified fabric, using only Ethernet as the chassis-to-Fabric Interconnect topology. By eliminating protocol-specific cabling (Fibre Channel, Infiniband, etc), the overall complexity of the environment is reduced and the bandwidth is flexibly allocated between upper (above Ethernet) layer protocols. Instead of having separate cabling and modules for different protocols (a la legacy blade architectures), any protocol needed is encapsulated over Ethernet. Fibre Channel over Ethernet (FCoE) is the first such implementation in UCS, but certainly won’t be the last.

Infiniband as a protocol has a number of compelling features for certain applications, so I’d definitely see Cisco supporting RDMA over Converged Ethernet (RoCE) in the future. RoCE does for Infiniband what FCoE does for Fibre Channel. The underlying transport is replaced with Ethernet, while keeping the protocol intact. Proponents of Infiniband will point to the transport’s legendary latency characteristics, specifically low and predictable. The UCS unified fabric architecture provides just such an environment – low, predictable latency that’s consistent in both inter- and intra-chassis applications.

As for the rack-mount servers, there’s nothing stopping customers from purchasing and installing their own PCI Infiniband adapters. Cisco isn’t producing one, and won’t directly support it – but rather treats it as a 3rd party device to be supported by that manufacturer.

What about embedded hypervisors?

Another key feature of UCS is that the blades themselves are stateless, at least in theory. No identity (MACs, WWNs, UUIDs, etc), no personality (boot order, BIOS configuration) until one is assigned by the management architecture. Were the blades to have an embedded hypervisor, that statelessness would be lost. Even though it’s potentially a very small amount of stateful data (IP address, etc), it’s still there. Of the questions in my list, this is probably the feature most likely to be supported eventually. My expectation is that at some point in the future, the UCS Manager will be able to “push” an embedded hypervisor, along with its configuration, to the blade along with the service profile. By making UCS Manager the true stateful owner of the configuration data, having a “working copy” on the blade becomes less of an issue.

Final thoughts…

I’ve used this analogy in the past, so I’ll repeat it here. I look at UCS as sort of the Macintosh of the server world. It’s a closely controlled set of hardware in order to provide the best possible user experience, at the cost of not supporting some edge-case configurations or feature sets. No, you can’t have Infiniband, or GPUs on the blade, or embedded hypervisors. The fact is that the majority of data center workloads don’t need these features. If you need those features, there are plenty of vendors that provide them. If you want a single vendor for all your servers – regardless of edge-case requirements – there are certainly vendors that provide that (HP, IBM, etc). In my opinion, though, it’s the breadth of those product offerings that makes those solutions less attractive. In accommodating every possible use case, you end up with a very complex architecture. Cisco UCS is streamlined to provide the best possible experience for the bulk of data center workloads. Cisco doesn’t need to be, or want to be as near as I can tell, an “everything to everybody” solution. Pick something you can do really, really well and do it better than anyone else. Let the “other guys” work on the edge cases. Yes – that will cost Cisco some business. Believe it or not, despite what the rhetoric on Twitter would have you believe, there’s enough business out there for all of these server vendors. Cisco, even if they’re wildly successful in replacing legacy servers with UCS, isn’t going to run HP or IBM or Dell out of business. They don’t need to. They can make a lot of money, and make a lot of customers very happy, co-existing in the marketplace with these vendors. Cisco provides yet another choice. If it doesn’t meet your needs, don’t buy it. 🙂

No offense or disrespect is intended to my HP and IBM colleagues. You guys make cool gear too, you’re just solving the problems in a different way. Which way is “best”? Well, now, that really comes down to the specific customer, doesn’t it?

8Gb Fibre Channel or 10Gb Ethernet w/ FCoE?

Update 2011/01/31 – I’ve added new thoughts and comments on the subject here: http://www.unifiedcomputingblog.com/?p=234

Which is better? Which is faster?

I’ve been stuck on this one for a while. I’m traditionally a pure Fibre Channel kind of guy, so I’ve been pretty convinced that traditional FC was here to stay for a while, and that FCoE – as much as I believe in the technology – would probably be limited to the access and aggregation layers for the near term. That is, until someone pointed out to me the encoding mechanisms used by these two technologies and the effective data rates they allow. I’m not sure why it never occurred to me before, but it hit me like the proverbial ton of bricks this week.

First off, a quick review of encoding mechanisms. Any time we’re transmitting or storing data, we encode it in some form or another. Generally, this adds some redundancy so that the data can be recovered reliably and errors in reading or receiving it can be detected. I remember the good old days when I discovered that RLL hard drive encoding was 50% more efficient than MFM encoding, and with just a new controller and a low level format, my old 10MB drive (yes, that’s ten whopping megabytes, kids. Ask your parents what a megabyte was.) suddenly became 15MB! Well, we’re about to embark on a similar discovery.

1, 2, 4, and 8 Gb Fibre Channel all use 8b/10b encoding. That means every 8 bits of data get encoded into 10 bits of transmitted information. The two extra bits are encoding overhead: they keep the signal balanced, help the receiver recover the clock, and allow some errors to be detected. So if the link is 8Gb, how much do we actually get to use for data, given that 2 out of every 10 bits aren’t “user” data? FC link speeds are somewhat of an anomaly, in that they’re actually faster than the stated link speed would suggest. Original 1Gb FC actually runs at 1.0625Gb/s, and each generation through 8Gb has kept this base rate and multiplied it. 8Gb FC is therefore 8 × 1.0625, or an actual line rate of 8.5Gb/s. 8.5 × 0.80 = 6.8, so we get 6.8Gb/s of usable bandwidth on an 8Gb FC link.
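As a rough sketch of that arithmetic, here is the same calculation for each 8b/10b generation. The function and constant names are mine and purely illustrative; the only inputs are the 1.0625Gb/s base rate and the 8b/10b ratio described above.

```python
# Usable bandwidth of 8b/10b-encoded Fibre Channel links.
# Names here are illustrative only; the figures come from the text above.

FC_BASE_RATE_GBPS = 1.0625    # actual line rate of original "1Gb" FC
EFFICIENCY_8B10B = 8 / 10     # 8 data bits carried in every 10 line bits


def fc_usable_gbps(generation: int) -> float:
    """Return usable Gb/s for a 1, 2, 4, or 8 Gb FC link."""
    if generation not in (1, 2, 4, 8):
        raise ValueError("only 1, 2, 4, and 8 Gb FC use 8b/10b encoding")
    line_rate = generation * FC_BASE_RATE_GBPS
    return line_rate * EFFICIENCY_8B10B


for gen in (1, 2, 4, 8):
    print(f"{gen}Gb FC: {fc_usable_gbps(gen):.2f}Gb/s usable")
```

The last line printed is the 6.8Gb/s figure worked out by hand above.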

10GE (and 10G FC, for that matter) uses 64b/66b encoding. For every 64 bits of data, only 2 bits of encoding overhead are added. While theoretically this lowers the overall protection of the data, and increases the amount of data discarded in case of failure, the actual number of data units discarded due to serialization/deserialization failures is minuscule. Taking the nominal 10Gb/s as the line rate, 64b/66b encoding leaves 96.97% of the bandwidth for user data, or about 9.7Gb/s.

So 8Gb FC = 6.8Gb/s usable, while 10Gb Ethernet = 9.7Gb/s usable. Even if I were able to use all of the bandwidth available on an 8Gb FC port (which is very unlikely at the server access layer), with 10GE running FCoE I’d still have about 2.9Gb/s left over, room for nearly three Gigabit Ethernet-class “lanes”. How’s that for consolidation?
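Here is a minimal sketch of that comparison. It makes the same assumption as the math above, namely that the nominal 10Gb/s is the raw line rate that 64b/66b is applied to, and it ignores FCoE encapsulation overhead. The helper name is hypothetical.

```python
# Compare usable bandwidth: 8Gb FC (8b/10b) vs. 10GE (64b/66b).
# Assumption: 10Gb/s is treated as the raw 10GE line rate, as in the
# text above. FCoE encapsulation overhead is not modeled.


def usable_gbps(line_rate_gbps: float, data_bits: int, coded_bits: int) -> float:
    """Scale a line rate by the encoding ratio data_bits/coded_bits."""
    return line_rate_gbps * data_bits / coded_bits


fc8 = usable_gbps(8 * 1.0625, 8, 10)   # 8.5Gb/s line rate, 8b/10b
ge10 = usable_gbps(10.0, 64, 66)       # nominal 10Gb/s, 64b/66b
headroom = ge10 - fc8                  # what is left after a full 8Gb FC load

print(f"8Gb FC usable: {fc8:.1f}Gb/s")
print(f"10GE usable:   {ge10:.1f}Gb/s")
print(f"Headroom:      {headroom:.1f}Gb/s")
```

The headroom figure is what would remain for ordinary Ethernet traffic on the shared 10GE link, under the assumptions stated.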

10Gb FC has the same usable bandwidth, and without the overhead (albeit a small 2% or so) of FCoE, but you don’t get the consolidation benefits of using the same physical link for your storage and traditional Ethernet traffic.

I’m sold.

UCS Books

I often get asked by students if there are any other resources (beyond those I’m linking to) to get more information on UCS.

Silvano Gai, along with Tommi Salli and Roger Andersson, published a book back in March of 2009 that includes a great history of the drivers and goals behind the design of UCS. While some of the material is, of course, outdated by now, and much of the terminology has changed and evolved, it still represents an excellent view into how the architecture was conceived and designed. Definitely recommended if you’re planning on doing anything with UCS.

You can pick up Project California: a Data Center Virtualization Server – UCS (Unified Computing System) at Amazon.

While you’re at it, also check out Silvano’s book IO Consolidation in the Data Center (namely Fibre Channel over Ethernet – FCoE).