UCS with Disjoint L2 Domains

How do we deal with disjoint L2 domains in UCS?

To start, what's a disjoint L2 domain? This is where you have two Ethernet "clouds" that never connect, but must be accessed by the same UCS Fabric Interconnect. Take, for example, a multi-tenant scenario where we have multiple customers' servers within the same UCS cluster that must access different L2 domains.

How do we ensure that all traffic from Customer A’s blade only goes to their cloud, while Customer B’s blades only connect to their cloud?

The immediately obvious answer is to use UCS pin groups to tie each customer's interfaces (through their vNIC configuration) to the uplinks that go to their cloud. Unfortunately, this only solves half of the problem.

In the default operational mode of the Fabric Interconnects (called Ethernet Host Virtualizer, sometimes called End Host Virtualizer), only one uplink is used to receive multicast or broadcast traffic.   EHV mode assumes a single L2 fabric on the uplinks (VLAN considerations notwithstanding).  So in this example, only broadcasts or multicasts from one of the two fabrics would be accepted.   Obviously, this is a problem.

Within UCS itself, the only way to get around this is to put the Fabric Interconnects into Ethernet Switching mode. This causes the Fabric Interconnect to behave as a standard L2 switch, spanning tree considerations included. Now uplinks can receive broadcasts and multicasts regardless of which fabric they connect to. This does, however, increase the administrative overhead of the Fabric Interconnects and reduces your flexibility in uplink configuration, since you must now port-channel all uplinks into the same L2 domain in order to use their aggregate bandwidth.
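
To make the two modes concrete, here's a minimal Python sketch of the forwarding rules just described. Everything in it is illustrative (the class, the port names, and the single "elected" broadcast uplink are simplifications, not actual UCS code); it just models why pinning solves the egress half of the problem while broadcast reception stays tied to one uplink in EHV mode:

```python
# Toy model of the behavior described above, not actual UCS code.
# Port names and the "elected listener" attribute are made up for clarity.

class FabricInterconnect:
    def __init__(self, uplinks, mode="EHV"):
        self.uplinks = uplinks        # e.g. one uplink per customer cloud
        self.mode = mode              # "EHV" or "switching"
        self.pin_groups = {}          # vNIC -> uplink (egress pinning)
        # In EHV mode, a single uplink is chosen to receive all
        # broadcast/multicast traffic for the fabric.
        self.bcast_listener = uplinks[0]

    def pin(self, vnic, uplink):
        """Pin a vNIC's egress to one uplink; this solves only the egress half."""
        self.pin_groups[vnic] = uplink

    def accepts_broadcast(self, arriving_uplink):
        if self.mode == "switching":
            return True               # all uplinks forward; STP handles loops
        return arriving_uplink == self.bcast_listener

fi = FabricInterconnect(["uplink-custA", "uplink-custB"])
fi.pin("vnic-custA", "uplink-custA")          # Customer A stays on their cloud
fi.pin("vnic-custB", "uplink-custB")          # Customer B stays on theirs
print(fi.accepts_broadcast("uplink-custA"))   # True: the elected listener
print(fi.accepts_broadcast("uplink-custB"))   # False: B's broadcasts are dropped
```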

To me, a more ideal setup is to leave the Fabric Interconnects in EHV mode and use another L2 switch upstream to perform the split between the fabrics.

This configuration allows the Fabric Interconnect to remain in EHV mode and has the upstream L2 switches performing the split between the L2 domains.  ACLs can be configured on the L2 switches as necessary to isolate the networks, something that cannot be done on the Fabric Interconnect regardless of mode.

Both of these scenarios assume that the two customer L2 clouds use different VLAN numbering, since UCS has no way to distinguish between the same VLAN number arriving from two different fabrics. There are certainly L3 and other translation tricks that you could use to accommodate overlapping numbering, but that's an entirely different post. 🙂
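
To illustrate that constraint, here's a toy sketch: think of the Fabric Interconnect as keeping one global VLAN table, so a given VLAN ID can belong to only one L2 domain at a time. The function and names below are entirely hypothetical:

```python
# Illustrative only: a single global VLAN table means a VLAN ID can map
# to exactly one L2 domain, so overlapping customer numbering collides.

vlan_table = {}

def define_vlan(vlan_id, domain):
    if vlan_id in vlan_table and vlan_table[vlan_id] != domain:
        raise ValueError(
            f"VLAN {vlan_id} already belongs to {vlan_table[vlan_id]}; "
            "the same VLAN ID can't represent two different fabrics"
        )
    vlan_table[vlan_id] = domain

define_vlan(100, "customer-A-cloud")
define_vlan(200, "customer-B-cloud")
try:
    define_vlan(100, "customer-B-cloud")   # collides with Customer A's VLAN 100
except ValueError as err:
    print(err)
```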

Panduit Ships 7 Meter SFP+ Copper Cables!

Panduit is now shipping their 7 meter SFP+ copper twinax cables. They're not officially on the Cisco compatibility list yet, but once they are, this opens up some additional UCS expansion options. The jump from 5 meters to 7 meters may not seem like a lot, but that's another rack and a couple more chassis… and at eight half-width blades per chassis, who couldn't use another 32 or 48 blades while still keeping the cabling infrastructure cheap?

My understanding is that the cables work just fine in UCS and Nexus 5000 configurations, but aren’t yet officially supported by Cisco.

MAC forwarding table aging on UCS 6100 Fabric Interconnects

I was recently forwarded some information on the MAC table aging process in the UCS 6100 Fabric Interconnects that I thought was very valuable to share.

Prior to this, I was under the impression (and various documentation had confirmed) that the Fabric Interconnect never ages MAC addresses – in other words, it knows where all the MAC addresses within the chassis/blades live, and therefore has no need to age out addresses. In the preferred Ethernet Host Virtualizer mode, it also doesn't learn any addresses from the uplinks, so again, no need to age a MAC address.

So what about VMware and the virtual MAC addresses that live behind the physical NICs on the blades?

Well, as it turns out, the Fabric Interconnects do age addresses, just not those assigned by UCS Manager to a physical NIC (or a vNIC on a Virtual Interface Card – aka Palo).

On UCS code releases prior to 1.1, learned addresses age out in 7200 seconds (120 minutes), and the timer is not configurable.

On UCS code releases of 1.1 and later, learned addresses age out in 7200 seconds (120 minutes) by default, but the timeout can be adjusted in the LAN Uplinks Manager within UCS Manager.

Why do we care? Well, if a VM (from which we've learned an address) goes silent for whatever reason, we may end up purging its address from the forwarding table after 120 minutes… which means it's unreachable from the outside world, since any frame arriving on an uplink destined to an unknown unicast MAC address is dropped. Only when the VM generates some outbound traffic will the address be re-learned and traffic for it accepted on the uplinks again.
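
As a rough illustration of that behavior (a sketch, not actual UCS code), here's a forwarding table where UCSM-assigned addresses are static and learned addresses expire after the 7200-second default:

```python
import time

# Minimal sketch of the aging behavior described above: learned (dynamic)
# entries expire after the timeout, while addresses assigned by UCS Manager
# to physical NICs or Palo vNICs are static and never age. The class and
# its method names are made up for illustration.

class MacTable:
    def __init__(self, aging_seconds=7200):     # the 120-minute default
        self.aging = aging_seconds
        self.static = {}                        # UCSM-assigned: never aged
        self.dynamic = {}                       # learned: mac -> (port, last_seen)

    def learn(self, mac, port):
        self.dynamic[mac] = (port, time.time())

    def lookup(self, mac):
        if mac in self.static:
            return self.static[mac]
        entry = self.dynamic.get(mac)
        if entry and time.time() - entry[1] < self.aging:
            return entry[0]
        # Aged out or never learned: in EHV mode, a frame arriving on an
        # uplink for an unknown unicast MAC is dropped, so a silent VM is
        # unreachable from outside until it transmits again.
        self.dynamic.pop(mac, None)
        return None

table = MacTable()
table.learn("00:50:56:aa:bb:cc", "veth-1001")   # a VM's virtual MAC
```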

So if you have silent VMs and have trouble reaching them from the outside world, you’ll want to upgrade to the latest UCS code release and adjust the MAC aging timeout to something very high (or possibly never).

What’s the difference between VN-Link and VN-Tag?

This is a question that constantly comes up in the classes and discussions that I’m involved in.

Part of the problem is that Cisco’s own documentation tends towards the marketing and less towards the technology.  Because of that, even a lot of Cisco folks are confused on exactly what is meant by each term.  Hopefully, this short summary will help.

VN-Link is a marketing umbrella term used by Cisco to describe any number of approaches to providing physical-network-style visibility and control for devices that aren't directly attached physical ports. That could mean virtual machines, virtual interfaces on a remote interface card (a la Cisco's Virtual Interface Card – aka Palo), or physical interfaces on a non-switching remote device such as the Nexus 2000-series.

Probably the easiest way to group these is by the existence of a “Virtual Ethernet Port” on a switching device that is controlled like a physical port, but doesn’t directly map to a local physical switch port.

There are two approaches currently in use that fall under the VN-Link umbrella.

The Cisco Nexus 1000V switch, which is a software-only Cisco-branded switch that rides on top of VMware’s vNetwork Distributed Switch (DVS), is considered VN-Link because it provides virtual machine level visibility and granularity in network configuration and control.  Each virtual machine receives a “virtual Ethernet port” on the 1000V, which can be configured and controlled just like a physical Ethernet port would on a standard switch.
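
As a rough illustration (with entirely made-up names), the idea is that the 1000V keeps per-VM port state just as a physical switch keeps per-port state, so network policy attaches to the VM rather than to the host's physical NIC:

```python
from dataclasses import dataclass, field

# Illustrative sketch only: each VM gets a virtual Ethernet port whose
# configuration looks like that of a physical switch port.

@dataclass
class VethPort:
    vm_name: str
    vlan: int
    acls: list = field(default_factory=list)  # per-port policy, as on a physical port
    enabled: bool = True

# The "switch" tracks one veth per VM, so configuration and statistics are
# per-VM, and the policy can follow the VM as it moves between hosts.
veth_ports = {
    "veth1": VethPort("web-vm-01", vlan=100),
    "veth2": VethPort("db-vm-01", vlan=200, acls=["deny ip any 10.0.0.0/8"]),
}
```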

The Nexus 5000/2000 combination and the Cisco UCS Fabric Interconnect/IO Module combination both use an additional header in the Ethernet frame called VN-Tag, which uniquely identifies a remote port that will receive a virtual Ethernet port on the local switch (50xx or 61xx). This lets the Nexus 2000 or UCS IO Module act as a remote line card to the host device, so it doesn't have to be managed individually. All switching happens in the host device (50xx or 61xx). The same VN-Tag technology is used by the Cisco Virtual Interface Card (VIC, or Palo) to identify the virtual interfaces the card supports. With this tag added to the Ethernet frame, the host device (50xx or 61xx) can uniquely identify the source port (virtual or physical) and apply policy or configuration to it.
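
For the curious, here's a rough Python sketch of what the tag itself looks like. The field layout follows the VNTag format Cisco has published (a 0x8926 EtherType followed by four bytes of tag fields, inserted after the source MAC), but treat the exact widths and offsets as illustrative rather than authoritative:

```python
import struct

# Sketch of the 6-byte VN-Tag: 2-byte EtherType plus a 32-bit word holding
# direction (1 bit), pointer (1 bit), destination vif_id (14 bits), looped
# (1 bit), a reserved bit, version (2 bits), and source vif_id (12 bits).

VNTAG_ETHERTYPE = 0x8926

def pack_vntag(direction, pointer, dst_vif, looped, version, src_vif):
    word = (direction & 0x1) << 31
    word |= (pointer & 0x1) << 30
    word |= (dst_vif & 0x3FFF) << 16
    word |= (looped & 0x1) << 15
    # bit 14 is reserved
    word |= (version & 0x3) << 12
    word |= src_vif & 0xFFF
    return struct.pack("!HI", VNTAG_ETHERTYPE, word)

def unpack_vntag(tag):
    ethertype, word = struct.unpack("!HI", tag)
    assert ethertype == VNTAG_ETHERTYPE
    return {
        "direction": word >> 31 & 0x1,     # host-to-switch vs switch-to-host
        "dst_vif":   word >> 16 & 0x3FFF,  # which virtual interface receives it
        "src_vif":   word & 0xFFF,         # which virtual interface sent it
    }

# The 50xx/61xx maps src_vif to a virtual Ethernet port and applies that
# port's policy, just as it would for a physical port.
tag = pack_vntag(0, 0, dst_vif=0, looped=0, version=0, src_vif=42)
print(unpack_vntag(tag))   # {'direction': 0, 'dst_vif': 0, 'src_vif': 42}
```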

Note that the Nexus 1000V does not perform VN-Tagging – it simply doesn't need to in order to meet its objective of providing VM-level visibility and control.

So both of these approaches meet the same architectural goal, while doing so with very different technologies.  Even so, they both fall under the same VN-Link umbrella.  Don’t confuse VN-Link, the “goal”, with the implementation.

A real deployment’s experience with adding hardware to UCS

One of the things we talk about in Cisco UCS is how easy it is to expand the architecture. Each time you add a chassis in competing solutions, you have lots of management points to configure – the chassis itself, the Ethernet and potentially Fibre Channel switching/connectivity, etc. With UCS, it's simply a matter of "telling" the Fabric Interconnects that the chassis is there, and the rest happens auto-magically. At HealthITGuy's Blog, Michael Heil posts his experience adding a new chassis to a running UCS deployment. The rest of his posts on UCS, written from a customer's perspective, are also excellent.