Monday, June 10, 2013

100 Gigabit Ethernet at Penn

This summer, the University of Pennsylvania is upgrading its campus core routing equipment (in fact we're in the midst of this upgrade right now). This is basically an upgrade to the set of large routers that form the center of our network.

The current core topology consists of 5 core routers (and also 2 border routers) interconnected by two independent layer-2 switched 10 Gigabit Ethernet networks. Each of the core routers is located in one of five geographically distributed machine rooms across the campus. A rough diagram is shown below.

This diagram also shows the current external connections to/from the campus network - we have three links (each 10 Gigabit Ethernet) to Internet Service Providers (ISPs). And two connections to MAGPI (the regional Internet2 GigaPoP operated by us), via which we access a 10 Gigabit Ethernet connection to Internet2. The Internet2 connection is shared amongst Penn and other MAGPI customers, which are mostly Research & Education institutions in the geographic area.

The core interconnect is being upgraded to 100 Gigabit Ethernet (a ten-fold increase in link bandwidth). It would be cost prohibitive to fully replicate the current design in 100 Gig (since this equipment is still very expensive) so the interconnect design has been adjusted a bit. Instead of two layer-2 switch fabrics interconnecting the routers, we are deploying the core routers connected in a 100 Gig ring (see the diagram below). When the final design is fully implemented, each core router will have a 10 Gig connection into each of the border routers (this will require some additional upgrades to the border routers, which are expected to happen later this year). The topology redesign has fewer links, and in the final count (summing the bandwidth of all the core facing links), the new core will have about 5 times the aggregate bandwidth of the old one. The maximum (shortest path) edge to edge diameter of the network increases by one routing hop.

100 Gigabit Ethernet is today's state of the art in transmission speed. The next jump up will likely be 400 Gigabit Ethernet for which the IEEE already has a study group launched and several preliminary designs under consideration.

Not depicted in this diagram is the rest of the network towards the end systems. Below the layer of core routers, are smaller routers located at the 200 or so buildings scattered around campus. Each building router is connected to two of the core routers. The building routers feed wiring closets inside the building, which house layer-2 switches that network wallplates are connected to.

In the process of the upgrade, we are also changing router vendors. The current core routers, Cisco 7609 series routers with Sup7203BXL supervisor engines, have served us well. They were originally deployed in the summer of 2005, and have been in operation well past their expected lifetime.

As is our practice, we issued an RFI/P (Request for Information/Purchase) detailing our technical requirements for the next generation routers and solicited responses from the usual suspects, selecting a few vendors whose equipment we bring in for lab testing, followed by a selection.

The product we've selected is the Brocade MLXe series router, specifically the MLXe-16 - this router can support 16 half height, or 8 full height (or a mixture of full/half) line cards, as well as redundant management and switch fabric modules.

A product description of the MLXe series is available at:

The photo below is one of the routers (prior to deployment) in the Vagelos node room (one of 5 machine rooms distributed around campus where we house critical networking equipment and servers).Going from left to right, this chassis has two management modules, one 2-port 100 Gigabit Ethernet card, six 8-port 10 Gigabit Ethernet cards, four switch fabric modules, two more 8-port 10 Gigabit Ethernet cards, and three 24-port Gigabit Ethernet cards.

One of these routers was deployed in production last week. The rest should be up and running by the end of this month or by early July.

(The full set of photos can be seen here on Google Plus)

Shown below is the 2-port 100 Gigabit Ethernet card, partially inserted into the chassis, showing the CFP optical transceiver modules attached.

Unlike preceding generations of ethernet, with 100 Gigabit Ethernet, the transmission technology uses multiple wavelengths in parallel (although there are parallel fiber implementations also). The current IEEE specifications (802.3ba) specify four lanes of 25Gbps. However a number of key vendors in the industry, including Brocade, formed the MSA (Multi Source Agreement) and designed and built a 10x10 (10 lanes of 10Gbps) mechanism of doing 100 Gig, at much lower cost than 4x25Gbps, operating over single mode fiber at distances of 2, 4, or 10km. This is called LR-10 and uses the CFP (C Form factor pluggable) media type.

Pictured below (left) is a Brocade LR10 100 Gigabit Ethernet CFP optical module installed in 100 Gig line card with a single mode fiber connection (LC). On the right is an LR10 CFP module taken out of the router.

Close up of the 8-port 10 Gigabit Ethernet module, and several 24-port Gigabit Ethernet modules. To connect cables to them, we need to install small form factor pluggable transceivers into them, SFP+ for the 10 gig, and SFP for the 1 gig.

Pictured below is one of the five Cisco 7609 routers that will be replaced.

One of the Penn campus border routers, a Juniper M120, is shown below. This is also scheduled to be upgraded in the near future to accommodate 100 Gig and higher density 10 Gig, although the product has not yet been selected/finalized.

Below: A Ciena dense wavelength division multiplexer (DWDM). Penn uses leased metropolitan fiber to reach equipment and carriers in 401 North Broad street, the major carrier hotel in downtown Philadelphia. We have DWDM equipment placed both at the campus and the carrier hotel to carry a mixture of 1 and 10 Gig circuits and connections across this fiber for various purposes. This equipment is also scheduled to be upgraded to allow us to provision 100 Gigabit Ethernet wavelengths between the campus and the carrier hotel.

High Performance Networking for Researchers

Penn is a participant in the National Science Foundation (NSF) funded DYNES (Dynamic Network Systems) project, which provides high bandwidth dedicated point to point circuits between (typically) research labs for specialized applications. Popular uses of this infrastructure today include high energy physics researchers obtaining data from the LHC and other particle accelerator labs, and various NSF GENI network research projects.

Earlier this year, we completed a grant application for the  NSF "Campus Cyberinfrastructure - Network Infrastructure and Engineering (CC-NIE)" program. I spent a large amount of time in March of this year with several Penn colleagues in preparing the application. If Penn does win an award (we'll find out later this year), we will be deploying additional dedicated network infrastructure for campus researchers, bypassing the campus core and with 100 Gbps connectivity out to the Internet2 R&E network. A rough diagram of how this will look is below.

Software Defined Networking

There's a huge amount of buzz about Software Defined Networking (SDN) in the networking industry today, and a number of universities are investigating SDN enabled equipment for deployment in their networks. Of the big router vendors, Brocade does appear to have one of the better SDN/openflow stories thus far. The MLXe series already supports an early version of Openflow (the portion of SDN that allows forwarding tables of switches/routers to be programmed by an external SDN controller).

Penn is building an SDN testbed in our network engineering lab, primarily to investigate its capabilities. For us, SDN is still largely a solution in search of a problem. We run a very simple network by design, whose primary purpose is connectivity and high performance packet delivery. The most probable use case in our future, virtualization of the network, is likely better achieved with a proven technology like MPLS first. But we'll keep an eye on SDN and its evolution. We do want to support research uses of SDN though. Several faculty members in the Computer Science department are interested in SDN, and the NSF CC-NIE grant will allow us to build some SDN enabled network infrastructure separate from the core production network to accommodate their work.

-- Shumon Huque