Achieving carrier-grade software defined networks
By Geoff Bennett, of Infinera
Writing anything about software defined networking (SDN) is guaranteed to get attention these days. It’s a hot topic – not least because it offers a genuine way to break the established players’ domination of the Ethernet switch and core router markets, and to open up new possibilities for network functions virtualisation for network operators.
What is SDN?
The conventional (pre-SDN) approach to building networks is to interconnect switches or routers that forward packets to their destination through the network infrastructure plane. The infrastructure plane pathways are set up using distributed control plane protocols, such as OSPF in the IP world or the GMPLS family of protocols in the transport network world. Historically, the philosophy behind distributing control plane functions has been to make the network more scalable (by dividing up the work) and more autonomous in the face of network failures, because each node knows enough about the network to route traffic around failed links.
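The "each node routes around failures on its own" idea can be made concrete with a small sketch. In a link-state protocol such as OSPF, every node holds a copy of the topology database and runs a shortest-path-first computation locally; when a link fails, it simply recomputes. The four-node topology and costs below are purely illustrative.

```python
import heapq

def shortest_paths(topology, source):
    """Dijkstra's shortest-path-first, as each OSPF-style node runs it
    over its local copy of the link-state database.
    topology: {node: {neighbour: link_cost}}."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neigh, cost in topology.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(neigh, float("inf")):
                dist[neigh] = nd
                prev[neigh] = node
                heapq.heappush(heap, (nd, neigh))
    return dist, prev

# Illustrative four-node network. If the A-B link fails, node A simply
# recomputes and reaches B via C and D -- no central controller involved.
links = {"A": {"B": 1, "C": 1}, "B": {"A": 1, "D": 1},
         "C": {"A": 1, "D": 1}, "D": {"B": 1, "C": 1}}
dist, prev = shortest_paths(links, "A")
```

The point of the sketch is the trade-off the article describes: the computation is duplicated at every node (processing cost), but no node depends on a central brain to survive a failure.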
The problem with a distributed approach is that these control planes need processing power to run, and they are normally implemented inside a proprietary vendor operating system, so users cannot implement new features or fix bugs themselves. If the network is made up of multiple technology types (e.g. IP, OTN, DWDM) and multiple vendors, it may be difficult to use a single control plane to provision end-to-end connections through all of the layers.
Enter the idea of the SDN, a simple view of which is shown in Figure 1. In this architecture the control plane is removed from the network elements themselves and located in a 'controller'. A protocol implemented between the controller and the network elements exposes an API to the controller. OpenFlow is one example of a 'controller to data plane interface' (CDPI) that can be used to create an SDN; more recent examples include the IETF NETCONF protocol, which uses a YANG data model (and is often referred to as NETCONF/YANG).
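To give a flavour of what the controller-to-element conversation looks like, here is a sketch of a NETCONF edit-config operation of the kind a controller might push down to an element. The rpc/edit-config framing and the base namespace are standard NETCONF (RFC 6241), but the module namespace and the interface leaves shown are illustrative placeholders, not any real vendor's YANG model.

```xml
<!-- Hypothetical edit-config from the controller; the example module
     namespace and leaf names below are illustrative, not a real model -->
<rpc message-id="101" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <edit-config>
    <target><running/></target>
    <config>
      <interfaces xmlns="urn:example:params:xml:ns:yang:example-interfaces">
        <interface>
          <name>eth0</name>
          <enabled>true</enabled>
        </interface>
      </interfaces>
    </config>
  </edit-config>
</rpc>
```

The key point is that the structure of the `<config>` payload is dictated by a machine-readable YANG model, which is what lets a single controller drive elements from different vendors.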
The other key feature of SDN is that it’s intended to be 'open'. The controller itself should implement an open API, this time called the Northbound Interface (NBI), to allow anybody approved for access to the controller to create and maintain network applications. Examples of such applications could include a multi-layer topology abstraction app, a multi-layer path optimisation app, and a multi-layer provisioning app, also shown in Figure 1. The controller and its associated SDN applications are made scalable using Web 2.0 techniques for network functions virtualisation (NFV). So the controller is no longer just a box, but a function of the network that can itself be virtualised for scalability and resilience. I’ve shown this in simple terms as multiple boxes, but in theory the controller could be distributed over multiple data centres.
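As a sketch of what an NBI interaction might involve, the snippet below builds the kind of service request a multi-layer provisioning app could submit to a controller. Everything here – the field names, the layer identifiers, the idea of a JSON payload – is an assumption for illustration; real controller NBIs differ by vendor and by standard.

```python
import json

def build_provision_request(a_end, z_end, rate_gbps, layers):
    """Compose a hypothetical NBI service request. The schema here is
    illustrative only -- not any real controller's API."""
    return json.dumps({
        "service": {
            "a-end": a_end,          # service endpoints by element name
            "z-end": z_end,
            "rate-gbps": rate_gbps,
            # the controller, not the app, works out the actual path
            # across each of the requested layers
            "layers": layers,
        }
    })

req = build_provision_request("router-1", "router-2", 100,
                              ["MPLS", "OTN", "DWDM"])
```

Note the division of labour the article describes: the application states intent (endpoints, rate, layers) and the controller, holding the multi-layer topology, decides how to realise it.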
Why use SDN?
Most people who are following the topic are aware that SDN architectures are now commercially viable for university research networks (where SDN was born), for enterprise networks, and for cloud data centres. Each of these application areas has different motivations for using a centralised SDN architecture. For example: university researchers love the open aspect of the switches, which allows them to do research; enterprise users love the fact that they can buy low-cost white-label switches that let them break free of the price premiums charged by big-brand vendors; and data centre operators like the deterministic way they can manage data flows and override network control plane operations to help them achieve NFV.
More recently, service providers have been looking at a 'carrier SDN' concept, and one of the main drivers behind this is to allow them to provision services through the multiple technology and vendor layers that may comprise a typical transport network. As I mentioned above, this can be difficult to achieve with the current instances of distributed control planes.
For the packet layers – such as Ethernet, IP and MPLS – there are no substantial technology barriers to an SDN architecture. Likewise for the OTN digital bandwidth management layer that routers typically plug into, there is no real barrier to dynamic service creation.
But in the DWDM world there’s a problem. In Figure 2 I show how traffic from the upper two pairs of routers is carried over two 100G transponders over the long distance fibre (shown by the solid red and green lines). A third router is installed… but the dotted line is where the third 100G transponder will be – once it is installed by the service engineers. This is the fundamental barrier to making the optical layer 'programmable'. Capacity is not available until an engineer goes out in a truck and installs it.
This problem was highlighted by Neil McRae, BT’s chief network architect, at the recent WDM & Next Generation Networks conference in Nice, France. He pointed out that 'a brick that can’t be programmed is just a brick'. In this context, if the underlying data plane resists automation for any reason, the job of a control plane becomes extremely challenging. In this case an optical layer built on conventional transponders is 'just a brick' – and this probably explains why most DWDM vendors have avoided automating this part of their architecture to date.
Fortunately there is a DWDM technology that solves this problem. DWDM super-channels were first shipped in mid 2012, and have rapidly become a favoured approach for deploying capacity at the 'beyond 100G' level in the Western world.
A super-channel is an evolution of DWDM technology that implements multiple coherent optical wavelengths on a single linecard, and which is brought into service in a single operational cycle. So all of the wavelength planning associated with multiple 100G transponder installation is done at one time, and the wavelengths are ready for new service activation, or to protect existing services in the event of link failures.
Current super-channels have been deployed at the 500Gb/s data rate, but what if the service provider doesn’t need 500Gb/s of capacity on day one? In this case the super-channel can be 'sold' in units of 100Gb/s – a feature called Instant Bandwidth, because the 'dormant' 100G sub-channels can be activated in seconds through the network management system.
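A toy model makes the commercial mechanism clear: the 500G super-channel is installed once, but its five 100G sub-channels stay dormant until the operator activates them in software. The class and method names below are illustrative, not a real management-system API.

```python
class SuperChannel:
    """Toy model of the Instant Bandwidth idea: capacity is physically
    installed up front, then activated (and paid for) in 100G units."""

    def __init__(self, total_gbps=500, unit_gbps=100):
        self.unit_gbps = unit_gbps
        # all sub-channels start dormant: installed but not licensed
        self.subchannels = [False] * (total_gbps // unit_gbps)

    def activate(self, n=1):
        """Turn on n dormant sub-channels -- a software action taking
        seconds, with no truck roll or new wavelength planning."""
        for i, active in enumerate(self.subchannels):
            if not active and n > 0:
                self.subchannels[i] = True
                n -= 1

    @property
    def active_gbps(self):
        return sum(self.subchannels) * self.unit_gbps

sc = SuperChannel()
sc.activate(2)   # pay for 200G on day one; 300G stays dormant
```

When a new router arrives, or traffic has to be rebalanced after a failure elsewhere, a further `activate()` call turns on more capacity in seconds rather than the weeks a transponder installation would take.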
Referring back to Figure 2b, the dotted lines between the super-channel line cards represent the 'virtual capacity' of the super-channel that has not yet been activated, and is not being paid for by the service provider. When the new router comes along, or if the network load has to be rebalanced in the event of a link failure elsewhere in the network, these chunks of virtual capacity can be activated and made available in seconds. Note that Figure 2 is a very simplistic example of the power of this technique. In a larger, meshed network the possibilities for closely matching rapid response to demand with cashflow efficiency are extremely exciting.
The first service provider to publicly support Instant Bandwidth was TeliaSonera International Carrier. Mattias Fridstrom, the company’s CTO, also presented in Nice, and it was reassuring to hear a service provider talking about how these kinds of innovations are driving new revenue streams, and not simply about ways to drive cost out of the network!
Instant Bandwidth is one of three elements needed to make the optical data plane programmable. The second is to recognise that OTN and DWDM belong together – in the same network element. This is a prime example of how it makes sense to integrate certain functional elements in the network. According to a survey by analyst firm Infonetics, 90 per cent of the service providers questioned are deploying, or plan to deploy, an integrated OTN/DWDM core transport platform. The addition of OTN means that DWDM capacity, and the switching of that capacity, is now totally deterministic. This contrasts with the idea of only using optical switching in the network. Optical switching is extremely cost-effective, but when an optical switching decision is requested by the control plane it may be seconds or even minutes before the capacity becomes ready for service. Even worse, it may not be possible to complete the switching decision at all if the optical switch (called a Reconfigurable Optical Add/Drop Multiplexer, or ROADM) encounters blocking for one of many possible reasons (e.g. wavelength blocking).
In fact the third element to ensure a programmable optical data plane is that ROADMs have to be engineered to be free of any risk of blocking – a design known as colourless, directionless and contentionless (CDC). CDC ROADMs that are able to manage flexible grid super-channel wavebands are now becoming available, and the technology is a perfect complement to Instant Bandwidth and non-blocking OTN switching.
In summary, all the elements are now in place so that the optical layer in a modern transport network can be programmed by a carrier-grade control plane – whether that’s generalised MPLS (which is widely used today) or carrier SDN in the future. The end result is that service providers can now become more responsive to new service demands, or changing traffic patterns, while keeping operational costs low.