Networking: Time to Change the Dialog
I am off to NYC to present at an SDN gathering hosted by Oktay Technology. I am going to change up my standard pitch deck, so I am curious to see the reaction. I have decided that I have been too nice, and I plan to be more provocative: to change the network dialog from speeds, feeds, ports, and CLIs to a discussion about the network as a system, orchestrated from the applications down, as opposed to the bottom-up wires approach.
I decided to take a more provocative approach to my messaging after presenting with Terry Slattery at a C-MUG event last week. The presentations can be found here. The net result of my presentation was a longer post for the Plexxi blog, which can be found here. I also reproduced that post below for ease of reading.
January has been a fast start in terms of SDN news. Juniper unveiled their SDN story a few weeks ago. I was pleased to see Plexxi made their big board (see pic). Reminds me of that funny scene in Dr. Strangelove. Cisco briefed Lightreading on their upcoming SDN announcements. The most startling data point to me was that Cisco has 710 APIs. That is a wow. My first reaction was that it was 709 too many, but then again they have thousands of products and somewhere around a $181B installed product base, so maybe 710 is the right number for Cisco. I also did some poking around other blogs looking to see what people were working on to start the year. I noted the last time Andy blogged at Arista was July 2011.
Side note on Plexxi…we have secured office space in downtown San Francisco. I am planning to be in that office monthly, and it should serve us well for events like VMWorld. I think the marketing team wants to host a party.
Below the signature is the post I wrote for Plexxi…enjoy.
/wrk
Traffic Patterns and Affinities
I had an opportunity to present to a group of networking professionals in Columbia, Maryland last week. The topic of the presentation was…wait for it…LAN emulation, or LANE. Someone thought that was funny, but the real topic was of course software defined networking. A link to the presentation can be found here. This was not a Plexxi specific presentation; rather, I was asked to present how SDN would change the network and what problem set SDN is trying to solve.
Honored to have been asked by Terry Slattery to present our view of SDN, I went through thirty-two slides outlining how the network has failed to change over the past decade, even though every other major constituent in the data center has changed. At one point I said to the audience that if a person had left a career in the data center in 2001 and came back to work today, every part of the modern data center would be unrecognizable except the network. The network would most likely be running the same systems, in the same design, as the day that person left in 2001.
On the way to the airport after the presentation, it occurred to me that I could have done better at presenting one slide. It then occurred to me that I could have done the entire presentation with just one slide. Here is that one slide and my attempt to discuss SDN in one slide.
The graph below comes from a 2009 paper entitled The Nature of Datacenter Traffic: Measurements & Analysis by Srikanth Kandula, Sudipta Sengupta, Albert Greenberg, Parveen Patel, Ronnie Chaiken of Microsoft. Here is the paper summary from the authors:
We explore the nature of traffic in data centers, designed to support the mining of massive data sets. We instrument the servers to collect socket-level logs, with negligible performance impact. In a 1,500 server operational cluster, we thus amass roughly a petabyte of measurements over two months, from which we obtain and report detailed views of traffic and congestion conditions and patterns. We further consider whether traffic matrices in the cluster might be obtained instead via tomographic inference from coarser-grained counter data.
There are a number of interesting observations within the paper, but I want to start with the picture that they described as “The Work-Seeks-Bandwidth and Scatter-Gather patterns in datacenter traffic as seen in a matrix of log_e(Bytes) exchanged between server pairs in a representative 10s period.”
For the presentation to the networking group, I used this picture to explain the concept of Affinities, but I realized after the presentation that I could have used the picture to explain much more. If a picture is worth a thousand words, here are the words.
What is the most striking feature of this picture? Anyone? Bueller? Bueller? It is 93% empty. If we were to reverse the color pattern, it would look like this:
This picture embodies the elements of network design that have manifested and grown within our craft. We have become the masters of complexity. We build networks to solve the any-to-any problem because it has always been done that way, yet we live in a world in which not everything needs connectivity to everything else. This type of network design is deeply rooted in our historical practices; it is the design principle of legend and lore. We configure, test, and measure network performance at the port level. Why do we do this? We do it because that is how we have always built the network: from the port level up. We want to think of the network as a system, but we are not equipped with tools that provide meaningful data to manage the network as a system.
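To make a number like 93% concrete, here is a minimal sketch of how you would measure the emptiness of a server-pair traffic matrix. The data below is fabricated stand-in data, not the paper's measurements; on a real cluster you would build the matrix from the kind of socket-level logs the authors collected.

```python
# Minimal sketch: how empty is a server-pair traffic matrix?
# `traffic` is a hypothetical N x N array of bytes exchanged between server
# pairs over a measurement window; the values are fabricated stand-ins.
import numpy as np

rng = np.random.default_rng(0)
N = 1500  # servers, roughly the cluster size in the Microsoft study

traffic = np.zeros((N, N))
active = rng.random((N, N)) < 0.07                    # assume ~7% of pairs talk
traffic[active] = rng.lognormal(10, 2, active.sum())  # heavy-tailed byte counts

empty_fraction = (traffic == 0).mean()
print(f"{empty_fraction:.0%} of server pairs exchanged no traffic")
```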
There are so many interesting observations to make about the picture. Let us start with the concentration of data traffic along the diagonal of the matrix, which corresponds to the ToR access ports (70 racks, 1,500 servers). As any network architect will tell you, these are the cheapest network ports you can purchase. As you move away from the diagonal, a box forms in the traffic pattern; this is the spine or aggregation layer in a traditional leaf/spine network design. As you push toward the edges of the graph, you reach the core switching ports purchased to interconnect the network hierarchy. You should note that the core switching ports, which are the most expensive to purchase, are the least used ports.
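To see why the most expensive ports end up the least used, here is a rough sketch, my own construction rather than anything from the paper, that classifies each server pair by the highest tier its traffic must cross in a rack/pod/core hierarchy and sums the bytes per tier. The placement and the traffic values are toy assumptions chosen to mimic the rack-local pattern in the measured matrix.

```python
# Rough sketch: attribute bytes to the network tier each server pair must cross.
import numpy as np

def bytes_per_tier(traffic, rack, pod):
    """traffic: N x N bytes exchanged; rack, pod: per-server placement labels."""
    same_rack = rack[:, None] == rack[None, :]
    same_pod = pod[:, None] == pod[None, :]
    return {
        "ToR only":    traffic[same_rack].sum(),              # stays in the rack
        "aggregation": traffic[~same_rack & same_pod].sum(),  # crosses the spine
        "core":        traffic[~same_pod].sum(),              # crosses the core
    }

# Toy placement and traffic, biased toward in-rack exchange the way the
# measured matrix is; the numbers are illustrative, not measured.
rack = np.array([0, 0, 1, 1, 2, 2])   # six servers in three racks
pod  = np.array([0, 0, 0, 0, 1, 1])   # racks 0-1 in pod 0, rack 2 in pod 1

traffic = np.zeros((6, 6))
traffic[0, 1] = traffic[2, 3] = traffic[4, 5] = 1e9   # heavy in-rack flows
traffic[0, 2] = 1e8                                   # lighter in-pod flow
traffic[1, 5] = 1e6                                   # small cross-pod flow

print(bytes_per_tier(traffic, rack, pod))
# The cheap ToR ports carry ~30x the bytes of the aggregation tier, and the
# expensive core ports carry the least.
```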
The next observation is around the concept of Affinities. Affinity networking is a term we use at Plexxi to express that some sort of relationship exists between compute elements that use the network. We believe that servers talk to servers, and they do this because of the nature of the applications that use the network. Look at both pictures. Do you see the Affinities? Let me highlight the Affinities:
I am asked to explain the concept of Affinities in nearly every meeting in which we are presenting Plexxi for the first time. The truth is that network professionals, architects, and practitioners have been designing and deploying Affinity-constructed networks for all of our professional careers. We did it on the first day and we are still doing it today. The challenge is that we think about Affinities on the first day, when we design the network, and then never again, because we have a limited or nonexistent tool set for making Affinities useful once the network is deployed. Constructing the network around Affinities on day 1 is easy, but making the network continuously Affinity driven is a completely different problem set.
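As a thought experiment, here is one hedged sketch of how Affinities could be pulled out of a measured traffic matrix after day 1: keep only the heavy server pairs and group whatever stays connected. The threshold, the grouping rule, and the toy traffic matrix are my own assumptions for illustration, not a description of how the Plexxi product does it.

```python
# Sketch: derive Affinity groups from a traffic matrix by thresholding heavy
# pairs and taking connected components of the resulting graph.
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def affinity_groups(traffic, threshold):
    """traffic: N x N bytes exchanged; returns lists of server indices."""
    heavy = (traffic + traffic.T) >= threshold   # symmetrize, keep heavy pairs
    n, labels = connected_components(csr_matrix(heavy), directed=False)
    return [np.flatnonzero(labels == c).tolist() for c in range(n)]

# Toy matrix: a few heavy flows plus some background chatter.
traffic = np.zeros((6, 6))
traffic[0, 1] = traffic[2, 3] = traffic[4, 5] = 1e9   # heavy flows
traffic[0, 2] = 1e8                                   # moderate flow
traffic[1, 5] = 1e6                                   # background chatter

print(affinity_groups(traffic, threshold=5e7))
# -> [[0, 1, 2, 3], [4, 5]] : two Affinity groups; the weak 1-5 pair is ignored
```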
We do not think about Affinities after the network is deployed, because we design the network once and then we cast it in concrete (i.e. fixed wires/cabling). Using the traffic pattern graph from above, our typical design criterion is to buy as much core interconnect (i.e. capacity) as we can afford and then spread that capacity around the design to ensure that we have achieved any-to-any connectivity. When you hear the phrase “non-blocking, any-to-any fabric,” that is really code for “we have no idea what kind of traffic patterns we need to support, so we are going to design for Mother’s Day traffic load.” That was a cultural reference to over-subscription in circuit-switched networks for those who have never used a POTS line.
What we all know, yet fail to acknowledge, is that not everything needs to be connected to everything else, and that there are pockets, groups, and clusters of devices (i.e. compute, storage, users, applications) that need a concentration of network capacity. How nice would it be to take some of that 93% of excess capacity and apply it to the Affinities in the network?
Here is another example from a recent customer call. I made an incorrect assumption the other day speaking to a CTO about his network. I assumed he optimized the network for his daily users. He told me that assumption was incorrect. He built his network for the backup and replication workloads that occur at night. He said his network is underutilized during the day. What he really wanted was a diurnal network that can be programmed to meet the needs of his users during the day and then, as that daytime demand subsides, have the capacity orchestrated toward his backup and replication requirements. He cannot do that today because his network is fixed.
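In practice, the diurnal idea is nothing more exotic than picking a capacity profile by time of day and handing it to a programmable control plane. Here is a minimal sketch of what that could look like; the profile names, the bandwidth shares, and the apply_profile hook are all hypothetical, which is exactly the point, because a fixed network offers no such hook to call.

```python
# Sketch of a diurnal capacity policy: choose a profile by hour of day.
from datetime import datetime

PROFILES = {
    # hypothetical shares of contended capacity steered to each traffic class
    "daytime":   {"user_traffic": 0.8, "backup_replication": 0.2},
    "overnight": {"user_traffic": 0.2, "backup_replication": 0.8},
}

def current_profile(now=None):
    hour = (now or datetime.now()).hour
    return PROFILES["daytime"] if 7 <= hour < 19 else PROFILES["overnight"]

def apply_profile(profile):
    # Placeholder: a software defined network would program the controller or
    # fabric here; a fixed, hand-cabled network has nothing to call.
    print(f"programming fabric with {profile}")

apply_profile(current_profile())
```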
The reason we do not practice Affinity networking the day after the network is deployed is that we have not had the tools (i.e. software) and devices (i.e. switches) to make the network scalable, reproducible, dynamic, and orchestratable; in a word, useful, based on Affinities. You have probably heard of the problems associated with the inability to orchestrate the network based on Affinities, but they were described to you using a different set of words. You most likely heard the problem described in terms such as:
- Poor programmatic control
- Inflexible workload placement
- Limited multi-tenancy
- No easy scale out model
- Device by device management
- Workload or data center fragmentation
These are the challenges for which SDN will prove to be the means to a solution.
/wrk