It is all about Doctrine (I am talking about Networking and that SDN thing)
Last year, I wrote a long post on doctrine. I was reminded of that post three times this week. The first reminder came from a Plexxi sales team telling me about a potential customer who was going to build a traditional switched hierarchical network as a test bed for SDN. When I asked why, they said the customer stated it was the only method (i.e. doctrine) his people had knowledge of and it was just easier to do what they have always done. The second occurrence was in a Twitter dialog with a group of industry colleagues across multiple companies (some competitors!), in which one of the participants referenced doctrine as a means for incumbents to lock competitors out of markets. The third instance occurred at the Credit Suisse Next-Generation Datacenter Conference when I was asked what will cause people to build networks differently. Here are my thoughts on SDN, networking, doctrine, OPEX and building better networks.
Is SDN a killer app for networking? No, but applications are.
Why is networking so complex? Because complexity is a method (i.e. strategy) for keeping competition out of your market. Step 1: make it complex. Step 2: make it require advanced training. Step 3: institutionalize advanced training to master the complex knowledge. Step 4: induce a state of apogee via doctrine to prevent change.
How is doctrine enforced and institutionalized? By reinforcing the need for complexity. If you cannot make it complex on your own, find others to help.
“The concept of doctrine enables technology people to make assumptions. Assumptions are great as long as they hold. When I refer to doctrine, I am referring to procedures that ecosystem participants follow because they have been trained to reason and act in a certain manner within the command and control structure of their business and technology. We design networks, manage companies and evaluate technology and markets according to a common set of doctrines that have been infused into the technology ecosystem culture over many decades. I was thinking along this line in mid-October when I posted: “I also believe we are all susceptible to diminished breadth in our creativity as we get older. Diminished breadth in our creativity is the root cause of why history repeats itself, and another reason why, when we change companies, we tend to take content and processes from our prior company and port them to our new company. This is especially true in the technology industry. We recycle people; hence we recycle ideas, content and value propositions from what worked before. Why be creative when it is easier to cut and paste? As a casual observation, it seems to me that most people working in tech have a theta calculation on their creativity. I believe a strategy to guard against creativity decay is to look back on the past and critique the work.” In mid-October I had not fully fused the thesis of creativity fail, or creativity theta, with doctrine. The idea to link the two concepts occurred to me last night as I was reading Shattered Sword for the second time.”
How is doctrine broken? Doctrine is what people believe in and act on. When intellectual thought leaders have the courage to lead, change can occur quickly. I told the audience at the CS conference that the migration from shared LANs to switched LANs occurred quite quickly, and if you were not involved in the technology industry in the early 1990s you probably do not know about Cabletron, SynOptics, Bytex, Chipcom, Proteon, CrossComm, Xyplex, DEC, etc.
I started writing about the OPEX problem of networking last year. Here is a link to a post on OPEX, but I should state that the founder of Plexxi was thinking about this problem in 2009-2010 when he founded the company, and if you look through the old Nicira presentations you can see they were highlighting the same problem set. See this presentation I did in January 2013, which is really a presentation on the origins of SDN and what it means. For this post, I am going to pretend that SDN does not exist. In fact, if networking did not exist at all and a group of us were put into a room and asked to create a network, I doubt that we would come up with anything remotely resembling what we have today. This might be a good exercise for IT architects to go through. How would you design a network today, forgetting the limitations of the last 20-30 years of networking? That is what we did at Plexxi and continue to do each day. That is the point of this post. If I were to design a system called a network to connect compute, storage and applications, I would want that network to have a number of characteristics.
1. Network Must Express What is Important: Solving the internet and client/server challenges was about solving the problem of connectivity and reachability. Today, we are not solving the problem of connectivity and reachability; we are solving the problem of utilization and correlation. We need to correlate the utilization of the network with the workloads that the network is tasked to support. We want the network to be orchestrateable by the application and by the developer of the application. That means a developer can say that application A-B-C requires a set of characteristics from the network, such as low latency, jitter sensitivity, bandwidth, hop count or path (what is called service chaining). The network then has the ability to take these requirements from many applications and calculate a topology that best reflects the needs of the applications.
2. Dynamic Network: In order to make the network orchestrateable, we need the network to be dynamic. Dynamic networks can be built in the wireless or optical domains. Physically wiring a network reinforces the hardwired, physically limited design of the network. When we wire up the conventional leaf/spine switched network design, we lock the network design into the requirements of day 1. If the workload requirements of the network change after the first day, we have no ability to change how the network is configured.
3. Purpose Built for Automation: We want the network to be built for automation. We do not want to be configuring network elements at the port or the device level through CLIs. We want to compute topologies and script the network configuration.
4. Single Interface: We need centralized interface points that allow the network to be orchestrated. This is not a single point, but neither is it every network element. We want orchestration from the top down to the device, not from the device up.
5. API Driven: We want the network interface to be API driven. We want an open interface that allows software developers and server administrators (i.e. DevOps people) to orchestrate the network based on the needs of their applications and services. If we want to reduce OPEX, we need to remove the network as a silo that lags behind the dynamic nature of the other elements of IT.
6. Simplification: We want to build network designs that are simple. A network can do two things: connect and disconnect devices. Simplifying the physical-layer cabling via an advanced dynamic interconnect translates into lower power, cooling and administrative costs.
7. Scale: We need a network that can scale concurrent with the evolutionary cycles occurring in compute and storage. The scaling problem is multi-dimensional.
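To make points 1 and 5 above concrete, here is a minimal sketch, in Python, of what it might look like for a developer to express an application's needs as structured data that a controller could consume. Every name and field here is invented for illustration; this is not Plexxi's actual API.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AppRequirement:
    """One application's declared network needs (field names invented)."""
    name: str
    max_latency_ms: float          # latency ceiling the app can tolerate
    min_bandwidth_gbps: float      # bandwidth floor the app needs
    jitter_sensitive: bool = False
    max_hops: Optional[int] = None # optional hop-count ceiling

# The developer states what matters; the controller, not the developer,
# decides how the fabric is arranged to satisfy it.
requirements = [
    AppRequirement("market-feed", max_latency_ms=0.5,
                   min_bandwidth_gbps=1, jitter_sensitive=True, max_hops=2),
    AppRequirement("nightly-backup", max_latency_ms=50,
                   min_bandwidth_gbps=10),
]

# A controller might place the most latency-constrained flows first.
placement_order = sorted(requirements, key=lambda r: r.max_latency_ms)
print([r.name for r in placement_order])
# → ['market-feed', 'nightly-backup']
```

The point of the sketch is the shape of the interface: requirements flow in as data from many applications, and the controller aggregates them before computing a topology.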
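Point 3 above ("script the network configuration") can also be sketched. Instead of touching ports one CLI session at a time, one loop renders the intended state of every port from a single declarative plan. The switch names, VLAN plan and config fields are all invented for illustration.

```python
# Invented VLAN plan: role name -> VLAN ID.
vlan_plan = {"web": 10, "db": 20, "storage": 30}

def render_port_config(switch: str, port: int, role: str) -> dict:
    """Return the intended state of one port as structured data."""
    return {"device": switch, "port": port,
            "vlan": vlan_plan[role], "mtu": 9000, "enabled": True}

# Four 48-port leaf switches configured in one loop,
# not 192 per-port CLI sessions.
intended = [render_port_config(f"leaf-{s}", p, "web")
            for s in range(1, 5) for p in range(1, 49)]
print(len(intended))
# → 192
```

A controller built for automation consumes intended state like this and pushes it down; the operator never interacts with an individual device.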
Attacking the complexity and cost factors in running a network is multi-dimensional. Servers are going from 1G to 10G, which means a network upgrade is necessary. The good news for customers is that they have a choice in how they design their network. It is probably the first time they have had a choice in network design since the demise of ATM and shared LANs. For the past 16 years (or more) we have been designing switched hierarchical networks. Choice for the customer was really price: what kind of OSR to use, who had the lower-priced 1G and 10G switch ports, what kind of 40G density to go with, etc. The network was the same, despite the choice of vendor.
Customers can decide to build a Plexxi network. A Plexxi network is built to deliver a long-term OPEX reduction, and we can do it at scale. We have an API-driven controller. We have a dynamic interconnect. Our standards-based, merchant-silicon Ethernet switch was purpose built for automation from the controller. A Plexxi network is simple. It is an optical fabric orchestrateable via the controller through APIs. It is built to correlate the workloads expressed to it from the applications via the API. In legacy networks, we spend a lot of time determining state, letting legacy protocols, QoS and firewalls figure out the topology. In a Plexxi network, we figure out what is important via the application architect, calculate the topology using that information and then program the network to configure 100% of the bandwidth. We do not want a network that is 20% or 40% utilized with one person per 100 devices. We think we can do a lot better than those metrics. We think we can build you a better network.
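As a toy illustration only (this is not Plexxi's algorithm), "calculate topology using that information" might look like a constrained path computation: prune the links that miss an application's bandwidth floor, then pick the lowest-latency path across what remains and program that path explicitly, rather than letting distributed protocols discover one. All link data here is invented.

```python
import heapq

# Invented fabric: (a, b) -> (latency_ms, capacity_gbps)
links = {
    ("s1", "s2"): (0.2, 40), ("s2", "s4"): (0.2, 40),
    ("s1", "s3"): (0.1, 10), ("s3", "s4"): (0.1, 10),
}

def best_path(src, dst, min_gbps):
    """Lowest-latency path using only links that meet the bandwidth floor."""
    graph = {}
    for (a, b), (lat, cap) in links.items():
        if cap >= min_gbps:  # prune links below the application's floor
            graph.setdefault(a, []).append((b, lat))
            graph.setdefault(b, []).append((a, lat))
    # Dijkstra's algorithm over the pruned graph.
    pq, seen = [(0.0, src, [src])], set()
    while pq:
        lat, node, path = heapq.heappop(pq)
        if node == dst:
            return lat, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, l in graph.get(node, []):
            heapq.heappush(pq, (lat + l, nxt, path + [nxt]))
    return None

print(best_path("s1", "s4", min_gbps=25))
# → (0.4, ['s1', 's2', 's4'])  — the 10G links are pruned, so the 40G path wins
```

A backup flow with no bandwidth floor would instead take the lower-latency 10G path via s3; the topology the controller programs falls out of what each application declared.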
With many orgs having separate people in charge of the network and apps/virtualisation, I think it would be great if there were an option to integrate OVS/NVP with Plexxi. OVS (run by the apps/virtualisation people) would give apps/virtualisation admins the means to create/change their networks (overlays), while Plexxi (run by the network people) would provide OVS with a multitude of transport services (“such as low latency, jitter sensitive, bandwidth, hop count, path”) over which to run tunnels. Instead of actual applications talking to Plexxi directly, OVS would talk to Plexxi on their behalf to get access to transport with the characteristics needed by the services it provides to applications.
I realise this isn’t what Plexxi’s vision is, but I’m hoping you can see the point.
Jumping ahead of Bill here: your thoughts are very much aligned with our thinking and the possibilities (dare I say vision) of the Plexxi solution. There will be overlay solutions, and there is no reason that Plexxi cannot provide the exact same network services based on overlay endpoints (better yet, the overlay identifier) as an aggregate of applications…
To add to what Marten said… Virtual Networks are just another type of affinity group – a collection of virtual switches that are related… And as you said it would be really cool to understand tunnel needs by class of service and correlate that to physical network capabilities…
Thanks for your answers; glad to hear that we’re on the same page here!