Framing Exercise: What if We Turned the Network Off?
While I was out at VMworld, I was telling my colleague @cloudtoad how I learned to sell multi-protocol networking to SNA shops. That conversation got me thinking about networks. Since then, I have been reading some recently issued RFXs that we need to respond to, and they led me to an interesting framing exercise that I thought I would share on the blog.
When I started in networking, we had glass houses where the mainframes resided and wiring closets where the hubs and servers resided. My career in networking began at the transition point from the mainframe to the client/server network. I learned to sell to the networking person by giving them what they wanted: control. In the early days of the client/server transition, my main customers were banks and insurance companies. It occurred to me one day that the network people deeply appreciated having control of the network.
The genesis of the control thought was observing a group of us going into a customer’s glass house. The customer I was visiting was growing quickly; they had built cubes for the rank-and-file employees outside the glass house and moved the IT guys into offices inside it. I could tell all the IT guys enjoyed having special, secured access to their offices. It occurred to me that this was similar in all my accounts. Where did the IT and network teams sit in the building? They were usually located in a secured facility behind doors with keypads, accessible to very few employees. The message this sent to the rank-and-file employees of the firm was “we the IT guys are important, so we are secured in our own facility, and we will grant you access privileges if necessary.”
Before having a sales role, one of my early career responsibilities was running a corporate network, and this corporate network had Novell servers. One night I turned up a new server. Anyone with experience from this era remembers the big orange box and the boxes of disks inside. No Internet download back in the day. Once I had the server turned up and had moved over some data, I then had to create the user accounts. As I was creating the user groups, I noticed that Novell had time-based access windows for users. This feature was a holdover from the time-sharing days of the mainframe and mini, but it naturally made sense to me as a network administrator to limit access to the network; so I did. That was a mistake. I was yelled at by the VP of Engineering the next day because I had restricted access to the servers after 8pm, and he wanted to know why. My natural response was to ask why people needed access to the server after 8pm. I had the responsibility to secure the network, so I created tiered levels of access. It seemed a perfectly logical course of action to me. When the yelling was done, I found myself in the server room removing all the access restrictions. What I realized was that it felt good to restrict access and bad to grant unlimited access. No one ever thanked me when the network worked day after day; they only yelled at me when it did not work.
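For readers who never saw NetWare, the idea behind those access windows is easy to sketch. This is a hypothetical illustration, not Novell's actual implementation: each group is assigned day/time ranges during which login is permitted, and anything outside the window is denied, including my ill-fated 8pm cutoff for engineering.

```python
from datetime import time

# Hypothetical sketch of NetWare-style time-based access windows: each group
# gets a list of (days, start, end) ranges during which login is allowed.
ACCESS_WINDOWS = {
    "engineering": [("Mon-Fri", time(8, 0), time(20, 0))],  # the 8pm cutoff that got me yelled at
    "admins":      [("Mon-Sun", time(0, 0), time(23, 59))],
}

DAYS = {"Mon": 0, "Tue": 1, "Wed": 2, "Thu": 3, "Fri": 4, "Sat": 5, "Sun": 6}

def _day_in_range(weekday, spec):
    # spec is e.g. "Mon-Fri"; weekday is 0=Mon .. 6=Sun
    start, end = spec.split("-")
    return DAYS[start] <= weekday <= DAYS[end]

def login_allowed(group, weekday, now):
    """Return True if a member of `group` may log in at time `now` on `weekday`."""
    for days, start, end in ACCESS_WINDOWS.get(group, []):
        if _day_in_range(weekday, days) and start <= now <= end:
            return True
    return False  # off by default -- the same instinct that caused the trouble
```

Note the default: deny. That default felt like good security to me as an administrator, which is exactly the instinct the rest of this post examines.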
I started to use the theme and messaging of control when selling to the SNA and network teams. I was working at CrossComm, and our product was at first a bridge, then a router, but we had developed many specialized features to support SNA networks, which were our target market. I learned to sell to the IBM buyer in a target account by saying, “I know all these LANs are popping up in your company for apps you do not think are mission critical. Buy our gear and we will protect and secure your SNA traffic so it is not affected by all the lower-priority LAN traffic. I understand the millions of dollars your firm has spent on your mainframes; do not let some software you can buy at Egghead Software cause you downtime.” I was selling the dream of power and control through access to the network, and it was a theme that worked.
More than twenty years later I am standing at VMworld 2013 talking to @cloudtoad, which is an interesting observation on the passage of time: I did not foresee the advent of Twitter in the late 1980s, and I clearly did not foresee that I would be speaking to a grown adult male, also a father, and referring to him in this blog post as an avatar using his Twitter handle. @cloudtoad has run large networks, and I was talking to him about how I would sell him a new network if he were still an end-user, and what works and does not work in the selling process. We were talking shop while killing some time. One of the points I made was that we have been building networks around the concept of reachability and interoperability for 30-40 years, and that networking is hard mainly because we make it hard. We turn everything on and then proceed to turn things off using QoS, firewalls, VLANs, DPI, load balancers, etc. Then the thought occurred to me: what would happen if we reversed this process? What if the network was off by default and we turned it on based around services, workloads and applications?
In a Plexxi network, we build the network topology using a controller. This should not be news to anyone. We have a default forwarding topology that acts like a plain old network from the past 30 years; no surprises, no magic. The magic starts when we begin to Affinitize workloads and applications. When we Affinitize a workload or application, we are declaring that there is some constraint or characteristic that is important about it. Once we do that, we can use the Controller to compute new forwarding topologies that fulfill the requirements declared for the workload. The more we Affinitize, the better we can calculate the forwarding topologies. Network traffic that is not Affinitized still traverses the network using plain old networking.
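To make the idea concrete, here is a deliberately simplified sketch of what "compute a forwarding topology for an Affinitized workload" could look like. This is not Plexxi's actual algorithm; the topology, link weights and constraint are all hypothetical. The point is only that a controller with a global view can compute a path satisfying a declared constraint (here, lowest latency), while unaffinitized traffic stays on the default topology.

```python
from collections import deque

# Hypothetical four-switch topology: node -> [(neighbor, latency_ms)].
LINKS = {
    "A": [("B", 1), ("C", 5)],
    "B": [("A", 1), ("C", 1), ("D", 4)],
    "C": [("A", 5), ("B", 1), ("D", 1)],
    "D": [("B", 4), ("C", 1)],
}

def lowest_latency_path(src, dst):
    """Queue-based relaxation (SPFA-style) returning (total_latency, path)."""
    best = {src: (0, [src])}   # node -> (cost so far, path so far)
    frontier = deque([src])
    while frontier:
        node = frontier.popleft()
        cost, path = best[node]
        for neighbor, latency in LINKS[node]:
            new_cost = cost + latency
            if neighbor not in best or new_cost < best[neighbor][0]:
                best[neighbor] = (new_cost, path + [neighbor])
                frontier.append(neighbor)
    return best[dst]

# "Affinitizing" a workload on A talking to a service on D with a
# low-latency constraint: the controller computes a dedicated path for it.
cost, path = lowest_latency_path("A", "D")
```

In this toy topology the direct-looking A-C-D route costs 6 ms while the computed path A-B-C-D costs 3 ms, which is the kind of non-obvious result a controller with global knowledge can find and a hop-by-hop default topology may not.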
Part of my day job involves talking to service providers about NFV and OTT services, and invariably net neutrality is part of the conversation. I think most service providers would really enjoy turning off the network to third-party services that do not bear the cost of provisioning and maintaining the network on a daily basis. I understand that position; I have run a network too, and the network administrator/operator rarely hears from anyone until the network is not working. In an age of PRISM, Anonymous and a constant barrage of news about cyber threats, I started thinking about the ramifications of turning the network off. The best security in the world is the state of off. After all, a network can do one of two functions: connect and disconnect.
A friend who is also a venture capitalist called me the other day. He asked me if I had looked at Bromium. Bromium is a novel idea, and it tends to go against the grain of established security procedures, policies and techniques from the past. Without starting a discussion about Bromium, I started thinking about their isolated VM container concept and turning the network off. The network is a capacity container, and we use it to traffic engineer connectivity between service containers, something like that. If the network started in the off mode and we engineered access based on applications and services, this definitely changes the construct of security, workloads, load balancers, firewalls, DPI, etc. After all, we are still working in an era in which prominent bloggers in the networking world think Ping is the best troubleshooting tool. I know this is a crazy idea, and I took it as far as describing it as an extreme network design approach internally at Plexxi. As networking people we have been trained to get connectivity first, hence the love of Ping, but if we asked the question “how do you want the capacity of the network applied in terms of your workloads and applications?”, I think this significantly changes the dialog. This has nothing to do with overlays and underlays, but has everything to do with building a network around the critical needs and empowering physical security – the OFF button.
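The inversion is simple to state in code. Below is a minimal, hypothetical sketch of a default-off policy model: the names (`turn_on`, workload and service labels) are mine for illustration, not any vendor's API. Nothing forwards until an explicit application-to-service grant turns that connectivity on, which is the mirror image of today's "permit everything, then subtract with ACLs and firewalls" model.

```python
# Hypothetical "network off by default" policy table: connectivity exists only
# where an explicit (workload, service) grant has turned it on.
GRANTS = set()

def turn_on(workload, service):
    """Declare that `workload` is allowed to reach `service`."""
    GRANTS.add((workload, service))

def allowed(workload, service):
    """Default is off: traffic is dropped unless explicitly granted."""
    return (workload, service) in GRANTS

# The only connectivity that exists is what an application owner asked for.
turn_on("web-tier", "orders-db")
```

Everything not in the grant table simply does not exist from the network's point of view, so there is nothing for a firewall to subtract and nothing for an attacker to reach by default.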
The concept I am writing about in this post should not really be that radical. Recently Google shared their experience using SDN to traffic engineer network capacity on their B4 network. The paper is here if you want to read it. There is a lot in the paper, but I think this one quote is relevant to this blog post:
Our decision to build B4 around Software Defined Networking and OpenFlow was driven by the observation that we could not achieve the level of scale, fault tolerance, cost efficiency, and control required for our network using traditional WAN architectures. A number of B4’s characteristics led to our design approach:
- Elastic bandwidth demands: The majority of our data center traffic involves synchronizing large data sets across sites. These applications benefit from as much bandwidth as they can get but can tolerate periodic failures with temporary bandwidth reductions.
- Moderate number of sites: While B4 must scale among multiple dimensions, targeting our data center deployments meant that the total number of WAN sites would be a few dozen.
- End application control: We control both the applications and the site networks connected to B4. Hence, we can enforce relative application priorities and control bursts at the network edge, rather than through overprovisioning or complex functionality in B4.
- Cost sensitivity: B4’s capacity targets and growth rate led to unsustainable cost projections. The traditional approach of provisioning WAN links at 30-40% (or 2-3x the cost of a fully-utilized WAN) to protect against failures and packet loss, combined with prevailing per-port router cost, would make our network prohibitively expensive.
There is another aspect to consider. It is much harder for service providers to turn the network off because they are in the regulated business of delivering a service. There have been attempts with DPI and usage caps, but this always leads to a dust-up around net neutrality. I spent too much time blogging on this in the past, so I will not cover it in this post, but you can find posts to read here, here, here and here. In the enterprise network, as Google demonstrated, they can restrict access and build topologies around application buckets for traffic engineering because they are not selling a service. The idea of turning the network off does have ramifications for many players in the IT landscape, and from a service architecture approach, it looks like a Doomsday Machine for a large ecosystem.
President Merkin Muffley: But this is absolute madness, Ambassador! Why should you *build* such a thing?
Ambassador de Sadesky: There were those of us who fought against it, but in the end we could not keep up with the expense involved in the arms race, the space race, and the peace race. At the same time our people grumbled for more nylons and washing machines. Our doomsday scheme cost us just a small fraction of what we had been spending on defense in a single year. The deciding factor was when we learned that your country was working along similar lines, and we were afraid of a doomsday gap.
President Merkin Muffley: This is preposterous. I’ve never approved of anything like that.
Ambassador de Sadesky: Our source was the New York Times.
President Merkin Muffley: How is it possible for this thing to be triggered automatically and at the same time impossible to untrigger?
Dr. Strangelove: Mr. President, it is not only possible, it is essential. That is the whole idea of this machine, you know. Deterrence is the art of producing in the mind of the enemy… the FEAR to attack. And so, because of the automated and irrevocable decision-making process which rules out human meddling, the Doomsday machine is terrifying and simple to understand… and completely credible and convincing.