Running to Stand Still
For me, the last several weeks of 2014 were spent running to stand still. I made one last sales call before Christmas Eve and then eased into a long break until the New Year. I had some interesting sales calls over the past year. I wrote about the perfect Clayton Christensen, hydraulics-versus-steam-shovels moment here. I learned a lot from that sales call and went back to using a framing meme we had developed a couple of years earlier. I posted that meme on this blog here, seven months ago. In this post I am refreshing that meme and highlighting a few insights I read and thought were meaningful. Most if not all of the mainstream tech media is some technology company’s marketing message in disguise; hence it might be entertaining, but it is not informative or thought provoking.
Networking Has Not Really Changed in Twenty Years and Most People Prefer It That Way
Imagine if you worked in a data center in 2004. You had just read Nicholas Carr’s “IT Doesn’t Matter” paper and became profoundly depressed with the idea of working for the big IT utility in the sky. You think Carr is correct and decide on a mid-career change. You like coffee, warm weather and beaches, so the BVIs are your choice for starting a boutique coffee shop. Eleven years later the coffee sabbatical has ended and you return to your old job.
When you arrive back at your old job in 2015, what would be different? The first blade servers were emerging when you left; today, servers are all about high-density, multi-core, bare metal systems at a fraction of the cost. Why buy one server when you can buy 10 or 20 for the price of one in 2004? Gone is all the proprietary Sun hardware, and x86 has taken over the server landscape. Server virtualization was nascent in 2004; today a server admin can make a thousand VMs fly up and dance across the screen.
Storage was about big chassis of spinning disks: block storage, high-availability systems that looked like a Sub-Zero refrigerator. Data migration between storage tiers was difficult. Today, storage is about flash tiers and hybrid/flash storage systems with vastly more capacity than anyone thought possible in 2004. Many customers are deploying JBOD. In ten years the mind-set of the storage leader went from persistence and availability to performance and let it fail (e.g. flash); I will buy more. Now emerging is a new performance dynamic driven by flash. Flash storage delivers greater IOPS at lower latency, and this has a profound effect in terms of east/west traffic stress on networks. As flash grows, bandwidth requirements will increase significantly, driven by the changing nature of applications and how they use compute and storage.
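To see why flash stresses the network, a rough back-of-the-envelope helps. All of the figures below are illustrative assumptions on my part, not vendor specifications:

```python
# Back-of-the-envelope: how storage IOPS translate into potential
# east/west network bandwidth if every IO crosses the network.
# All device counts and IOPS figures are hypothetical illustrations.

def east_west_gbps(iops: int, block_kb: int, devices: int) -> float:
    """Aggregate bandwidth (Gb/s) generated by a shelf of devices."""
    bytes_per_sec = iops * block_kb * 1024 * devices
    return bytes_per_sec * 8 / 1e9

# A spinning disk might sustain ~200 IOPS; a flash device ~100,000 IOPS.
disk = east_west_gbps(iops=200, block_kb=4, devices=24)       # ~0.16 Gb/s
flash = east_west_gbps(iops=100_000, block_kb=4, devices=24)  # ~78.6 Gb/s

print(f"24 spinning disks: {disk:.2f} Gb/s")
print(f"24 flash devices:  {flash:.2f} Gb/s")
```

Even with made-up numbers, the shape of the result holds: swapping disk for flash can multiply the traffic a storage shelf can push onto the network by two to three orders of magnitude.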
In 2004 the network was about stacking switches on top of switches and buying the biggest core switch you could afford. Eleven years later the network is about stacking switches on top of switches and buying the biggest spine you can afford. To make it even more fun, you can choose to wrap some protocols inside other protocols. The reality is that in 2015, the blueprint of the network looks very much like the network blueprint of 2004 for 99% of end-users. I would argue that with just a few days of a tech refresher, a network engineer from 2004 could be productive in 2015. The big difference between 2004 and 2015 is that the server, application and storage teams left the world of scarcity and entered the world of plenty. The network team is still in the land of scarcity, concerned about link failures, redundant paths, state distribution, CLIs and using ping to see if a link is up. The general operating rule for the network was put no better than by Andy Bechtolsheim, founder of Arista Networks, in April 2012: “Reality is people spend a lot of money on networking gear, once it is installed it works. Don’t touch it, it may break it. Once people have adopted a certain network topology, they are highly unlikely to change it. Unless something really better comes along.”
The server and storage teams build out in the land of plenty and can handle outages on their schedule, while the network teams type cryptic commands into CLIs at 3am. Most networking people I present to have very little awareness that networks can be built differently. For the most part, the last 10-20 years have been about incrementalism in network design. We collapsed some tiers, added some protocols and fattened some pipes, but we did not change the game. We now have an opportunity to construct the network as a pool of resources with location ubiquity; more on that in a future blog post.
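The scarcity mindset shows up in the one number every "switches on switches" design revolves around: the oversubscription ratio of a leaf toward the spine. A minimal sketch, with port counts and speeds that are my own illustrative assumptions:

```python
# A minimal sketch of the "stack switches, buy the biggest spine" math:
# the oversubscription ratio of a leaf switch in a two-tier fabric.
# Port counts and speeds below are illustrative, not any specific product.

def oversubscription(host_ports: int, host_gbps: int,
                     uplinks: int, uplink_gbps: int) -> float:
    """Ratio of southbound (host-facing) to northbound (spine-facing) capacity."""
    return (host_ports * host_gbps) / (uplinks * uplink_gbps)

# 48 x 10G host ports funneled into 6 x 40G uplinks: 2:1 oversubscribed.
ratio = oversubscription(host_ports=48, host_gbps=10, uplinks=6, uplink_gbps=40)
print(f"{ratio}:1")  # 2.0:1
```

The ratio is fixed at design time; the only lever the network team has when east/west demand grows is to buy more or bigger spines, which is exactly the scarcity problem described above.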
Networks Have Gravity – Just Like Data and Planets
Here is a nice blog on data gravity, or the mass of data sets. Reading the above quote and spending a lot of time talking to people who build and operate networks for a living, it became apparent to me that networks have the same gravitational force as data. The more mass users have built around their networks, the harder it is for them to have agility in their network. This is a very good position to be in if you are the incumbent supplier and a bad position if you are a new entrant. Networks have a fragmentation metric. Six, twelve, eighteen months after they are built out, the additional compute and storage resources being added to the network are nowhere close to the pools of resources built on day one. The day-365 or day-545 problem is difficult for the inflexible mono-culture network we have been building for decades to address. This is also where I see the entry point for SDN. SDN, or the use of the controller architecture, is not a replacement for the network we use today; it is an addition, an entry point. To build a new network we do not start with a green field, we start with a little patch of green in the brown field.
Out of the Miry Clay
We are more than a month into 2015 and I have been a lazy blogger; below are my short thoughts on networking to start the year.
- Forget about SDN: The last five years have been valuable as a discussion for learning, but they have been a failed proof. Any argument otherwise is false. I think it is great that there is a lot of energy around SDN, but I have no idea what a successful proof of SDN looks like. I know that SDN implies the use of a controller, but what do these controllers do? No two controllers seem to be alike. The reality of the market is that many of the early SDN proofs resulted in an increase in complexity, and the complexity problem became worse as scale increased. In order to reduce complexity, teams working on SDN scaled back the size of the deployment window, and the result was a non-compelling application. Adding SDN to the same twenty-year network paradigm, subject to the limitations of technology concepts pioneered before we went to the moon, was not a path to success.
- Bare Metal: I think bare metal is an interesting evolution, but I view it as further proof that we need to build new and different networks. The idea of building bare metal presupposes that the basic blueprint of the network we have been building for the past 20-30 years will remain unchanged. If we cannot build a better network, the next best step is to find a way to decouple the components, drive cost out of the model and increase the velocity of deployment. That makes perfect sense to me, and I am sure it makes sense to most networking people, because they have only seen one type of network their entire career.
- Web Scale Confusion: Some good news in networking is that the confusion around web scale is on the wane. For a good 18 months there were too many people with a decidedly small knowledge pool running around with PowerPoint slides telling others to build like web scale companies. Reality is returning to the networking narrative within the 99% end-user base. We should know that billionaires are not very skilled at telling poor people how to manage their money.
- Networking is Different: When I hear people tell me about network commoditization and that switches will be like servers, I ask them to show me the evidence. The usual response is to point to Google or some presentation from a person who does not build networks or networking products. I will eagerly read the quarterly results from Cisco and Arista, but so far I have not seen any margin erosion or signs of commoditization. In June 2013, I wrote the following, which I copied from Plexxi’s founder Dave Husak:
Network switches are not like disks and servers. Disks and servers are not the same either. The cheap-and-cheerful [i.e. DIY] disposable hardware model only works:
…for servers if the workload is fluid
…for disks if the data is fluid
…for switches if the capacity is fluid
The word fluid is being used as a condensation/conflation of: replicated, replicable, re-locatable, re-allocatable. Interesting that when you think about it this way, virtualization is not a means, but a result of fluidization. Pointedly, network virtualization does not make capacity fluid — it makes workloads fluid. If workloads are fluid, it would be helpful to have fluid network capacities to allocate to the demands of the workloads.
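Husak’s distinction can be sketched as a toy model. This is my own construction, not Plexxi code; the class and numbers are hypothetical:

```python
# A toy illustration of the "fluid capacity" idea above (my construction):
# if workloads are fluid, the scheduler wants to draw network capacity
# from a shared pool the same way it draws CPU and disk.

class CapacityPool:
    """Fluid capacity: a shared pool of bandwidth allocatable on demand."""
    def __init__(self, total_gbps: float):
        self.free = total_gbps

    def allocate(self, gbps: float) -> bool:
        if gbps > self.free:
            return False        # in a rigid network this means a redesign;
        self.free -= gbps       # in a fluid one it is just a failed bid
        return True

    def release(self, gbps: float):
        self.free += gbps

pool = CapacityPool(total_gbps=400)
assert pool.allocate(120)       # workload A attaches
assert pool.allocate(250)       # workload B attaches
assert not pool.allocate(60)    # pool exhausted: only 30 Gb/s left
pool.release(120)               # A migrates away; capacity returns to the pool
assert pool.allocate(60)        # now the bid succeeds
```

The contrast with today’s network is that there is no `release`: capacity is welded to a topology at install time, so a failed allocation becomes a procurement and redesign cycle rather than a retry.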
- DevOps: We have had what feels like a hype cycle around DevOps over the past 18-24 months. My perspective is that there are large disconnects in the market regarding DevOps, and I think this is the specific part of the IT landscape where networking can help. Networking is the piece that ties compute and storage together. If we step back and think about how Hadoop (which I am using as a generalized term for clustered, multi-threaded apps) is being used in the content, ad-tech and research markets, it is being used as a tool for the continuous refining of data. This refinement is a process. The refining of data is like the refining of crude into consumable products. Some datasets require more refining than others to produce products that yield high value. Early adopters are clearly implementing a corporate process for refining data. This is the entry point for DevOps into the mainstream IT market. DevOps is part of data lifecycle management, the refining process that yields distilled datasets that can be monetized. It does not matter what you call it: agile IT, lean IT; it is the fusing of the process into the workflow of the corporation that is important. I do not think this fusion can fully occur, or be maximized, until the network is made on par with the nature of applications, compute and storage. That is also the entry point I see for SDN.
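A "refining" pass in the Hadoop/map-reduce sense used above can be reduced to a few lines. The field names and events are hypothetical, chosen to echo the ad-tech example:

```python
# A toy refining pass: raw events in, a distilled (monetizable) dataset out.
# Field names and event data are hypothetical illustrations.
from collections import Counter

raw_events = [
    {"user": "u1", "ad": "a9", "clicked": True},
    {"user": "u2", "ad": "a9", "clicked": False},
    {"user": "u1", "ad": "a3", "clicked": True},
]

# map: emit each clicked ad; reduce: count clicks per ad
clicks = Counter(e["ad"] for e in raw_events if e["clicked"])
print(dict(clicks))  # {'a9': 1, 'a3': 1}
```

The point is not the three lines of code but the cluster behind them: at scale, every map and reduce step is east/west traffic, which is why the refining process ultimately lands on the network.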
- A Working Definition of SDN: When I talk to people about SDN, I simply refer to it as the implementation of the controller methodology to leverage the merchant silicon curve. Specific to Plexxi, it also leverages a photonic performance and cost curve. Software is the binding agent between the electrical and photonic domains in a Plexxi switch. In terms of DevOps, the controller becomes the point of integration for metadata from the world outside the network. By design the network was built to be self-sufficient; I call this the solution to the Internet problem of reachability and unknown state. That was great for the era from 1974-1994, but we are not really solving those problems anymore. We should not constrain ourselves to point-to-point engineering; we should open ourselves up to the world of diversity and plenty. To become fused to the corporation, the refining process of data must begin to reflect the workflows of the corporation. This is why we need a new network, not merely a new set of tools for working on the archeological leftovers of a prior era. We need to build the network to be a pool of resources that can be machine orchestrated. It is at this point, and only at this point, that the workflow of the corporation changes and a new cost paradigm can be realized.
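The working definition above can be sketched in miniature. This is a hypothetical sketch of my own (class and field names are invented, not any vendor’s API): a controller that ingests workload metadata from outside the network and aggregates it into a capacity plan, so capacity follows the workload rather than the reverse.

```python
# A hypothetical sketch of "controller as the point of integration for
# metadata from outside the network." All names are illustrative.
from dataclasses import dataclass

@dataclass
class WorkloadHint:
    """Metadata from outside the network: who talks to whom, and how much."""
    src_rack: str
    dst_rack: str
    gbps: float

class Controller:
    """Central integration point; computes where capacity should be placed."""
    def __init__(self):
        self.demands: list[WorkloadHint] = []

    def ingest(self, hint: WorkloadHint):
        self.demands.append(hint)

    def plan(self) -> dict[tuple[str, str], float]:
        """Aggregate demand per rack pair; capacity follows the workload."""
        plan: dict[tuple[str, str], float] = {}
        for d in self.demands:
            key = tuple(sorted((d.src_rack, d.dst_rack)))
            plan[key] = plan.get(key, 0.0) + d.gbps
        return plan

c = Controller()
c.ingest(WorkloadHint("rack1", "rack2", 40))
c.ingest(WorkloadHint("rack2", "rack1", 20))
print(c.plan())  # {('rack1', 'rack2'): 60.0}
```

In a self-sufficient 1974-1994 network, nothing like `ingest` exists: the network discovers state only from itself. The controller methodology opens that loop to the rest of the corporation.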
Some Interesting Quotes
I am not stating that I believe, endorse or concur with these quotes, but I did find them interesting.
- “Society is adapting to the universal computing infrastructure—more quickly than it adapted to the electric grid—and a new status quo is taking shape. The assumptions underlying industrial operations have already changed. “Business processes that once took place among human beings are now being executed electronically,” explains W. Brian Arthur, an economist and technology theorist at the Santa Fe Institute.” – Nicholas Carr
- “Over the next three years, 70% of large and mid-sized organizations will initiate major network redesigns to better align inter-datacenter and datacenter-to-edge data flows.” – IDC 12-2014
- “Over the next two years, over 60% of companies will stop managing most of their IT infrastructure, relying on advanced automation and qualified service partners to boost efficiency and directly tie datacenter spend to business value.” – IDC 12-2014
- “These new applications are more than applications, they are distributed systems. Just as it became commonplace for developers to build multithreaded applications for single machines, it’s now becoming commonplace for developers to build distributed systems for data centers.” – Benjamin Hindman
- “From an operator’s perspective it would span all of the machines in a data center (or cloud) and aggregate them into one giant pool of resources on which applications would be run. You would no longer configure specific machines for specific applications; all applications would be capable of running on any available resources from any machine, even if there are other applications already running on those machines. From a developer’s perspective, the data center operating system would act as an intermediary between applications and machines, providing common primitives to facilitate and simplify building distributed applications.” – Benjamin Hindman