What Do VisiCalc and SDN Have in Common?

I was listening to an episode of Planet Money last week about VisiCalc, the first spreadsheet program.  The podcast includes a discussion of the accounting profession before and after the creation of the spreadsheet.  Before VisiCalc, a spreadsheet really was a spreadsheet: a physical sheet of paper.  “If you ran a business, your accountant would put in all your expenses, all your revenues, and you’d get this really detailed picture of how the business worked. But even making a tiny tweak was a huge hassle.”  Teams of accountants and bookkeepers would spend days reworking sheets of paper to maintain the accuracy of the books. Continue reading

Future Generations Riding on the Highways that We Built

When I was in high school and college, I never thought about a career in networking; it was just something I did because it was better than all the other jobs I could find.  I worked at my first networking startup in the late ‘80s and twenty-five years later, I am still working in networking. Continue reading

Talking SDN or Just Plain Next Generation Networking…

Tomorrow in SF, I will be talking about SDN, or as I like to call it, next generation networking, at the Credit Suisse Next Generation Data Center Conference.  It will be a panel discussion and each participant has a few minutes to present their company and their thoughts on market adoption of SDN.  Explaining the next twenty years of networking in fifteen minutes is a challenge, but I have been working with a small slide deck that helps make the point.  Here are those slides (link below).  I posted a variation of these slides a few weeks ago that I used in NYC, but I tailored this deck to the strict fifteen-minute time limit.  I will post more frequently after Plexxi is done at NFD #5 this week and around the time of OFC.

CS Next Gen DC Conference

 

/wrk

Echoes from Our Past #2: The Long Tail

As with my first post about Antietam and the Vacant Chair, I have started to weave some creative writing into my technology and business focused blog.  If it is not for you, please disregard.  I am writing this post from the reading room of the Norman Williams library in Woodstock, Vermont, which was built in 1883.  Outside, the leaves are in full color and there is a chili cook-off contest on the town green.  More than a decade ago, I started writing a book about my experiences reenacting the American Civil War.  I was motivated to write it because I had read Confederates in the Attic; I knew some of the people in that book and had been to the same places.  Writing a book requires time and concentration. Continue reading

No Time to Post

I have had no time to post, but I have been working on a few drafts.  I spent today in Palo Alto at the JP Morgan SDN Conference.  It was a one-day event with a lot of executives from SDN startups as well as established vendors in attendance.  I wrote a few pages of notes and I think I might be able to distill my thoughts down to a post by the end of the week.  Thanks to the JPM team for the invite.

/wrk

Labor Day Weekend Posts

Rather than a long blog post, I wrote three short posts on subjects that seemed to occupy my conversations and email this past week.  Off to SFO again this coming week and right back to Boston as we are hosting customers at our Cambridge headquarters at the end of the week.

/wrk

* It is all about the network stupid, because it is all about compute. *

** Comments are always welcome in the comments section or in private. ** 

Dawn of Multicore Networking

Off to VMworld, flying my favorite airline, Virgin America: blogging, wifi, satellite TV and working, which is just cool when you consider that when I started traveling for business I had a choice of a smoking seat and a newspaper.  I recently posted additional SDN thoughts on the Plexxi blog, which was a follow-up to the last post on the SIWDT blog.

The following is the post that I alluded to last month and have been writing and revising for a few months.   It all started several months ago when I was reading Brad Hedlund’s blog, in which he posted several Hadoop network designs.  I am referencing the post by Brad because it made me stop and think about designing networks.  I have been talking to a lot of people about where networks are going, how to design networks, blah, blah; just click on the “Networking” tab to the right and you can read more than a year’s worth of postings on the subject.  Side note: if you have difficulty falling asleep, reading these posts might serve as a cure.

As a starting point, consider the evolution to multicore from a CPU perspective.  In a single core design, the CPU processes all tasks and events, including many of the background system tasks.   In 2003, Intel was showing the design of Tejas, the next evolution of its single core CPUs, with plans to introduce it in late 2004.  Tejas was cancelled because of the heat caused by the extreme power consumption of the core.  That was the point of diminishing returns in the land of CPU design. At the time, AMD was well down the path of a multicore CPU and Intel soon followed.

From a network design perspective, I would submit that the single core CPU is analogous to the current state of how most networks are designed and deployed.  Networks are a single core design in which traffic flows to a central aggregation or spine layer to be switched to other parts of the network.  Consider the following example (a quick sketch of the arithmetic follows the list):

  • 100,000 2x10G Servers
  • Over-Subscription Ratio of 1:1
  • Need 2,000,000 GbE equivalent = 50,000 x 40 GbE
  • Clos would need additional ~100,000 ports
  • Largest 40 GbE aggregation switch today is 72 ports
  • 96 ports coming soon in 2U, at 1-2 kW
  • 100k servers = 1,500 switches
  • 1.5-3.0 MW – just for interconnection overhead
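
For what it is worth, the arithmetic in that list can be reproduced in a few lines of Python.  Every input below is an assumption lifted from the bullets, not vendor data, so treat it as a sketch:

```python
# Back-of-the-envelope sizing for the example above. Every input is an assumption
# taken from the bullet list, not a vendor specification.

servers = 100_000
server_bw_gbe = servers * 2 * 10              # 2 x 10 GbE per server = 2,000,000 GbE

ports_40g = server_bw_gbe // 40               # 50,000 x 40 GbE at 1:1 oversubscription
clos_extra_ports = 100_000                    # additional ports a Clos fabric needs
total_ports = ports_40g + clos_extra_ports    # ~150,000 x 40 GbE ports

ports_per_switch = 96                         # "96 ports coming soon in 2U, at 1-2 kW"
switches = total_ports / ports_per_switch     # ~1,560, i.e. roughly 1,500 switches

low_mw, high_mw = switches * 1 / 1000, switches * 2 / 1000   # at 1-2 kW per switch
print(f"{total_ports:,} x 40 GbE ports -> ~{switches:,.0f} switches, "
      f"~{low_mw:.1f}-{high_mw:.1f} MW just for interconnect")
```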

This network design results in what I call the +1 problem.  The +1 problem is reached when the network requires one additional port beyond the capacity of the core or aggregation layer.

In contemporary leaf/spine network designs, 45 to 55 percent of the bandwidth deployed is confined to a single rack.  Depending on the oversubscription ratio this can be higher, such as 75 percent, and there is nothing strange about this range, as network designs from most equipment vendors yield the same results.  This has been the basis of the networking design rule: buy the biggest core you can afford and scale it up to extend the base to encompass as many device connections as possible.
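
A minimal sketch of where those percentages come from, assuming a generic leaf/top-of-rack switch with hypothetical port counts; the function simply compares server-facing bandwidth to uplink bandwidth:

```python
# Why 45-55% (or more) of deployed bandwidth stays inside the rack.
# The port counts are hypothetical examples, not any particular vendor's switch.

def in_rack_fraction(downlink_gbps: float, uplink_gbps: float) -> float:
    """Fraction of a leaf switch's bandwidth that faces servers, i.e. stays in-rack."""
    return downlink_gbps / (downlink_gbps + uplink_gbps)

print(in_rack_fraction(48 * 10, 12 * 40))  # 1:1 oversubscription -> 0.5 (~50% in-rack)
print(in_rack_fraction(48 * 10, 4 * 40))   # 3:1 oversubscription -> 0.75 (75% in-rack)
```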

Multicore CPUs come in different configurations.  A common configuration is what is termed symmetric multiprocessing (SMP).  In an SMP configuration, CPU cores are treated as equivalent resources that can all work on all tasks, while the operating system manages the assignment and scheduling of those tasks.  In the networking world, we have provided the same kind of structure by creating work group clusters for Hadoop, HPC and low latency (LL) trading applications.  The traditional single core network design that has been in place since IBM rolled out the mainframe has occasionally been augmented over the years with additional networks for minicomputers, client/server LANs, AS/400s and today for Hadoop, low latency and HPC clusters.   Nothing is really new here, because eventually it all ties back into and becomes integrated with the single core network design.  No real statistical gain or performance improvement is achieved; scaling is a function of building wider in order to build taller.

Multicore CPU designs offer significant performance benefits when they are deployed in asymmetric multiprocessing (AMP) applications.  With AMP, some tasks are bound to specific cores for processing, freeing other cores from overhead functions.  That is how I see the future of networking.  Networks will be multicore designs in which some cores (i.e. network capacity) will be orchestrated for HPC, priority applications and storage, while other cores will address the needs of more mundane applications on the network.
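
To make the analogy concrete, here is an illustrative-only sketch, with hypothetical core names, capacities and workloads, of what binding workloads to dedicated network cores might look like; it is not meant to describe any product:

```python
# Illustrative sketch of the AMP analogy: bind specific workload classes to dedicated
# "network cores" (slices of capacity) and leave a shared core for everything else.
# Core names, capacities and workloads are made up for illustration.

from dataclasses import dataclass, field

@dataclass
class NetworkCore:
    name: str
    capacity_gbps: int
    workloads: list = field(default_factory=list)

cores = {
    "hpc":     NetworkCore("hpc", 4_000),      # narrow base, high bisectional bandwidth
    "storage": NetworkCore("storage", 2_000),
    "default": NetworkCore("default", 1_000),  # broad base, mundane applications
}

def bind(workload: str, wanted_core: str) -> NetworkCore:
    """Pin a workload to its dedicated core if one exists, else to the shared core."""
    core = cores.get(wanted_core, cores["default"])
    core.workloads.append(workload)
    return core

print(bind("hadoop-cluster", "hpc").name)   # -> hpc
print(bind("email", "campus").name)         # -> default
```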

The future of the network is not more abstraction layers, interconnect protocols, protocols wrapped in protocols and riding the curve of Moore’s Law to build bigger cores and broader bases.   That was the switching era.  The new era is about multicore networking.  We have pretty much proven that multicore processing is an excellent evolution in CPU design.  Why do we not have multicore networking?  Why buy a single core network design and use all sorts of patches, tools, gum, paperclips, duct tape and widgets to jam all sorts of different applications through it?  I think applications and workload clusters should have their own network cores.  There could be many network cores.  In fact, cores could be dynamic.  Some would be narrow at the base but have high amounts of bisectional bandwidth.  Some would be broad at the base but have low bisectional bandwidth. Cores can change; cores can adapt.

I think traditional network developers are trying to solve the correct problems, but under the limitations of the wrong boundary conditions.  They are looking for ways to jam more flows, or to guarantee flows with various abstractions and protocols, into the single core network design.  We see examples of this every day.  Here is a recent network diagram from Cisco:

When I look at a diagram like this, my first reaction is to ask: what is plan B?  Plan B in my mind is a different network.  I fail to see why we as the networking community would want to continue to operate and develop on the carcass of the single core network if it is broken.   In other words, if the single core network design is broken, nothing we develop on it will fix it.  A broken network is a broken network.   Let the single core network fade away and start building multicore networks.  As always, it is possible that I am wrong and someone is working on the silicon for a 1,000 port terabit ethernet switch.  It is probably a stealth mode startup called Terabit Networks.

/wrk

* It is all about the network stupid, because it is all about compute. *

** Comments are always welcome in the comments section or in private. **

SDN Thoughts Post CLUS and Pre-Structure

At present I am on a bumpy VA flight over Lake Erie inbound to SFO for the GigaOM Structure conference. My employer will soon be a little less stealthy, as we will have a new web page mid-week around the Structure conference. I plan to do some blogging during or after Structure, but before we get to those post(s) I thought I would offer a few SDN thoughts post CLUS. I think what SDN is or will become is still unknown, and I struggle with the need to find the killer app. SDN has not been defined, and I do not think it has been agreed upon. I know that I have some different views on SDN. I actually do not like the term SDN, as networks have always been software defined and I do not think it is as easy as a twenty-minute whiteboard talk about separating the control plane from the data path. I am not sure what to offer up as a better term or acronym for SDN, but here are a few thoughts on SDN that our CEO is presenting at a conference prior to Structure:

– Computation and algorithms; in a word, math. That is what SDN is.

– Derive network topology and orchestration from the application and/or the tenant.

– Ensure application/tenant performance and reliability goals are accomplished.

– Make network orchestration concurrent with application deployment. This is what SDN is for.

– A controller has the scope, the perspective and the resources to efficiently compute and fit application instances and tenants onto the network.
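
A toy sketch of that last point, with everything about the network made up for illustration: a controller with a global view greedily fitting application instances onto whatever capacity still has headroom.

```python
# Toy controller placement: fit application instances onto network capacity.
# The link names and capacities are hypothetical; a real controller would solve a far
# richer optimization problem. This only shows the shape of the computation.

from typing import Optional

links = {"leaf1-spine": 40, "leaf2-spine": 40, "leaf1-leaf2": 10}  # free Gbps per link

def fit(instance: str, demand_gbps: int) -> Optional[str]:
    """Place an instance on the link with the most headroom, or reject it."""
    link, free = max(links.items(), key=lambda kv: kv[1])
    if free < demand_gbps:
        return None                      # no fit: the controller must re-plan or reject
    links[link] -= demand_gbps
    return link

print(fit("tenant-a/db", 25))    # -> leaf1-spine
print(fit("tenant-b/web", 30))   # -> leaf2-spine
```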

I know this is considerably less than 93 slides, so I am going to look to augment the previous five points in the future. At Structure, I will be looking at what others are doing and listening to the broad ecosystem of competitors, customers and analysts. If you are out at Structure, feel free to stop by the Plexxi table and tell me I am wrong. I look forward to the discussion at Structure or at one of the dinners I plan to attend. Side note: this is a test of MarsEdit and the ability to post from a cross country flight.

/wrk

Service Provider Bandwidth: Does it Matter?

Let us start with a question: does service provider bandwidth matter?  Perhaps a better question would be: is service provider bandwidth a meaningful problem to work on?  I think it does matter, but I am not certain it is meaningful.  This post is not going to be a scientific study and it should not be construed as the start of a working paper.  This post is really a summary of my observations as I try to understand the significance of the messaging in the broader technology ecosystem.  I sometimes call these posts framing exercises.  I am really trying to organize and analyze disparate observations, urban myths and the inductive logical failings of doctrine.

Frame 1: Bandwidth Pricing Trend

There is no debate on this point; the price trend of bandwidth is more for less.  Bandwidth is deflationary until someone shows me a data set that proves it is inflationary.  I agree that bandwidth is not ubiquitous; it is unevenly distributed, and that falls into the category of: life is not fair; get used to it.  In areas in which there is a concentration of higher wage earning humans organized into corporations with the objective of being profit centers, there seems to be an abundance of bandwidth and the trend in bandwidth is deflationary.  Here are a few links:

  • Dan Rayburn, “Cost To Stream A Movie Today = Five Cents; in 1998 = $270”: “In 1998 the average price paid by content owners to deliver video on the web was around $0.15 per MB delivered. That’s per bit delivered, not sustained. Back then, nothing was even quoted in GB or TB of delivery as no one was doing that kind of volume when the average video being streamed was 37Kbps. Fast forward to today where guys like Netflix are encoding their content at a bitrate that is 90x what it was in 1998.  To put the rate of pricing decline in terms everyone can understand, today Netflix pays about five cents to stream a movie over the Internet.”
  • GigaOm: See the 2nd Chart.
  • TeleGeography: See the chart “MEDIAN GIGE IP TRANSIT PRICES IN MAJOR CITIES, Q2 2005-Q2 2011”

Frame 2: Verizon Packet/Optical Direction

Here is a presentation by Verizon at the Infinera DTN-X product briefing day.  The theme of the presentation is that the network is exhausted due to 4G LTE, video, FTTx, etc., and that this is driving the need for more bandwidth, including 100G in the metro, 400G and even terabit ethernet in the core.  I have heard these arguments for terabit ethernet before; I am firmly in the minority that sees it as a network design/traffic engineering problem, not a bandwidth problem to be solved.  It took the world fifteen years to move from 1G to 10G; I wonder how long it will take to get to terabit ethernet.

Frame 3: Are the design assumptions incorrect?

When I look at the network, I think of it as a binary solution set: it can connect and it can disconnect.  For many decades we have been building networks based on the wrong design assumptions; I have been posting on these errors in prior posts.  Here is a link to a cloud hosting company.  I know this team and I know their focus has been the highest IOPS in their pod architecture.  We could use any cloud provider to make the point, but I am using Cloud Provider USA because of the simplicity of their pricing page.  All a person has to do is make five choices: DC location, CPU cores, memory, storage and IP addresses.  Insert credit card and you are good to go.  Did you notice what is missing?  Please tell me you noticed what is missing; of course you did.  The sixth choice is not available yet: network bandwidth, the on or off network function.  The missing value is not the fault of the team at Cloud Provider USA; it is the fault of those of us who have been working in the field of networking.  Networking has to be simple: on or off, and at what bandwidth.  I know it is that simple in some places, but my point is that it needs to be as easily configured and presented in the same manner as the DC-CPU-Memory-Storage-IP purchase options are presented on the Cloud Provider website.  My observation is that the manner in which we design networks results in a complexity of design that is prohibitive to ease of use.
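
To make the point concrete, here is a hypothetical sketch of such a provisioning request with the missing sixth knob added; the field names are invented for illustration and this is not Cloud Provider USA’s actual order form or API:

```python
# Hypothetical provisioning request: the five choices most providers expose today,
# plus the sixth one this post argues for. Field names are invented for illustration.

provisioning_request = {
    "datacenter": "us-east",
    "cpu_cores": 8,
    "memory_gb": 32,
    "storage_gb": 500,
    "public_ips": 2,
    # The missing sixth choice: the network as a simple on/off and bandwidth knob.
    "network": {"connected": True, "bandwidth_mbps": 1000},
}

print(provisioning_request["network"])
```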

Frame 4: Cisco Cloud Report

I think most people have read Cisco’s Cloud Report.  Within the report there are all sorts of statistics and charts that go up and to the right.  I want to focus on a couple of points they make in the report:

  • “From 2000 to 2008, peer-to-peer file sharing dominated Internet traffic. As a result, the majority of Internet traffic did not touch a data center, but was communicated directly between Internet users. Since 2008, most Internet traffic has originated or terminated in a data center. Data center traffic will continue to dominate Internet traffic for the foreseeable future, but the nature of data center traffic will undergo a fundamental transformation brought about by cloud applications, services, and infrastructure.”
  • “In 2010, 77 percent of traffic remains within the data center, and this will decline only slightly to 76 percent by 2015.  The fact that the majority of traffic remains within the data center can be attributed to several factors: (i) Functional separation of application servers and storage, which requires all replication and backup traffic to traverse the data center (ii) Functional separation of database and application servers, such that traffic is generated whenever an application reads from or writes to a central database (iii) Parallel processing, which divides tasks into multiple smaller tasks and sends them to multiple servers, contributing to internal data center traffic.”

Here is my question based on the above statistics: if 77 percent of traffic stays in the data center, what is the compelling reason to focus on the remaining 23 percent?

Frame 5: Application Awareness and the Intelligent Packet Optical Conundrum

I observe various transport oriented equipment companies, as well as service providers (i.e. their customers) and CDN providers (i.e. quasi-service-provider competitors), discussing themes such as application awareness and intelligent packet optical solutions.  I do not really know what is meant by these labels.  They must be marketing terms, because I cannot find the linkage between applications and IP transit, lambdas, optical bandwidth, etc.  To me a pipe is a pipe is a pipe.

The application is in the data center; it is not in the network.  Here is a link to the Verizon presentation at the SDN Conference in October 2011.  The single most important statement in the entire presentation occurs on slide 11: “Central Offices evolve to Data Centers, reaping the cost, scaling and service flexibility benefits provided by cloud computing technologies.”  In reference to my point in Frame 3, networks and the network elements really do not require a lot of complexity.  I would argue that the dumber the core, the better the network.  Forget about being aware of my applications; just give me some bandwidth and some connectivity to where I need to go.  Anything more than bandwidth and connectivity and I think you are complicating the process.

Frame 6: MapReduce/Application/Compute Clustering Observation

Here is the conundrum for all the people waiting for the internet to break and for bandwidth consumption to force massive network upgrades.  When we type a search term into a Google search box, it generates a few hundred kilobytes of traffic upstream to Google and downstream to our screen, but inside Google’s data architecture a lot more traffic is generated between servers.  That is the result of MapReduce and application clustering and processing technologies.  This is the link back to the 77% statistic in Frame 4.  Servers transmitting data inside the data center really do not need to be aware of the network.  They just need to be aware of the routes, the paths to other servers or devices, and they do not need a route to everywhere, just to where they need to go.
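
A rough illustration of that amplification, with the request size, fan-out and per-server payloads chosen purely as assumptions to make the shape of the argument visible (they are not measurements from Google):

```python
# Illustrative only: a small external request fans out into far larger east-west traffic.
# All numbers are assumptions picked to show the shape of the effect, not measurements.

external_kb = 300                 # query up plus results page down, a few hundred KB
workers = 1_000                   # servers the request is scattered across
scatter_kb_per_worker = 2         # request shipped to each worker
gather_kb_per_worker = 50         # partial results shipped back for aggregation

internal_kb = workers * (scatter_kb_per_worker + gather_kb_per_worker)
print(f"external: {external_kb} KB, internal: {internal_kb:,} KB, "
      f"ratio ~{internal_kb / external_kb:.0f}x")
```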

Frame 7: What we value is different from what we talk about

Take a look at the chart to the left.  I put a handful of public companies on the list and I am not including balance sheets, debt and other financial metrics.  All I am pointing out is that companies focused on the enterprise (i.e. the data center) enjoy higher margins and richer valuations than companies that focus on the service provider market.  Why is that true?  Is it a result of the 77% problem?  Is it a result of the complexity of the market requirements imposed by the service provider customer base?  Is it a result of the R&D required to sell to the service provider market?

Frame 8: Do We Need a New Network Architecture?

I have been arguing that we need a new network architecture for some time, but I think the underlying drivers will come from unexpected places.  It was not long ago that we had wiring closets, and the emergence of the first blade servers in the 2001-2002 time period started to change how data centers were built.  When SUN was at its peak, it was because they made the best servers.  It was not long ago that the server deployment philosophy was to buy the most expensive, highest performance servers from SUN that you could afford; if you could buy two, that was better than one.  The advent of cheap servers (blades and racks), virtualization and clustering applications changed the design rules.  Forget about buying one or two high end servers, buy 8 or 10 cheap ones.  If a couple go down on the weekend, worry about it on Tuesday.  I think the same design trend will occur in the network.  It will start in the DC and emerge into the interconnect market.

/wrk

* It is all about the network stupid, because it is all about compute. *

** Comments are always welcome in the comments section or in private. ** 

Going to start blogging again…

A little less than four years after I stopped blogging, I am going to start blogging again.  After a lot of consideration, the final motivation came from Andy Bechtolsheim.  I did not speak with Andy.  I did not even email with Andy.  I just watched some of his videos on YT and decided I needed to rekindle the passion for tech, networking and innovation.

/wrk