More than five years ago, James Hamilton of Amazon fame posted on his personal blog a presentation he gave about networking called Datacenter Networks Are in My Way. Here is a link to his post, and my last check showed that the slides for the presentation are still available. I copied four of the slides into the thumbnail to the left to save the reader a click out. Continue reading
A few weeks ago I posted a blog on what I have experienced over the past four years at Plexxi. That post led the Packet Pushers team of Ethan and Greg to reach out, and we recorded a podcast about the changing role of the network engineer and IT silos. In preparation for the podcast, my colleagues Mat Mathews, Mike Welts and I collaborated on the following, which I edited a final time after the podcast. This post started as a dialog about what we are seeing in the market, what our customers and prospects want to engage about, how we position Plexxi to the network engineer, and where we see this all going now that market clarity has begun to emerge. Continue reading
Today is my four-year anniversary at Plexxi. I was in New York the week before Christmas to attend an investor conference focused on security and networking. It was a two-day trip that I expected to go by quickly, as it was full of meetings and dinners. A colleague and I met with a number of crossover investors and analysts, as well as colleagues at compatriot companies. In our very first meeting an investor asked, “Four years in, how has it turned out compared to how you thought it would go when you started?”
The best quote in this article is “Everything made sense except that nobody gives a shit.” When I think about trends in the networking space over the past five years, that is how I would summarize most of the efforts labeled “disruptive” or “revolutionary.” When I can, I attend various local Meetups, which are like quasi-sales calls. I get to hear end-users talking about what they are working on, what issues they are facing, etc. Meetups are kind of like fishing: some days they are a complete waste of time, and other days you catch a lot of fish, and in my world information is fish. I like to hear what end-users are saying, what they are working on and what keeps them up at night.
35,000 feet over Utah, one glass of scotch down, earbuds in, and my internal notes sent; it is time to write some VMworld 2015 impressions for the blog. In no particular order, here they are: Continue reading
A few weeks ago I spent the morning in New York City presenting to a room full of people about networking. Networking is typically not a big draw on a Friday in NYC during August, but the turnout was great and the morning was quite pleasant. Continue reading
It is Sunday morning and I am on a 7am flight to SFO from Boston. When I left the house, no person was stirring; not even the dogs. Sipping a morning mimosa or two on the flight to SFO, I read this article that I saw tweeted. The article is about work-life balance in the eyes of Pat Gelsinger and how tech companies overwork their employees. I found one quote very applicable to me. Continue reading
Note to readers: this is a self-promotional post. On August 14, Plexxi will be hosting a morning discussion in New York City at The Cornell Club, located at 6 East 44th Street. I will be the speaker for Plexxi. We will serve some food and talk networking for a couple of hours.
The primary agenda will cover how to transition legacy networks to hyperconverged, rack-scale systems using a controller architecture, an approach often referred to as SDN.
If you are interested in attending, please register here.
When should the incumbent pay attention to an upstart competitor?
A couple of weeks ago Arista Networks (ANET) reported their quarterly numbers and they were fantastic. No need to sugarcoat the numbers; they were excellent. A decade ago I worked at Ciena, and I remember when we started to see a new competitor in our market called Infinera. At the time Infinera came to market, Ciena was recording ~$100-120M quarters. Within a year of seeing them, Infinera was recording $8-20M quarters, which had grown to $40M a quarter by mid-2006. Running a $500M business is vastly different than running a $150M business, which were the annual revenue run rates for CIEN and INFN in 2006. Looking at the Arista results, the thought occurred to me to look back a decade. Continue reading
I am often asked what my opinion is of Cisco. Is it going out of business because of white box? Should they buy Arista? Should they buy NetApp or EMC or Citrix or RedHat? The news today that HP is going to break into two companies tells me that we have reached a point where it is difficult to grow large-cap tech companies that have multiple business units. No CEO of a large-cap tech company wants to be the AOL/Time-Warner of this market era. A few thoughts on the subject of large-cap tech companies. Continue reading
I had a great week at VMworld. The show was fantastic for Plexxi as we recorded 6x as many leads as last year, but the friction simmering in networking has emerged from behind closed doors and spilled out into full public view. Here are a few links if you missed what I am referring to: Continue reading
Earlier today I read this post titled “SDN is Not a Technology, It’s A Use Case.” Shortly after, I found myself in a conversation with one of our lead algorithmic developers. We were discussing recent developments in the deployment of photonics inside the data center and papers we had read from Google researchers. At Plexxi, we have already begun thinking about what our product architecture will look like in 3-5 years. In the conversation I was having with the algorithmic developer, it occurred to me that we sometimes become so immersed in what we are doing on a daily, weekly, quarterly basis that we lose track of whether we are working on a project or building a company.
I had every intention of producing several long posts about ONS 2013, but events in my hometown coupled with a busy meeting schedule at ONS resulted in not finding a lot of time to focus on writing. I think my colleague Mike Bushong summed up many of my thoughts here, and I would like to add a few other thoughts I have about SDN after ONS 2013.
This week at OFC, Plexxi and Calient are showing the power of SDN and optics. The idea of using some sort of optical or hybrid optical architecture for the data center has been pursued for years. Here is a link to a 2010 paper called Helios: A Hybrid Electrical/Optical Switch Architecture for Modular Data Centers, written by a number of people, but the most notable author is Amin Vahdat.
I was having a DM conversation (140 characters at a time) the other day with a network architect. We were discussing the reluctance of networking people, especially at the CxO or leadership level, to do something different. Personally, I have heard from ~50 people at the leadership level over the past 18 months who state they want to do something different with their network infrastructure. The network has not changed in twenty years, and now the time has come to change the network. What is the result of all the pent-up desire to do something different? More network incrementalism; at least in the near term. The DM conversation I was having was around the subject of getting network people to do something different. Why do people say they want to make big changes and then fail to seize the day? That is the subject of this post.
Last year, I wrote a long post on doctrine. I was reminded of that post three times this week. The first was from a Plexxi sales team who was telling me about a potential customer who was going to build a traditional switched hierarchical network as a test bed for SDN. When I asked why they were going to do that, they said the customer stated it was the only method (i.e. doctrine) his people had knowledge of, and it was just easier to do what they have always done. The second occurrence was in a Twitter dialog with a group of industry colleagues across multiple companies (some competitors!) in which one of the participants referenced doctrine as a means for incumbents to lock out competitors from markets. The third instance occurred at the Credit Suisse Next-Generation Datacenter Conference when I was asked what will cause people to build networks differently. Here are my thoughts on SDN, networking, doctrine, OPEX and building better networks.
I will be on the road a lot over the next few weeks for Plexxi, and Plexxi will be engaging in a host of events as well. Here is a list of events:
- March 5: Credit Suisse Datacenter Conference in SF. I will be on an SDN panel with two friends from Big Switch.
- March 7: Plexxi presents at Network Field Day #5. I will be in NYC/NJ that day presenting to customers.
- March 13: I am on the Cloud & Software-Defined Panel at the Pacific Crest Technology Forum event in Boston.
- March 15: Plexxi and Boundary at SDN Central Demo Fridays.
Yesterday HP announced some SDN products, including a controller. If you had read my SDN Working Thoughts #3 post, then you already knew this data point. I have many questions about this announcement, starting with why would they announce an OpenFlow-based controller when you can get one from Big Switch Networks (BSN)? I am sure there is a smart answer, but that is not my point. In addition to HPQ, IBM announced a controller based on the NEC controller. My point is there has been, and continues to be, a lot of development and design of controllers going on. My hypothesis is that the controller architecture will play a role in where the battle for SDN market share will be won and lost in the coming years, and simplification of the market into “separating the data plane from the control plane” is not specific enough and does not encompass a broad enough data set. I have written several times before that SDN is more than APIs and reinventing the past thirty years of networking in OpenFlow-ese.
I think a person’s perspective on the controller is directly related to how they see the network evolving and how their company wants to run its business. There is no standalone controller market. If I were to summarize the various views of the controller, I would say that incumbent vendors view a third-party controller as a threat and need to provide a controller as a hedge in their portfolio in case it becomes a strategic point of emphasis. Incumbents really do not know what to do with a controller in terms of their legacy business, which is why they market a controller as some sort of auto-provisioning, stat-collecting NMS on steroids. It will enable you to buy more of their legacy stuff, which for HPQ after today’s guidance cut may not be the case. The emerging SDN companies view the controller as a point of contention for network control. All the companies in the market-share category labeled “other” or “non-Cisco” view the controller as a means to access the locked-in market share of Cisco. In the past, I would have told you that control planes have enormous monetary value if you can commercialize them inside customers. Cisco did this with IGRP, IOS and NX-OS. Ciena did this with the CoreDirector. Sonus failed to do this. Ipsilon failed to do this. Does anyone remember the 56k modem standard battle between US Robotics and the rest of the world who were working on the 56k standard, and who won that market battle? The question over the next year or two is how many controllers become commercialized in the marketplace and what these controllers are doing. I think there is a difference between controllers doing network services and controllers providing network orchestration based on application needs.
The following quote is from Jim Duffy’s article in Network World on HP’s controller announcement:
“HP’s Virtual Application Networks SDN Controller is an x86-based appliance or software that runs on an x86-based server. It supports OpenFlow, and is designed to create a centralized view of the network and automate network configuration of devices by eliminating thousands of manual CLI entries. It also provides APIs to third-party developers to integrate custom enterprise applications. The controller can govern OpenFlow-enabled switches like the nine 3800 series rolled out this week, and the 16 unveiled earlier this year. Its southbound interface relays configuration instructions to switches with OpenFlow agents, while it’s northbound representational state transfer interfaces — developed by HP as the industry mulls standardization of these interfaces — relays state information back to the controller and up to the SDN orchestration systems.”
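The pattern Duffy describes is generic to most SDN controllers: applications speak to a northbound interface, and the controller expands that intent into per-switch flow entries pushed over a southbound channel, replacing thousands of manual CLI entries. A minimal sketch of that split, with all class and method names hypothetical (this is not HP's API or any real product's), might look like:

```python
# Toy sketch of the northbound/southbound controller pattern. Applications
# call a northbound method; the controller translates intent into per-switch
# flow entries held in a centralized view. All names are hypothetical.

class ToyController:
    def __init__(self):
        self.flow_tables = {}   # switch_id -> list of (match, action) entries
        self.topology = set()   # centralized view: set of known switch ids

    def register_switch(self, switch_id):
        """Southbound: a switch connects and joins the central topology view."""
        self.topology.add(switch_id)
        self.flow_tables.setdefault(switch_id, [])

    def install_path(self, src, dst, switches):
        """Northbound: an application requests a path; the controller expands
        it into one flow entry per switch along the path."""
        for sw in switches:
            if sw not in self.topology:
                raise ValueError(f"unknown switch {sw}")
            self.flow_tables[sw].append(((src, dst), "forward"))


controller = ToyController()
for sw in ("s1", "s2", "s3"):
    controller.register_switch(sw)
controller.install_path("10.0.0.1", "10.0.0.2", ["s1", "s2"])
```

The point of the sketch is only the shape of the abstraction: the value sits in the translation layer between application intent and device state, which is why the orchestration question above the controller matters so much.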
Reading Duffy’s description, I think the SDN orchestration system (is that application orchestration?) is more valuable than the controller he describes, but that is a side discussion. I also took the time to read this blog post from HP. Much of this controller architecture discussion has been on my mind, and in my day-to-day work conversations, for the past few months. It seems a day cannot go by without a conversation on this matter. I have no conclusions to offer in this post, so if you are looking for one please stop reading. The point of this post is that controller architecture, controller design and how SDN will evolve are all in process, and I think it is a little early to be declaring the availability of solutions that offer marginal incremental value at best. The evolution of the controller thought process can be summarized at a high level by the following:
- Wired Article from Apr. 2012
- Urs Hoezle’s presentation from ONS in 2012
- Google A Software Defined WAN Architecture (81 Slides) from ONS 2012
- Martin’s Blog
From Martin’s blog, in the section on General SDN Controllers:
“The platform we’ve been working on over the last couple of years (Onix) is of this latter category. It supports controller clustering (distribution), multiple controller/switch protocols (including OpenFlow) and provides a number of API design concessions to allow it to scale to very large deployments (tens or hundreds of thousands of ports under control). Since Onix is the controller we’re most familiar with, we’ll focus on it. So, what does the Onix API look like? It’s extremely simple. In brief, it presents the network to applications as an eventually consistent graph that is shared among the nodes in the controller cluster. In addition, it provides applications with distributed coordination primitives for coordinating nodes’ access to that graph.”
Regarding ONIX, here’s a brief summary of the architecture, but you can read a paper on it here; note who the authors are and where they work:
- Centralized approach. A central controller configures switches using OpenFlow along with some lower-level extensions for more fine-grained control.
- Default topology is computed using legacy protocols (e.g. OSPF, STP, etc.), or static configuration.
- Collects and presents a unified topology picture (they call it a network information base – NIB) to Apps that run on top of it.
- Multiple controllers (residing in Apps) are allowed to modify the NIB by requesting a lock to the data structure in question.
- Scalability and Reliability:
- Cluster + Hierarchy of Onix instances, but NIB is synchronized across all instances (e.g. via a distributed database). For the hierarchical design, there is further discussion on partitioning the scope and responsibilities of each Onix instance.
- Transactional database for configuration (e.g. setting a forwarding table entry), DHT for volatile info (e.g. stats). There is a lot of focus on database synchronization and design.
- Example of “apps” mentioned in the paper:
- Security policy controller
- Distributed Virtual Switch controller
- Multi-tenant virtualized datacenter (i.e. NVP)
- Scale out BGP router
- Flexible DC architectures like Portland, VL2 and SEATTLE for large DCs
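The core of the architecture summarized above is the NIB: a shared graph of network entities that the apps riding on top of Onix read and modify, with apps requesting a lock on the data structure in question before changing it. A minimal sketch of that idea, assuming a simple in-memory graph and a single coarse lock standing in for Onix's distributed coordination primitives (the real NIB is an eventually consistent graph replicated across a controller cluster), might look like:

```python
import threading

# Minimal sketch of a Network Information Base (NIB): a shared collection of
# network entities (switches, ports, links) that multiple controller apps read
# and modify. One lock stands in for Onix's distributed coordination
# primitives; the real system replicates this state across a cluster.

class NIB:
    def __init__(self):
        self._lock = threading.Lock()
        self.entities = {}  # entity_id -> attribute dict

    def update(self, entity_id, **attrs):
        """An app acquires the lock before modifying an entity, mirroring the
        'request a lock on the data structure in question' model above."""
        with self._lock:
            self.entities.setdefault(entity_id, {}).update(attrs)

    def snapshot(self):
        """Return a consistent copy of the topology for an app to read."""
        with self._lock:
            return {k: dict(v) for k, v in self.entities.items()}


nib = NIB()
nib.update("switch-1", status="up", ports=48)
nib.update("link-1-2", src="switch-1", dst="switch-2", bw_gbps=10)
view = nib.snapshot()
```

The design choice worth noticing is that the apps listed in the paper (security policy, DVS control, NVP, scale-out BGP) are all just readers and writers of this one graph; the hard problems are in keeping the graph synchronized at scale, not in the API itself.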
Combining the info from multiple sources, Google uses ONIX as a network OS (see the link to the ONIX paper above). ONIX appears to be Nicira’s closed-source version of NOX, and both Nicira and Google use it. NEC has something called Helios that involves OpenFlow, which, as noted above, was OEMed by IBM. I am not sure about HPQ and their recent controller announcement, but I think it serves us well to understand the history of the ONIX architecture.
- ONIX users think that fast failover at the switch level while maintaining application requirements is a hard problem to solve. They think it is better to focus on centralized reconfiguration in response to network failures.
- ONIX synchronizes state only at the ONIX controller
- ONIX wants to use multiple controllers writing to the network information base interface and probably to any table in any switch
Is ONIX a direction for some OpenFlow evolution or a design point? I think one of the early visions for OpenFlow and ONIX was for it to become a cloud OS, which it has yet to become, but others are trying. The evolution of OF/ONIX vision looks something like this:
- Build a fabric solutions company with software and hardware, which is largely about controlling physical switches with OpenFlow (Read NOX paper here)
- Build a commercial controller (ONIX) and sell it as a platform product to a community of applications developers
- Build a network virtualization application (multi-tenancy through overlays…this is the part where Nicira renames ONIX to NVP?) that happens to embed their controller (formerly ONIX). Control the forwarding table with OpenFlow and every other aspect of the overlay implementation using the OVSDB protocol talking to OVS (it is largely about controlling virtual switches with only a pinch of OpenFlow).
- Nicira purchased by VMware for its general expertise in SDN and for future applications of the technology assets (VMware today ships a virtualization/overlay solution using VXLAN that does not include any Nicira IP).
It will be interesting over the next year or so to see how the architecture of the controller evolves. I wrote about some of this in the SDN Working Thoughts #3 post. I think we are coming to an understanding that there are variations to just running a controller in band with the data flows. I think we will conclude that having a controller act as a session border control device, translating between the legacy protocol world and the OpenFlow world, is also a non-starter, but this is the current hedge strategy of most incumbent vendors. As the world of SDN evolves, we will look back and see the path to what SDN has become by treating the failures as proofs along the way. The industry will solve the scaling and large-state questions, but I think the solutions will be shown to exist closer to the hardware (i.e. network) than most envision in the pure software-only view.
In a prior post I made a reference to an article that was partially inspired by a post by Pascale Vicat-Blanc on the Lyatiss blog. The Lyatiss team has been working on a cloud language for virtual infrastructure modeling. In particular, it generalizes the Flowvisor concept of virtualizing the physical network infrastructure to include application concepts. I am not sure of the extent of their orchestration goals. Do they expect Cloudweaver to spin up the VMs and storage, place them on specific servers, configure the network to satisfy specific traffic engineering constraints, and finally tear down the VMs? I am not sure. With Nicira now part of VMware, what is the future for NOX/ONIX, and will other companies be innovators or implementors?
There is another potential market evolution to consider when we think about the controller. The silicon developers are looking to develop chips that disaggregate servers into individual components. The objective is to make the components of the server, especially the CPU, upgradable. Some people have envisioned this type of compute cluster being controlled by OpenFlow, but I think that is unlikely. Network protocols will be around for a very long time, but putting that aside, the question is what does this type of compute clustering do for the network? How much server-to-server traffic stays in the rack / cluster / pod / DC? I am not sure how much of this evolution will have to do with OpenFlow, but what I do know is that this type of compute evolution will have a lot to do with SDN, if you believe that SDN is about defining network topologies based on the needs of the applications that use the network.
In a true representation of the title, this post is just some working thoughts on SDN with hypotheses to be proven. Comments and insights welcome…
From my perspective, I thought the show was great. We had a lot of traffic in our booth, but the traffic was mixed. A good number of clients we have been talking to came by, and we had a number of new engagements. All very positive. On the downside, we had a lot of visits from incumbent networking companies. I did not know there were so many PLM and technical marketing people assigned to attend VMworld.
I had one interaction with a VP (Bus. Dev / Strategy) from one of the big incumbents. We both said hello. He asked me what Plexxi did. I said we are an SDN startup. He laughed and said we are all SDN now (apparently the Nicira acquisition by VMware was some sort of conversion event for the networking industry), but what does Plexxi do, he asked. I said we are focused on the data center. Okay, but what do you do, he asked again. I said we were competitors and I was not going to provide details. He said okay and walked away.
Kind of interesting that he did not know what we did, when I think you can figure it out in about 15 minutes of reading on the web. We were also showing the product in the picture in our booth, which attracted a lot of attention, and we had a demo of Affinity networking if you cared enough to take the time. My takeaway was that if someone wanted to know about a company on the show floor like Plexxi, they should do some research before the show and know exactly what to look for and ask about in the Plexxi booth. Having worked at three public companies, two of which were large tech companies, it reminded me how inwardly focused many incumbent companies can be; it is as if the outside world does not exist. By the way, the last point about doing the research before the show goes for candidates interviewing for a position too. If you do not care enough to do the research, how can I believe you will care enough to be on the team?
One final note to a reader of my blog. I was very humbled by a potential client who visited our booth and promptly told our CEO that he reads my blog and recently sent my Some Working Thoughts on SDN post to the IT team at his firm.
Six sales calls, two of which were multi-hour product demonstrations, plus side meetings, dinners, and driving between SFO and SJC four times equals? Tired. Now I have the hardest wait: the hour until the flight home departs. It is not the redeye that is especially difficult; it is the time waiting for the boarding process to start that seems to make time slow. I had an amazing couple of days in SJC/SFO talking data center architecture and SDN with prospective clients, industry luminaries, and colleagues at fellow SDN startups. The last few days will be a time we all look back on as the most fun in the life of a startup. I was with a great team rushing from appointment to appointment, lugging a 200-pound SDN network from location to location. At one customer, they even came out to the parking lot to look at the equipment in the back of the van to see if it was real before we lugged it to a lab twenty miles up the road.
Now it is time to change coasts and check in with the team at headquarters.