Waiting on the Exaflood
I was just down in my basement looking at the area around my FiOS FTTH equipment (pic below). I have been expecting a flood of data any day now. I have been so concerned about the impending exaflood that I have been considering a home defense. What would it cost to build a Petabyte Wall (1000 TBs) to handle the incoming flood of data? There have been plenty of warnings of the impending event.
If I wanted to build my PB Wall out of 32GB flash drives, I would need 31,250 sticks. That would set me back ~$2M assuming I could get a volume discount. There are some nice 12TB NAS systems, so I thought that 84 of those systems for $124k might be a better option. Google is offering up to 16TB of storage at what works out to $256 per TB per year, so that would set me back $256k for my PB Wall, but the problem with this option is that it comes from Google’s personal storage product and I need a commercial-class solution, not a digital locker for files. Here are four options for my PB Wall, assuming I could direct the flood of bytes to the storage options:
| Solution | Price Per GB | Price Per TB | PB Wall Total Cost |
| --- | --- | --- | --- |
| 32 GB Flash Drives | $2.34 | $2,340.00 | $2.34M |
| 12TB NAS (Seagate) | $0.12 | $124.91 | $124k |
| Google Storage (Developers) | $0.17 | $170 | $2.04M (1 Yr) |
| Amazon S3 | $3.72 | $3,726.49 | $3.72M (1 Yr) |
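The arithmetic behind the table can be checked with a short sketch. It assumes decimal units (1 PB = 1000 TB = 1,000,000 GB), takes the per-GB prices from the table, and treats the Google figure of $0.17 per GB as a monthly price, which is an inference from the $2.04M one-year total rather than something stated outright. The function names are illustrative only.

```python
import math

GB_PER_TB = 1000                 # decimal units, matching "1000 TBs" in the post
GB_PER_PB = 1000 * GB_PER_TB


def units_needed(unit_size_gb: int, target_gb: int = GB_PER_PB) -> int:
    """Whole devices required to reach the target capacity."""
    return math.ceil(target_gb / unit_size_gb)


def wall_cost_dollars(price_per_gb_cents: int, target_gb: int = GB_PER_PB) -> int:
    """Cost in whole dollars; prices are kept in cents to avoid float rounding."""
    return price_per_gb_cents * target_gb // 100


flash_sticks = units_needed(32)               # 32 GB flash drives
nas_boxes = units_needed(12 * GB_PER_TB)      # 12 TB NAS systems
flash_cost = wall_cost_dollars(234)           # $2.34 per GB
# Assumption: $0.17 per GB is a monthly rate, so a year is 12 payments.
google_yearly = wall_cost_dollars(17) * 12

print(f"Flash sticks needed: {flash_sticks:,}")
print(f"NAS systems needed:  {nas_boxes}")
print(f"Flash wall cost:     ${flash_cost:,}")
print(f"Google, one year:    ${google_yearly:,}")
```

The NAS count rounds up because 1000 TB does not divide evenly into 12 TB boxes.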
A few items to note in my high-level quest to build a PB Wall. Amazon and Google both have upload/download transaction costs. I calculated the cost to fill a TB with the Amazon pricing tool; the Google numbers do not include this cost. Google charges $0.10 per GB for upload and $0.15 per GB for download for the Americas and EMEA. If you are in APAC, make that $0.30 per GB for download. Google and Amazon also charge $0.01 per 1000 PUT, POST, LIST, GET and HEAD requests. Note those are transaction costs for compute, which is a recurring theme in my blog. To upload a PB to Google each month would cost me $100k, or an additional $1.2M per year, putting the total Google cost for my PB Wall at $3.24M. Throw in some more charges and Google and Amazon are pretty close. My total costs do not include power and cooling for my in-home NAS and flash drive solutions, not to mention the time it would take to figure out how to wire 31,250 flash drives together.
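The upload arithmetic is easy to check: at $0.10 per GB, a single TB costs $100 to upload and a full PB costs $100k. The sketch below assumes decimal units and assumes the wall is refilled once a month, which is how the $1.2M yearly figure arises. The storage figure is the one-year Google total from the table; the names are illustrative.

```python
GB_PER_TB = 1000
GB_PER_PB = 1000 * GB_PER_TB

UPLOAD_CENTS_PER_GB = 10                  # $0.10 per GB upload, from the post
GOOGLE_STORAGE_PER_YEAR = 2_040_000       # one-year storage total from the table


def upload_cost_dollars(gigabytes: int, cents_per_gb: int = UPLOAD_CENTS_PER_GB) -> int:
    """Upload cost in whole dollars, computed in cents to avoid float rounding."""
    return gigabytes * cents_per_gb // 100


per_tb = upload_cost_dollars(GB_PER_TB)
per_pb = upload_cost_dollars(GB_PER_PB)
# Assumption: one full PB is uploaded every month.
yearly_upload = per_pb * 12
google_total = GOOGLE_STORAGE_PER_YEAR + yearly_upload

print(f"Upload 1 TB:        ${per_tb:,}")
print(f"Upload 1 PB:        ${per_pb:,}")
print(f"Upload, 12 months:  ${yearly_upload:,}")
print(f"Storage + upload:   ${google_total:,}")
```

Download and per-request charges are left out here, as they are in the totals above.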
Where is the Exabyte Flood?
I am not going to take the time to critique the various predictions from 2007-2008 about the impending exaflood and the internet breaking. These types of hyperbole always lack humility and neglect to correct for black swans and human adaptation. That is what networks do. Networks adapt because they are managed by humans. Humans adapt. Not a lot of people were talking about broadband usage caps back in 2007-2008. Not many people thought that service providers would throttle bandwidth connections. Verizon FiOS offers storage with their internet service for $0.07 per GB per month or $0.95 per GB per year. My PB Wall from Verizon using my FiOS service would cost me an additional $950k per year.
If I were to provide a short answer to the complicated question of the exaflood, I would say the answer lies in the Shannon-Hartley theorem, and that this law, which grew out of Hartley's late-1920s work and was completed by Shannon in 1948, will have more to do with network design and build-out over the next decade than life before or after television. In the past, it was easier to deploy more bandwidth to obtain more capacity: buy another T1, upgrade to a DS3, and get me a GbE or more. Today we are approaching the upper end of spectral efficiency, and this is going to force networks to be built differently. As I stated in a prior post, I think that transmission distances will decline, more compute (i.e. data centers) will be put into the network, and bandwidth-limiting devices like ROADMs and routers/switches that have an OEO conversion will go away, probably on the same timeline as the TDM-to-packet evolution. This means the adoption rate is slower than first predicted, and just when despair at the slow adoption takes hold, the adoption rate rapidly increases and continues to gain momentum as the new network models are proven.
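To see why spectral efficiency is the wall here, it helps to put numbers on the Shannon-Hartley limit, C = B * log2(1 + S/N). The sketch below is only an illustration: the 50 GHz channel width and 20 dB signal-to-noise ratio are assumed values picked for the example, not figures from any real link.

```python
import math


def db_to_linear(snr_db: float) -> float:
    """Convert a signal-to-noise ratio from decibels to a linear power ratio."""
    return 10 ** (snr_db / 10)


def shannon_capacity_bps(bandwidth_hz: float, snr_linear: float) -> float:
    """Shannon-Hartley upper bound on error-free capacity: C = B * log2(1 + S/N)."""
    return bandwidth_hz * math.log2(1 + snr_linear)


# Assumed, illustrative channel: 50 GHz wide at 20 dB SNR.
base = shannon_capacity_bps(50e9, db_to_linear(20))
# Doubling the bandwidth doubles the capacity limit.
double_bandwidth = shannon_capacity_bps(100e9, db_to_linear(20))
# Adding 3 dB (roughly doubling signal power) helps far less.
double_power = shannon_capacity_bps(50e9, db_to_linear(23))

print(f"Baseline:          {base / 1e9:.0f} Gb/s ({base / 50e9:.2f} bit/s/Hz)")
print(f"2x bandwidth:      {double_bandwidth / base:.2f}x capacity")
print(f"2x signal power:   {double_power / base:.2f}x capacity")
```

Capacity grows linearly with bandwidth but only logarithmically with signal power, so once the usable spectrum on a link is filled, pushing harder on the same link yields diminishing returns. That is the pressure toward shorter transmission distances and more compute inside the network.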
The Network Always Adapts
The other assumption missed by the exaflood papers was that the network adapts. It just does. People who run networks put more capacity in the network for less money, because the people who build the network infrastructure are always working to improve their products.
One market effect I know is that when discrete systems in the network become harmonized or balanced, there is a lot of money to be made. Look at the fibre channel market. When adapters, drives and fabrics converge around a performance level like 2Gb or 4Gb, the market becomes white hot. The same goes for the 1G and 10G optical transmission markets. Today we are in a maturing 10G market, there is a transition market called 40G, but the real magic is going to happen at 100G. At 100G huge efficiencies start to occur in the network as it relates to I/O, compute processing, storage, etc. With the building of huge datacenters, how much bandwidth is required to service 100k servers? These large datacenters are being built for many reasons, such as power, cooling and security, but the one reason that is often not quoted is processing and compute. There have been really innovative solutions to the compute problem that I wrote about before, such as RVBD and AKAM. I look at what the Vudu team did for meshed, stub storage of content on a community of systems. Is this a model for the future, wherein numerous smaller data centers look like a community of Vudu devices?
Going forward in a network world of usage caps, distributed storage and parallel processing, I know which element will need to be solved, and that is service levels. Commercial and consumer end-users want to get what they are paying for, and service providers do not want to be taken advantage of in the services they are offering. Security and defined service level agreements will push down to the individual user, just as search and content are being optimized and directed to the group of one from broader demographic groups. Why are there wireless broadband usage caps? Because there is a spectrum problem, and that same problem and solution set is coming to your FTTH connection sooner than you think. Why do you think Google and Amazon are charging for compute transactions? Anyone who used a mainframe in the days when you had to sign up for processing time is having a flashback and wondering what happened to all the punch cards.
The reason Google and Amazon charge for compute is that transactions cost money. The whole digital content revolution, disintermediation, OTT video, music downloads, YT, blah, blah, whatever you want to call it, goes back to one force, and that is transaction economics. The profit margin that can be derived from selling content has declined and continues to decline. Distribution is easier; hence the transaction cost is smaller, resulting in a lower profit margin and thus supporting fewer intermediaries. It does not mean the cost is zero, it just means less. What is the cost to store, compute and transmit content? Answer that, add some profit margin at each step and you know the future.
The companies who are providing the tools and systems that provide the analytics around the economic transaction of store, compute and transmit (SCT) are going to be big winners.
** It is all about the network, stupid, because it is all about compute. **