No Free Lunch: Not Even a Data Snack

Exactly thirty years ago, a small quiz appeared in CoEvolution Quarterly with the hip-sounding, slang-trumps-grammar name "Where You At". Check out these quiz questions: "How many days until the full moon? Can you name five resident and five migratory bird species in your area? Can you name the soil series you are standing upon? Can you trace your water from source to tap? Where does your garbage go?" This quiz captured a sea change in modern environmentalism: the sea change that declares that single issues and single actions are never enough, the sea change that inspires many of us to live our lives with fuller ecological integrity, spiritual depth and increased attention to our local neighborhoods.

In 1981, the personal computer was still just a babe, with no Microsoft Windows or public internet yet available. Today, in the spirit of "Where You At," I'd like to add a new question to the quiz: "Do you know where your data goes once you hit the enter button?" Is there a life cycle to trace? What materials and how much energy are used to power all of our searches, emails, tweets, Facebook updates, video watching, music listening, document sharing and online banking? Hold on to your chairs: this short tour might be a white-knuckle experience if this is your first close-up look at this new kind of factory, the datacenter.

Where should I begin? Once, in a galaxy far, far away, giant computers called mainframes, as big as an entire room, handled the computing needs of the largest and busiest universities. Er, maybe I won't go back that far. How about we begin with a tour of a data center? Imagine entering a warehouse (some are as large as a football field); the first thing you will notice is aisle after aisle of servers stacked in racks. Bring layers of clothing, because the climate is tightly controlled: the "cold" aisles facing the fronts of the servers are kept at a cool 55-65 degrees, while the "hot" aisles behind the servers can reach 85-95 degrees. An Emerson Network Power (ENP) survey estimates the world is populated with 509,147 of these newish datacenters, taking up the space of 5,955 football fields. Estimate is the key word, because many companies do not disclose the number of datacenters they use; the largest users, Google, Amazon, Microsoft and many others, do not share this information.

The electric company's dream customer is a datacenter. Every datacenter is not just the servers; all of the climate control equipment, power supply regulators, backup machines, smoke and fire detection sensors, security apparatus and ordinary overhead lights also need electricity 24/7. The EPA estimates that half of all the energy used in a datacenter goes to the servers, and the other half goes to the air conditioners and other support equipment.

The EPA reports that datacenter energy consumption doubled from 2000 to 2007 (from roughly 3.5 GW to 7 GW). The potential for rapid, repeated doubling in this sector is what is drawing attention. During this time, datacenters jumped to consume about 1.5% of all electricity use worldwide and 2% of electricity use in the United States.
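
To give a feel for why that curve gets attention: doubling over roughly seven years works out to about 10 percent growth per year. This is my own back-of-the-envelope arithmetic for illustration, not a figure from the EPA report.

```python
# Implied annual growth rate if consumption doubles over seven years
# (my arithmetic for illustration, not the EPA's own projection).
annual_growth = 2 ** (1 / 7) - 1
print(f"about {annual_growth:.1%} per year")   # roughly 10.4% per year
```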

Emerson Network Power aggregated amazing statistics about our data use. Ready? 2011 will see 1.2 trillion gigabytes of data consumed, or 7 million DVDs every hour. That year's worth of data could fill 75 billion 16 GB iPods, enough for 10 iPods for every person on earth. 1,157 people start a YouTube video every second, which works out to about 100,000,000 videos viewed each day. In February 2011, 140 million tweets were sent each day, almost three times the 2010 daily average.
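
For readers who like to check the arithmetic, here is a quick back-of-the-envelope sketch of those conversions. The 16 GB iPod size comes from the ENP figure; the world population of roughly 7 billion is my own round number, not the report's.

```python
# Quick sanity check of the Emerson Network Power figures (my arithmetic,
# not the report's). Assumes 16 GB iPods and roughly 7 billion people.
data_2011_gb = 1.2e12                      # 1.2 trillion gigabytes in 2011
ipods = data_2011_gb / 16                  # ~75 billion 16 GB iPods
ipods_per_person = ipods / 7e9             # roughly 10 per person on earth

video_starts_per_second = 1157
videos_per_day = video_starts_per_second * 60 * 60 * 24   # ~100 million/day

print(f"{ipods / 1e9:.0f} billion iPods, about {ipods_per_person:.1f} per person")
print(f"about {videos_per_day / 1e6:.0f} million YouTube videos started per day")
```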

Can I keep going? Here is a group of surreal numbers generated by Google. Just let them waft over you; whether you get chills, as in the cold aisle of a datacenter, or the sweats, as in the hot aisle, the devilish details live in the numbers. Google reports that it used 2,259,998 megawatt-hours of electricity in 2010. (I checked this number several times.) Google also estimates the CO2 emitted from each search. Any guesses on the carbon footprint of your search's journey through a dozen servers (or so) and back again? Google calculates that one search is equivalent to two minutes of YouTube viewing, which is equivalent to 0.2 grams of CO2 emitted. In other words, every 100 searches is equivalent to 20 grams of CO2, which is equivalent to using our laptops for 60 minutes, which is equivalent to watching YouTube for 3 hours and 20 minutes. As invisible and effortless as my Star Trekkian computer searches seem to me, Google is confirming that a very real impact is being made in the world. There is no free lunch, not even a free data snack.
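
To make those equivalences concrete, here is a minimal sketch of the same arithmetic, taking Google's published 0.2 grams of CO2 per search at face value. The numbers are Google's; the code is just my illustration.

```python
# Google's published equivalences: one search ~= 0.2 g CO2,
# which Google compares to about two minutes of YouTube viewing.
co2_per_search_g = 0.2
youtube_minutes_per_search = 2

searches = 100
total_co2_g = searches * co2_per_search_g                  # 20 g of CO2
youtube_minutes = searches * youtube_minutes_per_search    # 200 minutes

print(f"{searches} searches ~ {total_co2_g:.0f} g CO2 "
      f"~ {youtube_minutes // 60} h {youtube_minutes % 60} min of YouTube")
```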

Let's take a quick break for a primer comparing a kilowatt, a megawatt and a gigawatt. Remember that 1 thousand seconds (a kilosecond) = 16.7 minutes; 1 million seconds (a megasecond) = 11.6 days; and 1 billion seconds (a gigasecond) = 31.7 years (from my workshop 15 Billion Years and Six Days, http://www.maggiddavid.net/environmental-education/outdoor-education/ ).
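
If you want to play with the prefixes yourself, here is a tiny sketch of that same seconds arithmetic; it only restates the conversions above.

```python
# kilo = 1,000; mega = 1,000,000; giga = 1,000,000,000 -- the same
# prefixes that turn a watt into a kilowatt, megawatt or gigawatt.
print(1e3 / 60, "minutes in a kilosecond")                 # ~16.7 minutes
print(1e6 / 86_400, "days in a megasecond")                # ~11.6 days
print(1e9 / (86_400 * 365.25), "years in a gigasecond")    # ~31.7 years
```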

We are now coming toward the home stretch. Where is all of this leading? For the near future, there is the need for biggering, biggering and more biggering, even with the dampening effects of the recession, energy efficiency gains, and increases in computing capacity (ENP reports that a typical server grew 45 times more powerful from 2001 through 2011). Listen to these surveys. Computerworld reports that 36% of datacenter operators say they will run out of datacenter space in the coming year. Of these, 40% plan on building new space, 29% plan on leasing, and 20% said they will rent from a "cloud" provider (meaning renting from another company that is managing a very real datacenter for them). Another survey of 300 IT decision makers, at companies with budgets of a billion dollars or more or with 5,000 or more employees, reports that 85% of these companies will definitely or probably expand their datacenters in the coming year.

These very real challenges are also leading to new levels of creativity in reducing electricity costs, and aggressive efficiency measures are being tested and encouraged. These include: new generations of computer designs that need less electricity (for instance, the newest central processing units, which use about 30% of all server energy, along with motherboards and other components, continue to draw less power); appropriately sized redundancy capacity and power management tools, including power-down technology and motion sensors for overhead lights (hardly rocket science); and new climate control systems that cool with fresh outdoor air instead of air conditioners. In addition, new cooling guidelines will soon be released, updating and widening the temperature range that servers can tolerate. As these factories turn their attention to saving energy and lessening their impact, much can be gained. It is easy to forget how new datacenters are; best practices are still being written, all the more so for green best practices. LEED, the national Leadership in Energy and Environmental Design program, only began drafting a LEED standard for datacenters in 2009.

This is just the beginning of a response to the question "Where does our data go?" I hope this abbreviated "geek tour" through the land of data brings close what previously might have been far. My hope is that this column might expand our mindfulness of and appreciation for our ever-present computer use. In addition, I dedicate this to all of us who are teachers, helping to expand the story we tell about our many and diverse connections to our home planet.


Select Sources:

Original “Where You At” quiz: http://www.dlackey.org/weblog/docs/Where%20You%20At.htm

Greening datacenters, an overview:

http://www.computerworld.com/s/article/9017398/Seven_steps_to_a_green_data_center?taxonomyId=154&pageNumber=1

http://www.csemag.com/industry-news/more-top-stories/single-article/6-steps-to-better-data-centers/fedb0b6440.html

The 2007 EPA report to Congress that first assessed the scope and size of datacenters: http://www.energystar.gov/ia/partners/prod_development/downloads/EPA_Datacenter_Report_Congress_Final1.pdf

A Greenpeace report detailing the environmental impacts of datacenters: http://www.greenpeace.org/international/Global/international/publications/climate/2011/Cool%20IT/dirty-data-report-greenpeace.pdf

The datacenter innovators:

The Green Grid, a consortium of major companies setting up new datacenter practices: http://www.thegreengrid.org/

LEED (Leadership in Energy and Environmental Design) for datacenters: http://www.datacenterknowledge.com/leed-platinum-data-centers/

Basic primer on power and energy http://www.energylens.com/articles/kw-and-kwh

Google's Green Statistics:

http://www.google.com/green/index.html

http://www.google.com/green/the-big-picture/references.html

http://www.google.com/green/the-big-picture.html#/intro/infographics-3

Emerson Network Power summary report with summary infographic:

http://www.datacenterknowledge.com/archives/2011/12/14/how-many-data-centers-emerson-says-500000/

http://www.emersonnetworkpower.com/en-US/About/NewsRoom/Pages/2011DataCenterState.aspx

Background articles:

http://www.nytimes.com/2011/08/01/technology/data-centers-using-less-power-than-forecast-report-says.html?_r=2

New York Times update on datacenter growth. It treats modeling scenarios as if they were fortune-telling predictions; the original EPA report laid out five scenarios for datacenter electricity use growth, a nuance this article does not report.

http://www.computerworld.com/s/article/9216841/Data_centers_under_strain_expand_at_furious_pace_

Computerworld surveys of datacenter growth

http://www.datacenterknowledge.com/archives/2009/05/14/whos-got-the-most-web-servers/

An attempt to chart datacenters, though as mentioned, many big companies are not yet reporting. Check out the comments to get a sense of how big this landscape of datacenter owners is.



6 Replies to "No Free Lunch: Not Even a Data Snack"

  • Jesse Glickstein
    December 27, 2011 (1:58 pm)

    Very interesting post. Thanks for all the great resources!

  • Isaac Hametz
    December 28, 2011 (1:52 pm)

    David, it just so happens that this fall at the University of Virginia School of Architecture a professor named Michael Beaman is teaching a studio course on the design, use, and maintenance of data centers. He is specifically looking at the T.J. Watson Research Center in Yorktown Heights, New York. This is the course description:
    "In 1911 the Computing Tabulating Recording Corporation was founded; by 1922 it had become IBM. In 1985, at the age of 22, Garry Kasparov became the youngest World Chess Champion. Along with being an important year for chess, it was an important year for technology: the first Compact Discs are released for public sale, the first version of Windows is released, Nintendo sells its first home gaming system, Steve Jobs resigns from Apple and starts NeXT (creating Apple's operating platform), and the first domain name, "symbolics.com," is created. Just 12 years later, in 1997, the fates of Kasparov and IBM would be forever intertwined. Deep Blue, an IBM supercomputer, and Kasparov faced off in a series of six chess matches; Kasparov lost. Deep Blue could calculate 200 million chess moves per second (approx. 11.38 gigaFLOPS, or 11,380,000,000 floating point operations per second). The fastest supercomputer currently being built (the Fujitsu K) runs at 10.51 petaFLOPS, or approximately 10,510,000,000,000,000 calculations per second, about 1 million times faster than Deep Blue. Supercomputers today take up entire buildings and require their own heating, cooling and ventilation infrastructure, support staff and maintenance technicians. Watson, IBM's proto-AI computer, which competed on and won Jeopardy! in 2011, is small in comparison. Rather than taking an approach of massively parallel hardware, it ushered in an era of software development that more intelligently utilized computational capacity through contextual analysis. Deep Blue crunched massive amounts of data to search for solutions to closed-set problems; Watson, on the other hand, establishes conditional parameters to address open-ended questions. Both of these approaches continue to be developed and employ thousands of scientists, programmers, researchers, and technicians worldwide. However, they have so far had little bearing on our understanding of the spatial and programmatic relationships they create. This studio will examine this situation. We will design for the impact these computers have on: those who work with and create these computers, the public who consumes their information, the landscape that is created by their production and use, and the buildings built to house, use and maintain them. We will address this situation through the creation of: contexts, contextonomies, and postscripts. In 1961, IBM moved from its research facility in a renovated house near Columbia University, which it had occupied since 1945, to the T.J. Watson Research Center in Yorktown Heights, New York. The center, designed by Eero Saarinen, is the home of advanced research, including the development of the supercomputers Blue Gene, Deep Blue and Watson. This is our site. Through the creation of a comprehensive contextual analysis we will define our design problems."
    What do you think?

  • Deborah Klee Wenger
    December 29, 2011 (5:32 am)

    Wow! I had no idea — thank you, David!

  • David Arfa
    December 29, 2011 (10:49 am)

    Thanks for your comments. I'm learning as I go as well, Deborah. Kind of amazing how much of our everyday lives is invisible to us.

    Jesse, enjoy these resources; they are just a start.

    Isaac, interesting class. Are you taking it? It seems to focus on the "biggest and baddest" supercomputer around, a different emphasis than your everyday, run-of-the-mill datacenters, but still great to explore all these implications. Often datacenters have no windows or fresh air flowing into them, because the people who maintain them are considered less valuable than the servers themselves. One 'bright' side is that new, efficient fresh-air cooling systems are good for the people too. As for this course, it sounds ambitious to cover the context of the people making the computers along with the users, in addition to the datacenter itself; the datacenter is just one part of the whole data life cycle. I'd be curious how much the course will cover broader philosophy versus focusing on environmental impact. Please keep us posted! Thanks. David

  • Isaac Hametz
    January 2, 2012 (12:16 pm)

    David, I am not taking the class (I chose to take a course that explores how decommissioned military sites can be reused as sites for ecological and social defense); however, a good friend of mine is, and I will certainly keep you posted. In terms of your question about focus, the course is intended to explore the spatial dynamics and allied systems associated with information technology. It is not solely focused on environmental impact, but on the broader impact these centers have, and can have, on the networks they are connected to (including the people who maintain them, etc.). There are manifold reasons they chose the "big bad center," including the far reach the center has (ecologically, socially, historically, etc.), the prestige of the original designer (Eero Saarinen), the location of the building (rural New York State), and I'm sure there are more.
    What drew you to ask questions about datacenters/information trails?

