Compact System

  • Subscribe to our RSS feed.
  • Twitter
  • StumbleUpon
  • Reddit
  • Facebook
  • Digg

Tuesday, 3 August 2010

Google North American Faculty Summit - cloud computing

Posted on 14:22 by Unknown
Posted by Brian Bershad, Director of Engineering, Site Director, Google Seattle

Of the three themes of our 2010 Faculty Summit, cloud computing was the one that pervaded all others, from security in the cloud to the presumption of cloud infrastructure behind the social web. But in our more focused discussion on cloud computing last Thursday, we started with the premise of “prodigiousness,” a concept introduced by Afred Spector, VP of Research and Special Initiatives.

While we all know that systems are huge and will get even huger, the implications of this size on programmability, manageability, power, etc. is hard to comprehend. Alfred noted that the Internet is predicted to be carrying a zetta-byte (1021 bytes) per year in just a few years. And growth in the number of processing elements per chip may give rise to warehouse computers of having 1010 or more processing elements. To use systems at this scale, we need new solutions for storage and computation. It was these solutions we focused on throughout our discussions.

In the plenary talk, Andrew Fikes spoke on storage system opportunities. Among many topics, he talked about shifting engineering foci to storage management and optimization not just on an individual cluster of co-located systems, but across geographically distributed clusters. The goal is so-called planetary-scale systems. This brings up all manner of diverse challenges ranging from the need to continually balance storage vs. transmission costs, the need to account for variable network latency characteristics, and the desire to optimize storage (e.g., by physically storing only one copy of a file that many feel they have rights to, or own).

We had a few roundtables in the afternoon for deeper discussions. In the table I led, we discussed two systems for “programming the data center” developed by systems researchers at Google Seattle/Kirkland. The first, Dremel, is a scalable, interactive ad-hoc query system for analysis of read-only nested databases. Dremel was recently presented in a paper at VLDB (Dremel: Interactive Analysis of Web-Scale Datasets, Sergey Melnik, Andrey Gubarev, Jing Jing Long, Geoffrey Romer, Shiva Shivakumar, Matt Tolton, Theo Vassilakis. In Proceedings of the 36th Int'l Conf on Very Large Data Bases, 2010). The system serves as the foundational technology behind BigQuery, a product launched in limited preview mode at Google I/O in May.

We also discussed FlumeJava, a Java library that makes it easy to develop, test and run efficient data-parallel pipelines at data center scale. FlumeJava was developed by programming languages researchers at Google Seattle, and is currently in widespread use within Google. It was presented at the recent PLDI conference (FlumeJava: easy, efficient data-parallel pipelines, Craig Chambers, Ashish Raniwala, Frances Perry, Stephen Adams, Robert R. Henry, Robert Bradshaw, Nathan Weizenbaum. In Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation). The work reflects Google’s commitment to programming language and compiler technologies at scale.

The field of data center programming has progressed substantially in the last 10 years. Dremel and FlumeJava systems represent abstractions of a higher level than the MapReduce construct we previously introduced, and we think they are easier to use (within their domain of applicability) and more automatically optimizable. With time, the field will discover new “instructions” and even better abstractions leading us to a point where computations which run on nearly unlimited processors can be expressed as easily as sequential programs. We are working hard to make progress here, and I look forward to reporting on our progress in the future.
Email ThisBlogThis!Share to XShare to Facebook
Posted in | No comments
Newer Post Older Post Home

0 comments:

Post a Comment

Subscribe to: Post Comments (Atom)

Popular Posts

  • CDC Birth Vital Statistics in BigQuery
    Posted by Dan Vanderkam, Software Engineer Google’s BigQuery Service lets enterprises and developers crunch large-scale data sets quickly...
  • Our Unique Approach to Research
    Posted by  Alfred Spector , Vice President of Research and Special Initiatives Google started as a research project —and research has remain...
  • Google, the World Wide Web and WWW conference: years of progress, prosperity and innovation
    Posted by Prabhakar Raghavan, Vice President of Engineering More than forty members of Google’s technical staff gathered in Lyon, France i...
  • Partnering with Tsinghua University to support education in Western China
    Posted by Aimin Zhu, China University Relations We’re excited to announce that we’ve teamed up with Tsinghua University to provide educatio...
  • Our Faculty Institute brings faculty back to the drawing board
    Posted by Nina Kim Schultz, Google Education Research Cross-posted with the Official Google Blog School may still be out for summer, but tea...
  • Site Reliability Engineers: “solving the most interesting problems”
    Posted by Chris Reid, Sydney Staffing team I recently sat down with Ben Appleton, a Senior Staff Software Engineer, to talk about his recent...
  • More Google Cluster Data
    Posted by John Wilkes, Principal Software Engineer Google has a strong interest in promoting high quality systems research, and we believe t...
  • Impact of Organic Ranking on Ad Click Incrementality
    Posted by David Chan, Statistician and Lizzy Van Alstine, Research Evangelist  In 2011, Google released a Search Ads Pause research study w...
  • Market Algorithms and Optimization Meeting
    Posted by  Vahab S. Mirrokni and Muthu Muthukrishnan Google auctions ads, and enables a market with millions of advertisers and users.  This...
  • Released Data Set: Features Extracted From YouTube Videos for Multiview Learning
    Posted by Omid Madani, Senior Software Engineer “If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a ...

Categories

  • accessibility
  • ACL
  • ACM
  • Acoustic Modeling
  • ads
  • adsense
  • adwords
  • Africa
  • Android
  • API
  • App Engine
  • App Inventor
  • Audio
  • Awards
  • Cantonese
  • China
  • Computer Science
  • conference
  • conferences
  • correlate
  • crowd-sourcing
  • CVPR
  • datasets
  • Deep Learning
  • distributed systems
  • Earth Engine
  • economics
  • Education
  • Electronic Commerce and Algorithms
  • EMEA
  • EMNLP
  • entities
  • Exacycle
  • Faculty Institute
  • Faculty Summit
  • Fusion Tables
  • gamification
  • Google Books
  • Google+
  • Government
  • grants
  • HCI
  • Image Annotation
  • Information Retrieval
  • internationalization
  • Interspeech
  • jsm
  • jsm2011
  • K-12
  • Korean
  • Labs
  • localization
  • Machine Hearing
  • Machine Learning
  • Machine Translation
  • MapReduce
  • market algorithms
  • Market Research
  • ML
  • MOOC
  • NAACL
  • Natural Language Processing
  • Networks
  • Ngram
  • NIPS
  • NLP
  • open source
  • operating systems
  • osdi
  • osdi10
  • patents
  • ph.d. fellowship
  • PiLab
  • Policy
  • Public Data Explorer
  • publication
  • Publications
  • renewable energy
  • Research Awards
  • resource optimization
  • Search
  • search ads
  • Security and Privacy
  • SIGMOD
  • Site Reliability Engineering
  • Speech
  • statistics
  • Structured Data
  • Systems
  • Translate
  • trends
  • TV
  • UI
  • University Relations
  • UNIX
  • User Experience
  • video
  • Vision Research
  • Visiting Faculty
  • Visualization
  • Voice Search
  • Wiki
  • wikipedia
  • WWW
  • YouTube

Blog Archive

  • ►  2013 (51)
    • ►  December (3)
    • ►  November (9)
    • ►  October (2)
    • ►  September (5)
    • ►  August (2)
    • ►  July (6)
    • ►  June (7)
    • ►  May (5)
    • ►  April (3)
    • ►  March (4)
    • ►  February (4)
    • ►  January (1)
  • ►  2012 (59)
    • ►  December (4)
    • ►  October (4)
    • ►  September (3)
    • ►  August (9)
    • ►  July (9)
    • ►  June (7)
    • ►  May (7)
    • ►  April (2)
    • ►  March (7)
    • ►  February (3)
    • ►  January (4)
  • ►  2011 (51)
    • ►  December (5)
    • ►  November (2)
    • ►  September (3)
    • ►  August (4)
    • ►  July (9)
    • ►  June (6)
    • ►  May (4)
    • ►  April (4)
    • ►  March (5)
    • ►  February (5)
    • ►  January (4)
  • ▼  2010 (44)
    • ►  December (7)
    • ►  November (2)
    • ►  October (9)
    • ►  September (7)
    • ▼  August (2)
      • Google North American Faculty Summit - Day 2
      • Google North American Faculty Summit - cloud compu...
    • ►  July (7)
    • ►  June (3)
    • ►  May (2)
    • ►  April (1)
    • ►  March (1)
    • ►  February (1)
    • ►  January (2)
  • ►  2009 (44)
    • ►  December (8)
    • ►  November (4)
    • ►  August (4)
    • ►  July (5)
    • ►  June (5)
    • ►  May (4)
    • ►  April (6)
    • ►  March (3)
    • ►  February (1)
    • ►  January (4)
  • ►  2008 (11)
    • ►  December (1)
    • ►  November (1)
    • ►  October (1)
    • ►  September (1)
    • ►  July (1)
    • ►  May (3)
    • ►  April (1)
    • ►  March (1)
    • ►  February (1)
  • ►  2007 (9)
    • ►  October (1)
    • ►  September (2)
    • ►  August (1)
    • ►  July (1)
    • ►  June (2)
    • ►  February (2)
  • ►  2006 (15)
    • ►  December (1)
    • ►  November (1)
    • ►  September (1)
    • ►  August (1)
    • ►  July (1)
    • ►  June (2)
    • ►  April (3)
    • ►  March (4)
    • ►  February (1)
Powered by Blogger.

About Me

Unknown
View my complete profile