Compact System

Wednesday, 27 January 2010

Research Areas of Interest: Building scalable, robust cluster applications

Posted on 07:30 by Unknown
Posted by Brad Chen, Technical Lead/Manager

As part of our series on research areas of interest to Google, we discuss some important areas relating to cluster applications in distributed systems. In the last two decades distributed systems have undergone a metamorphosis from academic curiosities to the foundation of an entire industry. Despite these successes, at Google we see distributed systems as a technology in its infancy, with huge gaps in the supporting research (some examples here and here) that represent some of the most important problems in the space. Here are some examples:
  • Resource sharing: Stranded resources like idle memory, CPU, and disk bandwidth represent huge capital and operating expenses that deliver no business value. A cluster system based upon the best published research would be likely to leave 50% or more of hardware resources idle. We encourage researchers to explore hardware/software architectures that facilitate more supple sharing to avoid stranded and underutilized computational resources.
  • Balancing cost, performance, and reliability: Current cluster applications tend to be excessively rigid and brittle, offering only coarse controls to tune the balance between reliability, performance and cost. We envision systems that allow cost to be optimized based on an input specification of performance and reliability requirements. An effective solution might allow service level settings to propagate downward through the layered structure of the system.
  • Self-maintaining systems: The level of expertise required to troubleshoot today's large systems is one of the biggest barriers to more and larger deployments. The published research in this area has at best marginally reduced the need for such rare expertise. We envision systems that can adapt automatically to changing conditions, in which redundancy and multiple geographically distributed data centers simplify rather than complicate manageability. This will require breakthroughs in monitoring and data analysis to address the diversity of failure modes and simplify the task of keeping systems healthy.
Research in these areas will improve the current state of cluster applications, enabling systems that are less expensive, easier to monitor, and able to scale more efficiently.
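To make the notion of stranded resources concrete, here is a minimal sketch of one way to measure it. It assumes a hypothetical definition that is not from the post: free capacity on a machine counts as stranded when the machine cannot fit even the smallest task of interest, because one resource has run out while another is still plentiful. The `Machine` type, the `stranded_fraction` function and the example numbers are all illustrative.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    """Hypothetical machine record; the field names are illustrative."""
    cpu_capacity: float  # cores
    mem_capacity: float  # GiB
    cpu_used: float
    mem_used: float


def stranded_fraction(machines, min_cpu, min_mem):
    """Return (stranded CPU fraction, stranded memory fraction) of a fleet.

    Assumption: free capacity is stranded when the machine cannot host
    a task needing at least min_cpu cores and min_mem GiB.
    """
    total_cpu = sum(m.cpu_capacity for m in machines)
    total_mem = sum(m.mem_capacity for m in machines)
    stranded_cpu = 0.0
    stranded_mem = 0.0
    for m in machines:
        free_cpu = m.cpu_capacity - m.cpu_used
        free_mem = m.mem_capacity - m.mem_used
        # One exhausted resource strands whatever is left of the other.
        if free_cpu < min_cpu or free_mem < min_mem:
            stranded_cpu += free_cpu
            stranded_mem += free_mem
    return stranded_cpu / total_cpu, stranded_mem / total_mem


# Made-up fleet of three identical 16-core, 64 GiB machines.
fleet = [
    Machine(16, 64, cpu_used=16, mem_used=16),  # CPU full, 48 GiB stranded
    Machine(16, 64, cpu_used=4, mem_used=60),   # memory nearly full, 12 cores stranded
    Machine(16, 64, cpu_used=8, mem_used=32),   # still schedulable
]
cpu_frac, mem_frac = stranded_fraction(fleet, min_cpu=1, min_mem=8)
```

With these made-up numbers a quarter of the fleet's cores are stranded, even though no single machine looks badly overloaded on both dimensions at once.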

Previous posts in the series: Multimedia

Thursday, 7 January 2010

Google Cluster Data

Posted on 08:11 by Unknown
Posted by Joseph L. Hellerstein, Manager of Google Performance Analytics

Google faces a large number of technical challenges in the evolution of its applications and infrastructure. In particular, as we increase the size of our compute clusters and scale the work that they process, many issues arise in how to schedule the diversity of work that runs on Google systems.

We have distilled these challenges into the following research topics that we feel are interesting to the academic community and important to Google:
  • Workload characterizations: How can we characterize Google workloads in a way that readily generates synthetic work that is representative of production workloads, so that we can run standalone benchmarks?
  • Predictive models of workload characteristics: What constitutes a normal workload, and what constitutes an abnormal one? Are there "signals" that can indicate problems early enough for automated and/or manual responses?
  • New algorithms for machine assignment: How can we assign tasks to machines so that we make best use of machine resources, avoid excess resource contention on machines, and manage power efficiently?
  • Scalable management of cell work: How should we design the future cell management system to efficiently visualize work in cells, to aid in problem determination, and to provide automation of management tasks?
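As a baseline for the machine assignment question, the sketch below applies the textbook first-fit decreasing heuristic to tasks with two resource demands. It is not a description of how Google assigns tasks: the function name, the identical machine capacities and the example tasks are assumptions made for illustration, and the heuristic ignores contention and power entirely.

```python
def first_fit_decreasing(tasks, cpu_cap=1.0, mem_cap=1.0):
    """Pack tasks onto identical machines with first-fit decreasing.

    tasks maps a task id to (cpu, mem) demand. Returns a list of
    machines, each given as the list of task ids placed on it.
    """
    # Place the largest tasks first, ranked by their dominant demand.
    order = sorted(
        tasks,
        key=lambda t: max(tasks[t][0] / cpu_cap, tasks[t][1] / mem_cap),
        reverse=True,
    )
    machines = []  # each entry tracks remaining capacity and placed tasks
    for tid in order:
        cpu, mem = tasks[tid]
        if cpu > cpu_cap or mem > mem_cap:
            raise ValueError(f"task {tid!r} does not fit on any machine")
        for m in machines:
            if m["cpu"] >= cpu and m["mem"] >= mem:
                break  # first machine with room wins
        else:
            # No existing machine has room, so open a new one.
            m = {"cpu": cpu_cap, "mem": mem_cap, "tasks": []}
            machines.append(m)
        m["cpu"] -= cpu
        m["mem"] -= mem
        m["tasks"].append(tid)
    return [m["tasks"] for m in machines]


# Made-up demands, expressed as fractions of one machine.
example = {
    "a": (0.5, 0.25),
    "b": (0.5, 0.5),
    "c": (0.25, 0.75),
    "d": (0.25, 0.25),
}
placement = first_fit_decreasing(example)
```

Here the four tasks fit on two machines. A new algorithm of the kind the list asks for would need to beat baselines like this one while also accounting for contention and power.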
To aid researchers in addressing these questions in a realistic manner, we will provide data from Google production systems. The initial focus of these data will be workload characterization. Details of the data can be found here. The data are structured as follows:
  • Time (int) - time in seconds since the start of data collection
  • JobID (int) - Unique identifier of the job to which this task belongs
  • TaskID (int) - Unique identifier of the executing task
  • Job Type (0, 1, 2, 3) - class of job (a categorization of work)
  • Normalized Task Cores (float) - normalized value of the average number of cores used by the task
  • Normalized Task Memory (float) - normalized value of the average memory consumed by the task
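A minimal sketch of reading records with these six fields and summarizing usage per job type follows. The on-disk layout is an assumption, since the post does not specify it: each record is taken to be one line of whitespace-separated values in the order listed above. The names `TaskSample`, `parse_samples` and `mean_usage_by_job_type` are illustrative, and the sample lines are made up.

```python
from collections import defaultdict
from typing import NamedTuple


class TaskSample(NamedTuple):
    time: int           # seconds since the start of data collection
    job_id: int
    task_id: int
    job_type: int       # 0, 1, 2 or 3
    norm_cores: float   # normalized average cores used
    norm_memory: float  # normalized average memory consumed


def parse_samples(lines):
    """Yield one TaskSample per line with six fields (assumed layout)."""
    for line in lines:
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank lines and lines with the wrong field count
        t, job, task, jtype, cores, mem = fields
        yield TaskSample(int(t), int(job), int(task), int(jtype),
                         float(cores), float(mem))


def mean_usage_by_job_type(samples):
    """Return {job_type: (mean normalized cores, mean normalized memory)}."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for s in samples:
        acc = sums[s.job_type]
        acc[0] += s.norm_cores
        acc[1] += s.norm_memory
        acc[2] += 1
    return {jt: (c / n, m / n) for jt, (c, m, n) in sums.items()}


# Made-up records, not taken from the published data.
sample_lines = [
    "0 1 10 0 0.5 0.25",
    "0 1 11 0 0.25 0.25",
    "300 2 20 3 0.125 0.5",
]
summary = mean_usage_by_job_type(parse_samples(sample_lines))
```

Per-type means are only a first step toward workload characterization, but they are a quick way to check how the four job types differ.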
We solicit your feedback in terms of: (a) the quality and content of the data we are providing; (b) technical approaches and/or results related to the topics above; and (c) other research topics that you feel Google should be addressing in the area of Cloud Computing (along with details of the data required to address these topics).