Compact System

Wednesday, 18 March 2009

And the award goes to...

Posted on 09:22 by Unknown
Posted by Fernando Pereira, Research Director

Corinna Cortes, Head of Google Research in New York, has just been awarded the ACM Paris Kanellakis Theory and Practice Award jointly with Vladimir Vapnik (Royal Holloway College and NEC Research). The award recognizes their invention in the early 1990s of the soft-margin support vector machine, which has become the supervised machine learning method of choice for applications ranging from image analysis to document classification to bioinformatics.

What is so important about this invention? In supervised machine learning, we create algorithms that learn a rule from a set of labeled training examples (e.g., messages marked spam or non-spam) and then use that rule to classify new examples accurately. There is no single attribute of an email message that tells us with certainty that it is spam. Instead, many attributes have to be considered, forming a vector of very high dimension. The same situation arises in many other practical machine learning tasks, including many that we work on at Google.

To learn accurate classifiers, we need to solve several big problems. First, the rule learned from the training data should be accurate on new test examples, even though it has not seen those examples. In other words, the rule must generalize well. Second, we must be able to find the optimal rule efficiently. Both of these problems are especially daunting for very high-dimensional data. Third, the method for computing the rule should be able to accommodate errors in the training data, such as messages that are given conflicting labels by different people (my spam may be your ham).

Soft-margin support vector machines wrap these three problems together into an elegant mathematical package. The crucial insight is that classification problems of this kind can be expressed as finding, in a space of very high (or even infinite) dimension, the hyperplane that best separates the positive examples (ham) from the negative ones (spam).

Remarkably, the solution of this problem does not depend on the dimensionality of the data; it depends only on the pairwise similarities between the training examples, as determined by the agreement or disagreement between corresponding attributes. Furthermore, a hyperplane that separates the training data well can be shown to generalize well to unseen data with the same statistical properties.
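To make the point about pairwise similarities concrete, here is a minimal sketch, not the award-winning algorithm itself. It assumes a Gaussian (RBF) similarity function, a hyperplane with no bias term, cleanly separable toy data, and a simple dual coordinate ascent loop in place of a full quadratic programming solver. All names (rbf_kernel, gram_matrix, train_dual, decision_score) and the toy points are hypothetical. The thing to notice is that train_dual receives only the table of pairwise similarities, never the attribute vectors.

```python
import math


def rbf_kernel(a, b, gamma=0.5):
    """Similarity of two attribute vectors: 1.0 if identical, near 0 if far apart."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def gram_matrix(points, kernel):
    """Table of pairwise similarities between all training examples."""
    return [[kernel(p, q) for q in points] for p in points]


def train_dual(gram, labels, epochs=200):
    """Dual coordinate ascent for a separating hyperplane without a bias term.

    Only the n-by-n Gram matrix and the labels are used; the raw vectors
    and their dimensionality never appear in this function.
    """
    n = len(labels)
    alpha = [0.0] * n
    for _ in range(epochs):
        for i in range(n):
            score = sum(alpha[j] * labels[j] * gram[i][j] for j in range(n))
            # Exact one-dimensional maximizer of the dual, clipped at zero.
            step = (1.0 - labels[i] * score) / gram[i][i]
            alpha[i] = max(0.0, alpha[i] + step)
    return alpha


def decision_score(alpha, labels, train_points, x, kernel):
    """Signed score for a new example: positive means ham, negative means spam."""
    return sum(a * y * kernel(t, x) for a, y, t in zip(alpha, labels, train_points))


# Toy training set (assumed for illustration): +1 = ham, -1 = spam.
train_points = [(0.0, 0.0), (0.0, 1.0), (3.0, 3.0), (3.0, 4.0)]
labels = [1, 1, -1, -1]

gram = gram_matrix(train_points, rbf_kernel)
alpha = train_dual(gram, labels)

# The same examples embedded in 1000 dimensions (padded with zeros) have the
# same pairwise similarities, so training sees exactly the same problem.
padded_points = [p + (0.0,) * 998 for p in train_points]
padded_gram = gram_matrix(padded_points, rbf_kernel)
padded_alpha = train_dual(padded_gram, labels)

print(decision_score(alpha, labels, train_points, (0.2, 0.5), rbf_kernel))
print(decision_score(alpha, labels, train_points, (3.1, 3.4), rbf_kernel))
print(padded_alpha == alpha)
```

The cost of training in this sketch grows with the number of training examples, not with the number of attributes, which is why the same approach carries over to very high (or even infinite) dimension.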

Now, you might be asking how this could be done if the training data is inconsistently labeled. After all, you cannot have the same example on both sides of the separating hyperplane. That's where the soft margin idea comes in: the quadratic optimization program that finds the optimal separating hyperplane can be cleverly modified to "give up" on a fraction of the training examples that cannot be classified correctly.
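For readers who want to see the shape of that modified program, one standard textbook way to write it is shown below. The notation is an assumption, since the post itself gives no formulas: x_i are the training vectors, y_i in {+1, -1} their labels, w and b the hyperplane, xi_i the slack that lets example i violate the margin, and C the price paid for each unit of slack.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\,\lVert w \rVert^{2} \;+\; C \sum_{i=1}^{n} \xi_i \\
\text{subject to}\quad & y_i\,(w \cdot x_i + b) \;\ge\; 1 - \xi_i, \qquad i = 1, \dots, n, \\
& \xi_i \;\ge\; 0, \qquad i = 1, \dots, n.
\end{aligned}
```

Setting every xi_i to zero recovers the original program that insists on perfect separation. A large C tolerates few violations, while a small C gives up on more examples in exchange for a wider margin.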

With this crucial improvement, support vector machines became truly practical, and their core ideas have had a huge influence on the development of further learning algorithms for an ever wider range of tasks.

Congratulations to Corinna (and Vladimir) on the well-deserved award.