Compact System

Thursday, 23 August 2012

Better table search through Machine Learning and Knowledge

Posted on 13:00
Posted by Johnny Chen, Product Manager, Google Research

The Web offers a trove of structured data in the form of tables. Organizing this collection of information and helping users find the most useful tables is a key mission of Table Search from Google Research. While we are still a long way from perfect table search, we recently took a few steps forward by revamping how we determine which tables are "good" (those that contain meaningful structured data) and which are "bad" (for example, a table that holds the layout of a Web page). In particular, we switched from a rule-based system to a machine learning classifier that can tease out subtleties from the table features and enables rapid iteration on quality improvements. This new classifier is a support vector machine (SVM) that uses multiple kernel functions, which are automatically combined and optimized using training examples. Several of these kernel-combining techniques were in fact studied and developed within Google Research [1,2].
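To make the idea of combining kernels from training examples more concrete, here is a minimal sketch in plain Python. It is not the Table Search classifier: the two table features, the labels, and the choice of a linear and an RBF base kernel are all made up for illustration. It uses one simple heuristic in the spirit of [1], weighting each base kernel in proportion to its centered alignment with the label kernel y yᵀ.

```python
import math

def linear_kernel(a, b):
    # Plain dot product between two feature vectors.
    return sum(x * y for x, y in zip(a, b))

def rbf_kernel(a, b, gamma=0.5):
    # Gaussian (RBF) kernel; gamma is an arbitrary illustrative value.
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def gram(kernel, xs):
    # Gram matrix of a kernel over a list of examples.
    return [[kernel(a, b) for b in xs] for a in xs]

def center(k):
    # Center a kernel matrix: subtract row and column means, add the grand mean.
    n = len(k)
    row = [sum(r) / n for r in k]
    col = [sum(k[i][j] for i in range(n)) / n for j in range(n)]
    total = sum(row) / n
    return [[k[i][j] - row[i] - col[j] + total for j in range(n)]
            for i in range(n)]

def frobenius(a, b):
    # Frobenius inner product of two matrices.
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def centered_alignment(k, target):
    # Cosine similarity between the two centered matrices.
    kc, tc = center(k), center(target)
    denom = math.sqrt(frobenius(kc, kc) * frobenius(tc, tc))
    return frobenius(kc, tc) / denom if denom > 0 else 0.0

def alignment_weights(grams, labels):
    # Weight each base kernel by its (non-negative) alignment with y y^T.
    target = [[yi * yj for yj in labels] for yi in labels]
    scores = [max(centered_alignment(k, target), 0.0) for k in grams]
    total = sum(scores)
    if total == 0:
        return [1.0 / len(grams)] * len(grams)  # fall back to uniform weights
    return [s / total for s in scores]

def combine(grams, weights):
    # Weighted sum of the base kernel matrices.
    n = len(grams[0])
    return [[sum(w * k[i][j] for w, k in zip(weights, grams))
             for j in range(n)] for i in range(n)]

# Hypothetical features per table: (fraction of numeric cells, header score).
features = [(0.9, 0.8), (0.8, 0.9), (0.1, 0.2), (0.2, 0.1)]
labels = [1, 1, -1, -1]  # +1 = "good" data table, -1 = layout table

grams = [gram(linear_kernel, features), gram(rbf_kernel, features)]
weights = alignment_weights(grams, labels)
combined = combine(grams, weights)
print("kernel weights:", weights)
```

The combined matrix could then be handed to any SVM implementation that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`. The techniques in [1] and [2] go further than this heuristic, for instance by optimizing the weights jointly rather than scoring each kernel independently.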

We can also understand the tables better by leveraging the Knowledge Graph. In particular, we improved our algorithms for identifying the context and topics of each table, the entities represented in the table, and the properties they have. This knowledge not only helps our classifier make better decisions about the quality of a table, but also enables better matching of tables to the user query.

Finally, you will notice that we added an easy way for our users to import Web tables found through Table Search into their Google Drive account as Fusion Tables. Now that we can better identify good tables, the import feature enables our users to further explore the data. Once in Fusion Tables, the data can be visualized, updated, and accessed programmatically using the Fusion Tables API.

These enhancements are just the start. We are continually improving the quality of Table Search and adding features to it.

Stay tuned for more from Boulos Harb, Afshin Rostamizadeh, Fei Wu, Cong Yu and the rest of the Structured Data Team.


[1] Algorithms for Learning Kernels Based on Centered Alignment
[2] Generalization Bounds for Learning Kernels
Posted in Structured Data