Compact System

  • Subscribe to our RSS feed.
  • Twitter
  • StumbleUpon
  • Reddit
  • Facebook
  • Digg

Tuesday, 20 May 2008

Machine Learning Meeting

Posted on 17:32 by Unknown
Posted by Phil Long

Machine Learning is a branch of Artificial Intelligence in which, naturally enough, the aim is to get computers to learn: things like improving performance over time, and recognizing general tendencies among a number of specific cases.  We have many ways to exploit Machine Learning programs, and a lot of data to give them.  Machine Learning helps us to estimate what content users like most, what content is even legitimate, and how to match ads to content.  It also plays key roles in products such as Google Translate and 1-800-GOOG-411.

Last week, Google held a Machine Learning Summit in New York, gathering together engineers from around the world, including participants from Beijing, Zurich, Boston, Haifa, and Mountain View. The program included invited talks by CJ Lin and Peng Xu, and many shorter presentations by Googlers on topics including broadly applicable Machine Learning software, especially challenging applications, and some mathematical analysis.
Read More
Posted in | No comments

Tuesday, 6 May 2008

Can You Publish at Google?

Posted on 15:10 by Unknown
Posted by Rich Gossweiler

As part of the interview process at Google we try to describe what it is like to do research here. A common question I get is "How hard is it to publish at Google?" I want to dispel the myth that it is hard. It is easy to publish, easy to put code into open source, easy to give talks, etc. But it is also easy for great research to become great engineering, and that is an incredible lure. Great barriers of despair exist between research and development at many companies; researchers can find it hard to have impact beyond demos, papers, or patents.

Here at Google it is not uncommon for researchers to work on products (and is modus operandi for me); you can come up with something interesting, experiment to convince yourself and others that it is worthwhile, and then work as part of the team to build it and help create world-changing products. But are you willing to do that? That decision is the hard part, not publishing a paper.

I think from a Google standpoint, we need to make sure these barriers don't form, that making products, experimenting and having a venue for trying bold new approaches continues to be part of the culture.

Read More
Posted in | No comments

Thursday, 1 May 2008

VisualRank

Posted on 13:30 by Unknown
Posted by Shumeet Baluja and Yushi Jing

At WWW-2008, in Beijing, China, we presented our paper "PageRank for Product Image Search". In this paper, we presented a system that used visual cues, instead of solely text information, to determine the rank of images. The idea was simple: find common visual themes in a set of images, and then find a small set of images that best represented those themes. The resulting algorithm wound up being PageRank, but on an entirely inferred graph of image similarities. Since the release of the paper, we've noticed lots of coverage in the press and have received quite a few questions. We thought we could answer a few of them here.


"Why did we choose to use products for our test case?" First and foremost, product queries are popular in actual usage; addressing them is important. Second, users have strong expectations of what results we should return for these queries; therefore, this category provides an important set of examples that we need to address especially carefully. Third, on a pragmatic note, they lend themselves well to the type of "image features" that we selected in this study. Since the publication of the paper, we've also extended our results to other query types, including travel-related queries. One of the nice features of the approach is that (we hope) it will be easy to extend to new domains; as research in measuring image or object similarity continues, the advances can easily be incorporated into the similarity calculation to compute the underlying graph; the computations on the graph do not change.

"Where are we going from here?" Besides broadening the sets of queries (and sets of features) for which we can use this approach, there are three directions we're exploring. First, estimating similarity measures for all of the images on the web is computationally expensive; approximations or alternative computations are needed. Second, we hope to evaluate our approach with respect to the large number of recently proposed alternative clustering methods. Third, many variations of PageRank can be used in quite interesting ways for image search. For example, we can use some of these previously published methods to reintroduce, in a meaningful manner, the textual information that the VisualRank algorithm removed. In the end, we have an approach that has an easy integration with both text and visual clues. Stay tuned for more on that in the coming months.

And now to answer the most commonly asked question, "Is it live?" Not yet. Currently, it is research in progress (click here to help speed up the process). In the meantime, though, if you'd like another sneak peek of our research on large graphs, this time in the context of YouTube datamining, just follow the link.

Finally, we want to extend our deepest thanks to the people who helped on this project, especially the image-search team; without their help, this research would not have been possible.
Read More
Posted in | No comments
Newer Posts Older Posts Home
Subscribe to: Comments (Atom)

Popular Posts

  • CDC Birth Vital Statistics in BigQuery
    Posted by Dan Vanderkam, Software Engineer Google’s BigQuery Service lets enterprises and developers crunch large-scale data sets quickly...
  • Towards Energy-Proportional Datacenters
    Posted by Dennis Abts, Michael R. Marty, Philip M. Wells, Peter Klausler, and Hong Liu This is part of the series highlighting some notable...
  • Site Reliability Engineers: “solving the most interesting problems”
    Posted by Chris Reid, Sydney Staffing team I recently sat down with Ben Appleton, a Senior Staff Software Engineer, to talk about his recent...
  • Our Faculty Institute brings faculty back to the drawing board
    Posted by Nina Kim Schultz, Google Education Research Cross-posted with the Official Google Blog School may still be out for summer, but tea...
  • Market Algorithms and Optimization Meeting
    Posted by  Vahab S. Mirrokni and Muthu Muthukrishnan Google auctions ads, and enables a market with millions of advertisers and users.  This...
  • Our Unique Approach to Research
    Posted by  Alfred Spector , Vice President of Research and Special Initiatives Google started as a research project —and research has remain...
  • Impact of Organic Ranking on Ad Click Incrementality
    Posted by David Chan, Statistician and Lizzy Van Alstine, Research Evangelist  In 2011, Google released a Search Ads Pause research study w...
  • Large-scale graph computing at Google
    Posted by Grzegorz Czajkowski, Systems Infrastructure Team If you squint the right way, you will notice that graphs are everywhere. For exam...
  • Continuing the quest for future computer scientists with CS4HS
    Erin Mindell, Program Manager, Google Education Computer Science for High School (CS4HS) began five years ago with a simple question: How c...
  • Millions of Core-Hours Awarded to Science
    Posted by Andrea Held, Program Manager, University Relations In 2011 Google University Relations launched a new academic research awards pr...

Categories

  • accessibility
  • ACL
  • ACM
  • Acoustic Modeling
  • ads
  • adsense
  • adwords
  • Africa
  • Android
  • API
  • App Engine
  • App Inventor
  • Audio
  • Awards
  • Cantonese
  • China
  • Computer Science
  • conference
  • conferences
  • correlate
  • crowd-sourcing
  • CVPR
  • datasets
  • Deep Learning
  • distributed systems
  • Earth Engine
  • economics
  • Education
  • Electronic Commerce and Algorithms
  • EMEA
  • EMNLP
  • entities
  • Exacycle
  • Faculty Institute
  • Faculty Summit
  • Fusion Tables
  • gamification
  • Google Books
  • Google+
  • Government
  • grants
  • HCI
  • Image Annotation
  • Information Retrieval
  • internationalization
  • Interspeech
  • jsm
  • jsm2011
  • K-12
  • Korean
  • Labs
  • localization
  • Machine Hearing
  • Machine Learning
  • Machine Translation
  • MapReduce
  • market algorithms
  • Market Research
  • ML
  • MOOC
  • NAACL
  • Natural Language Processing
  • Networks
  • Ngram
  • NIPS
  • NLP
  • open source
  • operating systems
  • osdi
  • osdi10
  • patents
  • ph.d. fellowship
  • PiLab
  • Policy
  • Public Data Explorer
  • publication
  • Publications
  • renewable energy
  • Research Awards
  • resource optimization
  • Search
  • search ads
  • Security and Privacy
  • SIGMOD
  • Site Reliability Engineering
  • Speech
  • statistics
  • Structured Data
  • Systems
  • Translate
  • trends
  • TV
  • UI
  • University Relations
  • UNIX
  • User Experience
  • video
  • Vision Research
  • Visiting Faculty
  • Visualization
  • Voice Search
  • Wiki
  • wikipedia
  • WWW
  • YouTube

Blog Archive

  • ►  2013 (51)
    • ►  December (3)
    • ►  November (9)
    • ►  October (2)
    • ►  September (5)
    • ►  August (2)
    • ►  July (6)
    • ►  June (7)
    • ►  May (5)
    • ►  April (3)
    • ►  March (4)
    • ►  February (4)
    • ►  January (1)
  • ►  2012 (59)
    • ►  December (4)
    • ►  October (4)
    • ►  September (3)
    • ►  August (9)
    • ►  July (9)
    • ►  June (7)
    • ►  May (7)
    • ►  April (2)
    • ►  March (7)
    • ►  February (3)
    • ►  January (4)
  • ►  2011 (51)
    • ►  December (5)
    • ►  November (2)
    • ►  September (3)
    • ►  August (4)
    • ►  July (9)
    • ►  June (6)
    • ►  May (4)
    • ►  April (4)
    • ►  March (5)
    • ►  February (5)
    • ►  January (4)
  • ►  2010 (44)
    • ►  December (7)
    • ►  November (2)
    • ►  October (9)
    • ►  September (7)
    • ►  August (2)
    • ►  July (7)
    • ►  June (3)
    • ►  May (2)
    • ►  April (1)
    • ►  March (1)
    • ►  February (1)
    • ►  January (2)
  • ►  2009 (44)
    • ►  December (8)
    • ►  November (4)
    • ►  August (4)
    • ►  July (5)
    • ►  June (5)
    • ►  May (4)
    • ►  April (6)
    • ►  March (3)
    • ►  February (1)
    • ►  January (4)
  • ▼  2008 (11)
    • ►  December (1)
    • ►  November (1)
    • ►  October (1)
    • ►  September (1)
    • ►  July (1)
    • ▼  May (3)
      • Machine Learning Meeting
      • Can You Publish at Google?
      • VisualRank
    • ►  April (1)
    • ►  March (1)
    • ►  February (1)
  • ►  2007 (9)
    • ►  October (1)
    • ►  September (2)
    • ►  August (1)
    • ►  July (1)
    • ►  June (2)
    • ►  February (2)
  • ►  2006 (15)
    • ►  December (1)
    • ►  November (1)
    • ►  September (1)
    • ►  August (1)
    • ►  July (1)
    • ►  June (2)
    • ►  April (3)
    • ►  March (4)
    • ►  February (1)
Powered by Blogger.

About Me

Unknown
View my complete profile