March 2013 ~ Compact System

Wednesday, 27 March 2013

Education Awards on Google App Engine

Posted on 10:00

Posted by Andrea Held, Google University Relations

Cross-posted with Google Developers Blog

Last year we invited proposals for innovative projects built on Google’s infrastructure. Today we are pleased to announce the 11 recipients of a Google App Engine Education Award. Professors and their students are using the award in cloud computing courses to study databases, distributed systems, and web mashups, and to build educational applications. Each selected project received $1,000 in Google App Engine credits.

Awarding computational resources to classroom projects is always gratifying. It is impressive to see the creative ideas students and educators bring to these programs.
Below is a brief introduction to each project. Congratulations to the recipients!

John David N. Dionisio, Loyola Marymount University
Project description: The objective of this undergraduate database systems course is for students to implement one database application in two technology stacks: a traditional relational database and Google App Engine. Students are asked to study both models and provide concrete comparison points.

Xiaohui (Helen) Gu, North Carolina State University
Project description: Advanced Distributed Systems Class. The goal of the project is for students to learn distributed systems concepts by developing real management systems for distributed systems and testing them on real-world cloud computing infrastructures such as Google App Engine.

Shriram Krishnamurthi, Brown University
Project description: WeScheme is a programming environment that runs in the Web browser and supports interactive development. WeScheme uses App Engine to handle user accounts, server-side compilation, and file management.

Feifei Li, University of Utah
Project description: A graduate-level course that will be offered in Fall 2013 on the design and implementation of large data management system kernels. The objective is to integrate features from a relational database engine with some of the new features from NoSQL systems to enable efficient and scalable data management over a cluster of commodity machines.

Mark Liffiton, Illinois Wesleyan University
Project description: TeacherTap is a free, simple classroom-response system built on Google App Engine. It lets students give instant, anonymous feedback to teachers about a lecture or discussion from any computer or mobile device with a web browser, facilitating more adaptive class sessions.

Eni Mustafaraj, Wellesley College
Project description: Topics in Computer Science: Web Mashups. A CS2 course that combines Google App Engine and MIT App Inventor. Students will learn to build apps with App Inventor to collect data about their life on campus. They will use Google App Engine to build web services and apps to host the data and remix it to create web mashups. Offered in the 2013 Spring semester.

Manish Parashar, Rutgers University
Project description: Cloud Computing for Scientific Applications -- Autonomic Cloud Computing teaches students how a hybrid HPC/Grid + Cloud cyber infrastructure can be effectively used to support real-world science and engineering applications. The goal of our efforts is to explore application formulations, and Cloud and hybrid HPC/Grid + Cloud infrastructure usage modes, that are meaningful for various classes of science and engineering application workflows.

Orit Shaer, Wellesley College
Project description: GreenTouch is a collaborative environment that enables novice users to engage in authentic scientific inquiry. It consists of a mobile user interface for capturing data in the field, a web application for data curation in the cloud, and a tabletop user interface for exploratory analysis of heterogeneous data.

Elliot Soloway, University of Michigan
Project description: WeLearn Mobile Platform: Making Mobile Devices Effective Tools for K-12. The platform makes mobile devices (Android, iOS, WP8) effective, essential tools for all-the-time, everywhere learning. WeLearn’s suite of productivity and communication apps enables learners to work collaboratively; WeLearn’s portal, hosted on Google App Engine, enables teachers to send assignments and to review and grade student artifacts. WeLearn is available to educators at no charge.

Jonathan White, Harding University
Project description: Teaching Cloud Computing in an Introduction to Engineering class for freshmen. We explore how well-designed systems are built to withstand unpredictable stresses, whether that system is a building, a piece of software or even the human body. The grant from Google is allowing us to add an overview of cloud computing as a platform that is robust under diverse loads.

Dr. Jiaofei Zhong, University of Central Missouri
Project description: By building an online Course Management System, students will be able to work on their team projects in the cloud. The system allows instructors and students to manage the course materials, including course syllabus, slides, assignments and tests in the cloud; the tool can be shared with educational institutions worldwide.

Posted in App Engine, University Relations

Wednesday, 13 March 2013

Scaling Computer Science Education

Posted on 11:15

Posted by Maggie Johnson, Director of Education and University Relations

Last week, I attended the annual SIGCSE (Special Interest Group on Computer Science Education) conference in Denver, CO. Google has been a platinum sponsor of SIGCSE for many years now, and the conference provides an opportunity for hundreds of computer science (CS) educators to share ideas and work on strategies to bring high-quality CS education to K12 and undergraduate students.

Significant accomplishments over the last few years have laid a strong foundation for scaling CS curriculum, professional development (PD) and related programs in this country. The NSF has been funding curriculum and PD around the new CS Principles Advanced Placement course. The CSTA has published standards for K12 CS and a report on the limited extent to which schools, districts and states provide CS instruction to their students. The CS advocacy group Computing in the Core even provides a toolkit for communities to follow as they urge legislators to integrate Computer Science education into the core K12 curriculum.

All of this work has made an impact, but there is still more to do.

I see our priorities in CS education as awareness and access. As CS educators, we must continue to raise awareness about the tremendous demand for jobs in the computing sector, and counter misconceptions with accurate data. Many students, parents, teachers and administrators remember the hype and disillusionment of the dot-com period and the myths about outsourcing and dwindling jobs, yet the US Bureau of Labor Statistics (BLS) reports that ⅔ of all job growth in Science and Engineering will be in Computer Science employment over the next decade. (See the 2010 BLS report.) Clearing up this misconception is essential if we hope to satisfy US labor needs with recent graduates over the next several years.

Source: Gianchandani, Erwin. Revisiting ‘Where the Jobs Are’. The Computing Community Consortium Blog post on 23 May 2012. Link accessed on 8 March 2013.

Another misconception surrounds the range of CS-focused occupations that exist. The world of CS is expanding rapidly and we should celebrate the diversity of CS applications that are gaining momentum. Instead of the archetype of a sun-starved computer scientist, or of software engineers working in isolation with few opportunities for teamwork or communication, educators can encourage project-based learning, video game development, robotics, and graphic design as more concrete representations of abstract computational thinking.

Google believes that computing and CS are critical to our future, not only in the high tech sector, but for everyone. Our economy is becoming more and more dependent on technology-based solutions, which will require a future workforce with significant levels of CS knowledge and experience. In addition, we anticipate new career opportunities opening up in the next 3-5 years as more businesses move into the cloud and shift the way they run their IT departments.

Help us get the word out about the great opportunities in computing through organizations such as code.org, ACM, and NCWIT. Google is doing its part to support CS education and outreach through many programs including CS4HS, our Exploring Computational Thinking curriculum, and several student and teacher programs. So much opportunity, so little time!

Posted in Computer Science

Tuesday, 12 March 2013

Our Commitment to Social Computing Research: Social Interactions Focused Awards Announcement

Posted on 09:00

Ed H. Chi, Staff Research Scientist

Social interactions have always been an important part of the human experience. Social interaction research has produced results ranging from the influence of social networks on our behavior [Aral2012] to the effect of social belonging on health [Walton2011], as well as how conflicts and coordination play out in Wikipedia [Kittur2007]. Interestingly, social scientists have studied social interactions for many years, but it wasn’t until very recently that researchers could study these mechanisms through the explosion of services and data available on web-based social systems.

From information dissemination and the spread of innovation and ideas, to scientific discovery, we are seeing how a deep understanding of social interactions is affecting many different fields, such as health and education. For instance, scientists now have strong evidence that social interactions underlie many fundamental learning mechanisms starting from infancy well into adulthood [Meltzoff2009], and that peer discussions are critical in conceptual learning in college classes [Smith2009]. How might these learning science findings be built into social systems and products so that users maximize what they learn on the Web?

We know that interactions on the Web are diverse and people-centered. Google now enables social interactions to occur across many of our products, from Google+ to Search to YouTube. To understand the future of this socially connected web, we need to investigate fundamental patterns, design principles, and laws that shape and govern these social interactions.

We envision research at the intersection of disciplines including Computer Science, Human-Computer Interaction (HCI), Social Science, Social Psychology, Machine Learning, Big Data Analytics, Statistics and Economics. These fields are central to the study of how social interactions work, particularly when driven by new sources of data such as open data sets from Web 2.0 and social media sites, government databases, crowdsourcing, new survey techniques, and crisis management data collections. New techniques from network science and computational modeling, social network and sentiment analysis, and applications of statistical and machine learning, as well as ideas from evolutionary theory, physics, and information theory, are actively being used in social interaction research.

We’re pleased to announce that Google has awarded over $1.2 million to support the Social Interactions Research Awards, which are given to university research groups doing work in social computing and interactions. Research topics include crowdsourcing, social annotations, a social media behavioral study, social learning, conversation curation, and scientific studies of how to start online communities.

We have made awards to 15 researchers at 7 universities. We selected these proposals after a rigorous internal review. We believe the results will be broadly useful to product development and will further scientific research.

Joseph Konstan, Loren Terveen, and John Riedl from University of Minnesota. Precision Crowdsourcing: Closing the Loop to turn Information Consumers into Information Contributors.
Mor Naaman from Rutgers University, and Oded Nov from Polytechnic Institute of New York University. Examining the Impact of Social Traces on Page Visitors’ Opinions and Engagement.
Paul Resnick, Eytan Adar, and Cliff Lampe from University of Michigan. MTogether: A Living Lab for Social Media Research.
Marti Hearst from UC Berkeley. Understanding Social Learning Among Subgroups Within Large Online Learning Environments.
David Karger and Rob Miller from MIT. Crowdsourced Curation of Conversations.
Robert Kraut, Laura Dabbish, Jason Hong, and Aniket Kittur from CMU. Successfully Starting Online Groups.

We look forward to working with these researchers, and we hope that we will jointly push the frontier of social interactions research to the next level.

References
[Aral2012] Aral, S., & Walker, D. (2012). Identifying Influential and Susceptible Members of Social Networks. Science, 337(6092), 337–341. doi:10.1126/science.1215842
[Walton2011] Walton, G. M., & Cohen, G. L. (2011). A Brief Social-Belonging Intervention Improves Academic and Health Outcomes of Minority Students. Science, 331(6023), 1447–1451. doi:10.1126/science.1198364
[Kittur2007] Kittur, A., Suh, B., Pendleton, B., & Chi, E. H. (2007, April). He Says, She Says: Conflict and Coordination in Wikipedia. In Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI 2007), 453–462. ACM Press, San Jose, CA.
[Meltzoff2009] Meltzoff, A. N., Kuhl, P. K., Movellan, J., & Sejnowski, T. J. (2009). Foundations for a New Science of Learning. Science, 325(5938), 284–288. doi:10.1126/science.1175626
[Smith2009] Smith, M. K., Wood, W. B., Adams, W. K., Wieman, C., Knight, J. K., Guild, N., & Su, T. T. (2009). Why Peer Discussion Improves Student Performance on In-Class Concept Questions. Science, 323(5910), 122–124. doi:10.1126/science.1165919

Posted in Research Awards, University Relations

Friday, 8 March 2013

Learning from Big Data: 40 Million Entities in Context

Posted on 10:30

Posted by Dave Orr, Amar Subramanya, and Fernando Pereira, Google Research

When someone mentions Mercury, are they talking about the planet, the god, the car, the element, Freddie, or one of some 89 other possibilities? This problem is called disambiguation (a word that is itself ambiguous), and while it’s necessary for communication, and humans are amazingly good at it (when was the last time you confused a fruit with a giant tech company?), computers need help.

To provide that help, we are releasing the Wikilinks Corpus: 40 million total disambiguated mentions within over 10 million web pages -- over 100 times bigger than the next largest corpus (about 100,000 documents; see the table below for mention and entity counts). The mentions are found by looking for links to Wikipedia pages where the anchor text of the link closely matches the title of the target Wikipedia page. If we think of each page on Wikipedia as an entity (an idea we’ve discussed before), then the anchor text can be thought of as a mention of the corresponding entity.
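The exact matching criterion is not spelled out in this post. Purely as an illustration, one way to test whether anchor text "closely matches" a page title is to normalize both strings and compare them with a string-similarity score. Everything in the sketch below is an assumption made for the example: the helper names `title_from_url` and `closely_matches`, the normalization steps, and the 0.8 threshold are hypothetical and are not the procedure used to build the corpus.

```python
import re
from difflib import SequenceMatcher
from urllib.parse import unquote


def title_from_url(wiki_url: str) -> str:
    """Turn a Wikipedia URL into a readable title (hypothetical helper)."""
    title = unquote(wiki_url.rsplit("/wiki/", 1)[-1]).replace("_", " ")
    # Drop a trailing disambiguator such as " (planet)".
    return re.sub(r"\s*\([^)]*\)$", "", title).strip()


def closely_matches(anchor_text: str, wiki_url: str, threshold: float = 0.8) -> bool:
    """Return True if the anchor text is similar enough to the page title.

    The similarity measure and the threshold are illustrative assumptions.
    """
    anchor = " ".join(anchor_text.lower().split())  # lowercase, collapse spaces
    title = title_from_url(wiki_url).lower()
    return SequenceMatcher(None, anchor, title).ratio() >= threshold
```

Under these assumptions, a link reading "Buick Riviera" that points at the Buick_Riviera page would count as a mention, while a link reading "click here" would not.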

Dataset	Number of Mentions	Number of Entities
Bentivogli et al. (data) (2008)	43,704	709
Day et al. (2008)	less than 55,000	3,660
Artiles et al. (data) (2010)	57,357	300
Wikilinks Corpus	40,323,863	2,933,659

What might you do with this data? Well, we’ve already written one ACL paper on cross-document co-reference (and received lots of requests for the underlying data, which partly motivates this release). And really, we look forward to seeing what you are going to do with it! But here are a few ideas:

Look into coreference -- when different mentions refer to the same entity -- or entity resolution -- matching a mention to the underlying entity
Work on the bigger problem of cross-document coreference, which is how to find out if different web pages are talking about the same person or other entity
Learn things about entities by aggregating information across all the documents they’re mentioned in
Type tagging tries to assign types (they could be broad, like person, location, or specific, like amusement park ride) to entities. To the extent that the Wikipedia pages contain the type information you’re interested in, it would be easy to construct a training set that annotates the Wikilinks entities with types from Wikipedia.
Work on any of the above, or more, on subsets of the data. With existing datasets, it wasn’t possible to work on just musicians or chefs or train stations, because the sample sizes would be too small. But with 10 million Web pages, you can find a decent sampling of almost anything.
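As a small sketch of the aggregation and subsetting ideas above, mentions can be grouped by the Wikipedia page they link to, and then filtered down to the entities of interest. The tuple layout and the function names `group_by_entity` and `select_entities` are assumptions made for this example, not part of the released tools.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# (page URL, anchor text, Wikipedia target URL); this layout is an assumption.
Mention = Tuple[str, str, str]


def group_by_entity(mentions: Iterable[Mention]) -> Dict[str, Dict[str, Set[str]]]:
    """Collect, per entity, the surface forms used and the pages that mention it."""
    by_entity: Dict[str, Dict[str, Set[str]]] = defaultdict(
        lambda: {"surface_forms": set(), "pages": set()}
    )
    for page_url, anchor_text, target in mentions:
        by_entity[target]["surface_forms"].add(anchor_text)
        by_entity[target]["pages"].add(page_url)
    return dict(by_entity)


def select_entities(
    by_entity: Dict[str, Dict[str, Set[str]]], wanted: Iterable[str]
) -> Dict[str, Dict[str, Set[str]]]:
    """Keep only the entities whose target URL is in `wanted`."""
    wanted_set = set(wanted)
    return {t: info for t, info in by_entity.items() if t in wanted_set}
```

A list of wanted targets (for example, all pages about musicians) would have to come from elsewhere, such as Wikipedia category data.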

Gory Details

How do you actually get the data? It’s right here: Google’s Wikilinks Corpus. Tools and data with extra context can be found on our partners’ page: UMass Wiki-links. Understanding the corpus, however, is a little bit involved.

For copyright reasons, we cannot distribute actual annotated web pages. Instead, we’re providing an index of URLs, and the tools to create the dataset, or whichever slice of it you care about, yourself. Specifically, we’re providing:

The URLs of all the pages that contain labeled mentions, which are links to English Wikipedia
The anchor text of the link (the mention string), the Wikipedia link target, and the byte offset of the link for every page in the set
The byte offsets of the 10 least frequent words on the page, to act as a signature to ensure that the underlying text hasn’t changed -- think of this as a version, or fingerprint, of the page
Software tools (on the UMass site) to: download the web pages; extract the mentions, with ways to recover if the byte offsets don’t match; select the text around the mentions as local context; and compute evaluation metrics over predicted entities.
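To make the fingerprint idea concrete, here is a minimal sketch of how a downloaded page could be checked against its rare-word signature. It assumes that offsets count bytes from the start of the raw page and that words are UTF-8 encoded; both are assumptions, and the function names are hypothetical. The UMass tools are the reference for how the check and the recovery step are actually done.

```python
from typing import Iterable, Tuple


def token_at_offset(page_bytes: bytes, word: str, offset: int) -> bool:
    """Check that `word` appears at exactly `offset` bytes into the page."""
    if offset < 0:
        return False
    encoded = word.encode("utf-8")  # assumed encoding
    return page_bytes[offset:offset + len(encoded)] == encoded


def fingerprint_match_ratio(
    page_bytes: bytes, tokens: Iterable[Tuple[str, int]]
) -> float:
    """Fraction of (word, offset) fingerprint tokens found where expected."""
    token_list = list(tokens)
    if not token_list:
        return 0.0
    hits = sum(token_at_offset(page_bytes, w, o) for w, o in token_list)
    return hits / len(token_list)
```

A ratio of 1.0 suggests the page is unchanged since it was indexed; a lower ratio suggests the text has shifted and the mention offsets need to be recovered some other way.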

The format looks like this:

URL http://1967mercurycougar.blogspot.com/2009_10_01_archive.html

MENTION Lincoln Continental Mark IV 40110 http://en.wikipedia.org/wiki/Lincoln_Continental_Mark_IV

MENTION 1975 MGB roadster 41481 http://en.wikipedia.org/wiki/MG_MGB

MENTION Buick Riviera 43316 http://en.wikipedia.org/wiki/Buick_Riviera

MENTION Oldsmobile Toronado 43397 http://en.wikipedia.org/wiki/Oldsmobile_Toronado

TOKEN seen 58190

TOKEN crush 63118

TOKEN owners 69290

TOKEN desk 59772

TOKEN relocate 70683

TOKEN promote 35016

TOKEN between 70846

TOKEN re 52821

TOKEN getting 68968

TOKEN felt 41508
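A record like the one above can be read with a few lines of Python. This is a minimal sketch, assuming that fields are separated by whitespace as they appear here, that each page starts with a URL line, and that blank lines can be ignored; the `Page` class and `parse_wikilinks` function are names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Optional, Tuple


@dataclass
class Page:
    url: str
    # (anchor text, byte offset, Wikipedia target URL)
    mentions: List[Tuple[str, int, str]] = field(default_factory=list)
    # (rare word, byte offset)
    tokens: List[Tuple[str, int]] = field(default_factory=list)


def parse_wikilinks(lines: Iterable[str]) -> Iterator[Page]:
    """Yield one Page per URL record, with its MENTION and TOKEN lines."""
    page: Optional[Page] = None
    for raw in lines:
        parts = raw.strip().split(None, 1)
        if len(parts) != 2:
            continue  # blank or malformed line
        kind, rest = parts
        if kind == "URL":
            if page is not None:
                yield page
            page = Page(url=rest)
        elif page is None:
            continue  # data before the first URL line is ignored
        elif kind == "MENTION":
            # The anchor text may contain spaces, so split from the right.
            text, offset, target = rest.rsplit(None, 2)
            page.mentions.append((text, int(offset), target))
        elif kind == "TOKEN":
            word, offset = rest.rsplit(None, 1)
            page.tokens.append((word, int(offset)))
    if page is not None:
        yield page
```

Splitting from the right keeps multi-word anchor text such as "Lincoln Continental Mark IV" intact, since the offset and the target URL never contain spaces.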

We’d love to hear what you’re working on, and look forward to what you can do with 40 million mentions across over 10 million web pages!

Thanks to our collaborators at UMass Amherst: Sameer Singh and Andrew McCallum.

Posted in Natural Language Processing, wikipedia
