Compact System

Thursday, 28 July 2011

President's Council Recommends Open Data for Federal Agencies

Posted on 10:58 by Unknown
Posted by Alon Halevy, Senior Staff Research Scientist

Cross-posted with the Public Sector and Elections Lab Blog

One of the things I most enjoy about working on data management is the ability to work on a variety of problems, both in the private sector and in government. I recently had the privilege of serving on a working group of the President’s Council of Advisors on Science and Technology (PCAST) studying the challenges of conserving the nation’s ecosystems. The report, titled “Sustaining Environmental Capital: Protecting Society and the Economy,” was presented to President Obama on July 18th, 2011. The full report is now available to the public.

The press release announcing the report summarizes its recommendations:
The Federal Government should launch a series of efforts to assess thoroughly the condition of U.S. ecosystems and the social and economic value of the services those ecosystems provide, according to a new report by the President’s Council of Advisors on Science and Technology (PCAST), an independent council of the Nation’s leading scientists and engineers. The report also recommends that the Nation apply modern informatics technologies to the vast stores of biodiversity data already collected by various Federal agencies in order to increase the usefulness of those data for decision- and policy-making.

One of the key challenges we face in assessing the condition of ecosystems is that a lot of the data pertaining to these systems is locked up in individual databases. Even though this data is often collected using government funds, it is not always available to the public, and in other cases it is available but not in usable formats. This is a classic example of a data integration problem that occurs in many other domains.

The report calls for creating an ecosystem, EcoINFORMA, around data. The crucial piece of this ecosystem is making the relevant data publicly available in a timely manner and, most importantly, in machine-readable form. Publishing data embedded in a PDF file is a classic example of what does not count as machine readable. For example, if you are publishing a tabular data set, then a computer program should be able to directly access the meta-data (e.g., column names, date collected) and the data rows without having to heuristically extract them from surrounding text.
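To make the idea of machine-readable tabular data concrete, here is a minimal Python sketch. It assumes a data set published as CSV rows with a small JSON meta-data record alongside; the field names (title, date_collected, columns) and the sample values are hypothetical and are not taken from the report or from EcoINFORMA.

```python
import csv
import io
import json

# Hypothetical published data set: a JSON meta-data record plus CSV rows.
# All names and values here are made up for illustration.
METADATA_JSON = (
    '{"title": "Example species counts", '
    '"date_collected": "2011-06-30", '
    '"columns": ["site", "species", "count"]}'
)
CSV_TEXT = "site,species,count\nA,heron,12\nB,heron,7\n"


def load_dataset(metadata_json, csv_text):
    """Return (meta-data dict, list of row dicts) with no text scraping."""
    metadata = json.loads(metadata_json)
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    # The header row should agree with the columns the publisher declared.
    if reader.fieldnames != metadata["columns"]:
        raise ValueError("CSV header does not match declared columns")
    return metadata, rows


metadata, rows = load_dataset(METADATA_JSON, CSV_TEXT)
total = sum(int(row["count"]) for row in rows)
```

The program reads the column names and collection date straight from the published record, which is exactly what a table embedded in a PDF does not allow.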

Once the data is published, it can be discovered by search engines. Data from multiple sources can be combined to provide additional insight, and the data can be visualized and analyzed by sophisticated tools. The main point is that innovation should be pursued by many parties (academic, commercial, government), each applying their own expertise and passions.
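As a small illustration of combining sources, the Python sketch below joins two independently published tables on a shared key. The tables, the column names, and the join_on helper are all hypothetical; real data sets would usually need cleaning and key reconciliation first.

```python
# Hypothetical records from two independently published sources.
species_counts = [
    {"site": "A", "species": "heron", "count": 12},
    {"site": "B", "species": "heron", "count": 7},
]
site_info = [
    {"site": "A", "state": "CA"},
    {"site": "B", "state": "OR"},
]


def join_on(left, right, key):
    """Inner-join two lists of dicts on a shared key."""
    # Index the right-hand table by key so each lookup is a dict access.
    index = {}
    for right_row in right:
        index.setdefault(right_row[key], []).append(right_row)
    joined = []
    for left_row in left:
        for right_row in index.get(left_row[key], []):
            # Merge the two records; left-hand values win on name clashes.
            joined.append({**right_row, **left_row})
    return joined


combined = join_on(species_counts, site_info, "site")
```

Each combined record now carries both the count and the state, something neither source provides on its own.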

There is a subtle point about how much meta-data should be provided before publishing the data. Unfortunately, requiring too much meta-data (e.g., standard schemas) often stymies publication. When meta-data exists, that’s great, but when it’s not there or is not complete, we should still publish the data in a timely manner. If the data is valuable and discoverable, there will be someone in the ecosystem who will enhance the data in an appropriate fashion.

I look forward to seeing this ecosystem evolve, and I am excited that Google Fusion Tables, our own cloud-based service for visualizing, sharing and integrating structured data, can contribute to its development.
Posted in Fusion Tables, Government, Policy, Structured Data