Data-gov Wiki: Towards Linking Government Data

Data.gov is a website that provides US Government data to the general public to ensure better accountability and transparency. Our recent work on the Data-gov Wiki, which attempts to integrate the datasets published at Data.gov into the Linking Open Data (LOD) cloud (yielding "linked government data"), has produced 5 billion triples covering a range of topics, including government spending, environmental records, and statistics on the cost and usage of public services. In this paper, we investigate the role of Semantic Web technologies in converting, enhancing, and using linked government data. In particular, we show how government data can be (i) inter-linked by sharing the same terms and URIs, (ii) linked to existing data sources ranging from the LOD cloud (e.g. DBpedia) to the conventional web (e.g. the New York Times), and (iii) cross-linked by their knowledge provenance (which captures, among other things, derivation and revision histories).

Associated Projects

The Inference Web is a Semantic Web-based knowledge provenance infrastructure that supports interoperable explanations of sources, assumptions, learned information, and answers as an enabler for trust.

The LOGD project investigates the role of Semantic Web technologies, especially Linked Data, in producing, enhancing, and utilizing government data published on Data.gov and other websites. A large portion of the government data published on the Web is not necessarily ready for mashups. The Tetherless World Constellation (TWC) is now publishing over 8 billion RDF triples converted from hundreds of government-related datasets from Data.gov and other sources (e.g.

Citation