Parallel Identities for Managing Open Government Data

The widespread availability of Open Government Data is exposing significant challenges to trust in its unplanned applications. As data are accumulated, transformed, and presented through a chain of independent third parties, there is a growing need for sophisticated models of provenance. Significant progress has been made in describing data derivation, but that work has been limited in its ability to distinguish between transformations that change content and transformations that simply change representation. We have found that Functional Requirements for Bibliographic Records (FRBR) can, when paired with a derivational provenance model like the World Wide Web Consortium's emerging PROV standard, successfully represent web resource accession, distinguish between transformations of content and format, and facilitate verification using cryptographic digests. We show how cryptographic digest algorithms, combined with FRBR concepts, provide an automated method and tools for coordinating the multiscale identity of information resources.
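The distinction between a change of format and a change of content can be illustrated with a minimal sketch. It is not the project's tooling; it only assumes that a digest of the raw bytes plays the role of a byte-level (roughly FRBR Manifestation) identity, and that a digest of a canonical re-serialization of the parsed data plays the role of a content-level (roughly FRBR Expression) identity. The function names, the sample data, and the choice of SHA-256 and CSV are all hypothetical.

```python
import csv
import hashlib
import io


def manifestation_digest(data: bytes) -> str:
    """Digest of the exact byte stream; any re-encoding changes it."""
    return hashlib.sha256(data).hexdigest()


def expression_digest(data: bytes, delimiter: str) -> str:
    """Digest of the parsed content; stable across format-only changes.

    The rows are parsed, then re-serialized in one fixed canonical form
    before hashing, so the delimiter used in the input does not matter.
    """
    text = data.decode("utf-8")
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    canonical = io.StringIO()
    csv.writer(canonical, delimiter=",", lineterminator="\n").writerows(rows)
    return hashlib.sha256(canonical.getvalue().encode("utf-8")).hexdigest()


# Hypothetical sample data: the same table in two formats, plus an edited copy.
comma_version = b"state,rate\nNY,0.18\nCA,0.12\n"
tab_version = b"state\trate\nNY\t0.18\nCA\t0.12\n"
edited_version = b"state,rate\nNY,0.19\nCA,0.12\n"

# Format-only transformation: byte-level identity differs, content identity matches.
format_changed = manifestation_digest(comma_version) != manifestation_digest(tab_version)
content_preserved = expression_digest(comma_version, ",") == expression_digest(tab_version, "\t")

# Content transformation: content identity differs as well.
content_changed = expression_digest(comma_version, ",") != expression_digest(edited_version, ",")
```

Under these assumptions, a consumer who receives the tab-separated copy can still confirm that it carries the same content as the comma-separated original, while an edited value is detected at both levels.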

Associated Projects

The Inference Web is a Semantic Web based knowledge provenance infrastructure that supports interoperable explanations of sources, assumptions, learned information, and answers as an enabler for trust.

The LOGD project investigates the role of Semantic Web technologies, especially Linked Data, in producing, enhancing, and utilizing government data published on and other websites. A large portion of the government data published on the Web is not necessarily ready for mashups. The Tetherless World Constellation (TWC) is now publishing over 8 billion RDF triples converted from hundreds of government-related datasets from and other sources (e.g.

The National Cancer Institute’s (NCI) PopSciGrid Community Health Portal is an evolving platform demonstrating how health behavior, policy, and demographic data can be integrated, visualized, and communicated to empower communities and support new avenues of research and policy for cancer prevention and control. As a proof of concept for cyber-enabled population health research, the PopSciGrid Portal is designed to encourage trans-disciplinary collaboration, data harmonization, and the development of new computational methods for disparate health-related data.