Data types and persistent identifiers in the Deep Carbon Observatory Data Portal


Presented at the RDA Fifth Plenary Meeting

Abstract:

The Research Data Alliance (RDA) Data Type Registry (DTR) working group output addressed a core issue in data interoperability: the ability to parse, understand, and potentially reuse data retrieved from others. DTR explores ways for data creators to record, and so make explicit, the otherwise implicit assumptions of a dataset. The RDA Persistent Identifier Information Types (PIT) working group addressed the essential types of information associated with persistent identifiers (PIDs). The PIT group developed a conceptual model for structuring typed information and an application programming interface for accessing it. The Deep Carbon Observatory (DCO; http://deepcarbon.net) Data Portal enables centrally managed digital object identification, object registration, and metadata management. The portal also provides a digital object registration process for DCO Community members. DCO anticipated a large number of digital object registrations and therefore needed an appropriate mechanism to curate and reuse the registered information. The primitives in a DTR are comparable to the basic data type classes in the DCO ontology, such as Dataset, Image, Video, and Audio. The properties associated with each PID information type in PIT are comparable to the properties associated with those data type classes in the DCO ontology. Currently, a registered DCO dataset is regarded as an instance of one of those classes. There is thus potential to further annotate a registered dataset with the specific data types defined within a DTR, each of which in turn has a PID. Deploying the RDA DTR and PIT in the DCO Data Portal requires references to the existing DCO ontologies for basic data types, and will likely also require identifying new use cases from the DCO community for their specific data types.
Initial work using the DTR API in the DCO Data Portal retrieves a list of registered specific data types in a DTR and uses them in the data registration workflow. The deployment of RDA’s DTR and PIT outputs for the DCO Data Portal is expected to significantly facilitate data curation and re-use. Feedback from the accompanying DCO data science activities will also help promote the RDA DTR and PIT outputs to a broader global science community.
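
The registration step described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the portal's implementation: the sample listing, its field names (pid, name, basic_class), the identifiers, and the helper functions are all hypothetical, and the listing is an in-memory stand-in for whatever a DTR would actually return. It shows specific data types being filtered by a basic DCO class and their PIDs being attached to a registered dataset.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for a listing retrieved from a DTR.
# The PIDs, names, and field layout are invented for illustration.
SAMPLE_DTR_LISTING = [
    {"pid": "example/dtr-0001", "name": "Rock geochemistry table", "basic_class": "Dataset"},
    {"pid": "example/dtr-0002", "name": "Thin-section micrograph", "basic_class": "Image"},
    {"pid": "example/dtr-0003", "name": "Gas flux time series", "basic_class": "Dataset"},
]


@dataclass(frozen=True)
class DataType:
    """A specific data type registered in a DTR, identified by its PID."""
    pid: str
    name: str
    basic_class: str  # assumed link to a basic class such as Dataset or Image


@dataclass
class Registration:
    """A registered object: an instance of a basic class, plus type annotations."""
    title: str
    basic_class: str
    data_type_pids: list = field(default_factory=list)


def parse_listing(listing):
    """Convert raw listing entries into DataType records."""
    return [DataType(e["pid"], e["name"], e["basic_class"]) for e in listing]


def choices_for(basic_class, data_types):
    """Return the specific data types to offer for a given basic class."""
    return [dt for dt in data_types if dt.basic_class == basic_class]


def annotate(registration, data_type):
    """Attach a specific data type's PID to a registration, without duplicates."""
    if data_type.basic_class != registration.basic_class:
        raise ValueError(
            f"{data_type.name!r} is not a {registration.basic_class} data type"
        )
    if data_type.pid not in registration.data_type_pids:
        registration.data_type_pids.append(data_type.pid)
    return registration
```

In a registration form, choices_for would populate the list of specific data types once the user has picked a basic class, and annotate would record the selected type's PID alongside the dataset's other metadata.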

History

Date                       Created By
March 5, 2015 15:56:31     Xiaogang Ma
March 4, 2015 16:26:56     Xiaogang Ma
March 4, 2015 16:18:37     Xiaogang Ma

Related Projects:

Deep Carbon Observatory Data Science (DCO-DS)
Principal Investigator: Peter Fox
Co-Investigators: John S. Erickson and Jim Hendler
Description: Given the increasing data deluge, it is clear that each of the Directorates in the Deep Carbon Observatory faces diverse data science and data management needs in fulfilling both its decadal strategic objectives and its day-to-day tasks. This project will assess in detail the data science and data management needs of each DCO directorate and of the DCO as a whole, using a combination of informatics methods: use case development, requirements analysis, inventories, and interviews.
Research Data Alliance Adoption Initiatives (RDA Adoption)
Principal Investigator: Peter Fox
Co-Investigator: Xiaogang Ma
Description: The Research Data Alliance (RDA) Data Type Registry (DTR) Working Group addresses part of a core problem in interoperability among data management systems: the ability to parse, understand, and potentially reuse data retrieved from others. The RDA Persistent Identifier Information Types (PIT) Working Group addresses the essential types of information associated with persistent identifiers. We have undertaken an effort to adopt the DTR and PIT outputs in the Data Portal of the Deep Carbon Observatory (DCO) and have obtained positive results.