Child Health Exposure Analysis Repository (CHEAR)
Principal Investigator: Deborah L. McGuinness
Co Investigator: Kristin Bennett
Description: Child Health Exposure Analysis Repository Data Science Semantics
Deep Carbon Observatory Data Science (DCO-DS)
Principal Investigator: Peter Fox
Co Investigator: John S. Erickson and Jim Hendler
Description: Given this increasing data deluge, it is clear that each of the Directorates in the Deep Carbon Observatory faces diverse data science and data management needs to fulfill both their decadal strategic objectives and their day-to-day tasks. This project will assess in detail the data science and data management needs for each DCO directorate and for the DCO as a whole, using a combination of informatics methods: use case development, requirements analysis, inventories, and interviews.
Deep Time Data Infrastructure (DTDI)
Principal Investigator: Peter Fox
Description: Earth’s living and non-living components have co-evolved for 4 billion years through numerous positive and negative feedbacks. Yet our ability to document, model, and explore these complex intertwined changes has been hampered by a lack of data synthesis and integration from many complementary disciplines—mineralogy, petrology, paleobiology, geochronology, proteomics, geochemistry, and more. The rise of oxygen exemplifies the co-evolution of rocks and life, and underscores both the tantalizing opportunities and technical challenges of deciphering transient characteristics of Earth’s storied past.
Developing Ontologies for Additive Manufacturing Processes (DOfAMP)
Principal Investigator: Jim Hendler
Co Investigator: Peter Fox and Robert Hull
Description: We propose the development of the field of materials processing ontology so that the US establishes leadership in this critical technological arena. The goal is the development of a framework, language and algorithm set for organizing and categorizing the myriad relationships between materials processing, properties and structure. No ubiquitous framework currently exists for relating materials processing parameters to properties and structure that translates across multiple materials fields and technologies. In essence, an advanced “Dewey Decimal System” is needed for materials processing, such that data and knowledge that is developed in one materials processing technology can cross-pollinate across other materials technologies.
E-Science Jefferson Project on Lake George (Jefferson Project)
Principal Investigator: Deborah L. McGuinness
Co Investigator: Paulo Pinheiro
Description: The Jefferson Project at Lake George is building one of the world’s most sophisticated environmental monitoring and prediction systems, which will provide scientists and the community with a real-time picture of the health of the lake. Launched in June 2013, the project aims to understand and manage multiple complex factors—including road salt incursion, storm water runoff, and invasive species—all threatening one of the world’s most pristine natural ecosystems and an economic cornerstone of the New York tourism industry. The project is a three-year, multimillion-dollar collaboration between Rensselaer Polytechnic Institute, IBM, and The FUND for Lake George. The collaboration partners expect that the world-class scientific and technology facility at the Rensselaer Darrin Fresh Water Institute at Lake George will create a new model for predictive preservation and remediation of critical natural systems in Lake George, in New York, and ultimately around the world.
EAGER: Semantic Search (EAGER)
Principal Investigator: Jim Hendler
Description: NSF EAGER project to explore advanced semantic technology for data search.
First Responders Requirements Methodology (FirstResponders)
Principal Investigator: Deborah L. McGuinness
Co Investigator: John S. Erickson
Description: The purpose of this project is to design and prototype a requirements-gathering methodology driven by the first responders community. The methodology will include examining the current state of collecting and synthesizing responder requirements, assessing the effectiveness of that process, evaluating existing candidate platforms for use within this community, and producing a roadmap that can be used by NIST and others to achieve a solution enabling the responder community to more effectively dialogue with key stakeholders. A prototype implementation of the methodology will be developed using the roadmap and will be available for testing and evaluation and requirements gathering.
Health Data Challenge (HealthData)
Principal Investigator: Jim Hendler and Deborah L. McGuinness
Co Investigator: Kristine Gloria, Alvaro Graves, Tim Lebo, and James McCusker
Description: An infrastructure for large-scale collaboration around aggregation, generation, and publication of health-related Linked Data.
Health on the Web
Principal Investigator: Deborah L. McGuinness and Joanne S. Luciano
Description: The primary goal of the Tetherless World Constellation's Health on the Web project is to explore the next-generation web technology needed to improve health.
Inference Web
Principal Investigator: Deborah L. McGuinness
Description: The Inference Web is a Semantic Web based knowledge provenance infrastructure that supports interoperable explanations of sources, assumptions, learned information, and answers as an enabler for trust. It addresses four needs:
Provenance - if users (humans and agents) are to use and integrate data from unknown, uncertain, or multiple sources, they need provenance metadata for evaluation.
Interoperability - more systems are using varied sources and multiple information manipulation engines, thus increasing interoperability requirements.
Explanation/Justification - if information has been manipulated (i.e., by sound deduction or by heuristic processes), information manipulation trace information should be available.
Trust - if some sources are more trustworthy than others, trust ratings are desired.
The Inference Web consists of two important components:
Proof Markup Language (PML) Ontology - Semantic Web based representation for exchanging explanations, including provenance information (annotating the sources of knowledge), justification information (annotating the steps for deriving the conclusions or executing workflows), and trust information (annotating trustworthiness assertions about knowledge and sources).
IW Toolkit - Web-based and standalone tools that help human users browse, debug, explain, and abstract the knowledge encoded in PML.
Linking Open Government Data (LOGD)
Principal Investigator: Deborah L. McGuinness and Jim Hendler
Description: The LOGD project investigates the role of Semantic Web technologies, especially Linked Data, in producing, enhancing and utilizing government data published on and other websites.
Marine Biodiversity Virtual Laboratory (MBVL)
Principal Investigator: Peter Fox, Heidi Sosik, Stace Beaulieu, and David Mark Welch
Description: This research effort brings together computational and information scientists, oceanographers and microbiologists to develop a Marine Biodiversity Virtual Laboratory (MBVL). In addition to research investigations of marine ecosystems, the Virtual Laboratory provides a platform for education via student diversity programs at the three institutions. The important learning opportunities will be two-fold for students: (1) to learn about, model, and make predictions for biodiversity in natural systems, and (2) to be exposed to working in an interdisciplinary team that includes both natural scientists and computer scientists.
Mobile Health
Principal Investigator: Deborah L. McGuinness
Description: The Mobile Health project aims to build semantic representations of medical data collected from a variety of consumer and medical grade devices and to integrate those data on an individual's mobile smartphone. Combined with the reasoning capabilities of semantic web and technologies such as IBM Watson, this project plans to enable personalized health care through the instrumented self.
National Ocean Council Vocabulary (NOCV)
Description: The objective of the NOCV project is to demonstrate technical capabilities that are available and can be deployed to implement solutions to key needs identified in the National Ocean Policy in regard to data and the decision-support requirements that arise from data-oriented questions.
Nightingale: Proactive Depression Treatment with Individual Social, Sensory and Virtual Technologies (Nightingale)
Principal Investigator: Joanne S. Luciano, Mei Si, and Jonas Braasch
Description: Depression costs! Each year, billions of dollars are wasted and millions of lives are disrupted because depression is complex, access is limited, treatments are one-size-fits-all, and therapies are trial and error. Nightingale aims to develop innovative solutions using social machines, virtual reality, and pervasive sensor technologies. The goals are: (1) predict an upcoming depression based on personalized features and cognitive modeling, (2) intervene using intelligent synthetic characters and augmented realities with telepresence capabilities for therapists, and (3) provide intelligent tools to users to inform themselves about their condition. Nightingale monitors the user using non-invasive cameras and biosensors, web-based weather data and information about the user’s daily activities. Nightingale intervenes with constructive suggestions, a positive environment, or an alert that medical help is needed. Together, these solutions can better target the right treatments for the right patients at the right time.
Description: Tasks for various TWC projects related to data access and the OPeNDAP software products.
ORGPedia Corporate Intelligence (ORGPedia)
Principal Investigator: Jim Hendler
Description: This project creates prototypes for linking open corporate data for the ORGPedia project. It will be a portal for integrating disparate datasets about corporations across levels of government and agencies.
Ontology-Enabled Polymer Nanocomposite Open Community Data Resource (Nanomine)
Principal Investigator: Deborah L. McGuinness, Cate Brinson, Wei Chen, and Linda Schadler
Description: Our evolving semantics-driven data resource, named NanoMine, is an open access, user-friendly, living, growing data resource for the polymer nanocomposites community that is scalable and enables improved understanding of processing-structure-property relationships, and thus facilitates faster nanocomposite design and insertion into advanced applications. By bringing together the data that is scattered throughout the public literature and private files and creating a protocol for recording and tagging data, this resource is an unprecedented compilation of information that is accessible. Tools within the resource allow users to visualize complex data, analyze images from their work, and design new polymer nanocomposite materials. For NanoMine to realize broad community acceptance and address scientific questions at the forefront of technology, it marries cutting-edge cyberinfrastructure with a robust set of data and tools.
Population Science Grid (PopSciGrid)
Principal Investigator: Deborah L. McGuinness
Description: The National Cancer Institute's (NCI) PopSciGrid Community Health Portal is an evolving platform demonstrating how health behavior, policy, and demographic data can be integrated, visualized, and communicated to empower communities and support new avenues of research and policy for cancer prevention and control. As a proof of concept for cyber-enabled population health research, the PopSciGrid Portal is designed to encourage trans-disciplinary collaboration, data harmonization, and development of new computational methods for disparate health related data.
Rensselaer Polytechnic Institute Data Services
Principal Investigator: Peter Fox and Jim Hendler
Description: Providing data storage, data services, data access, data discovery, data search, and data lifecycle and management for RPI research projects.
Repurposing Drugs with Semantics (ReDrugS)
Principal Investigator: Deborah L. McGuinness and Jonathan Dordick
Description: We aim to find new effective treatments for disease using existing drugs. Our approach is to gather and integrate existing data using semantic technologies to help discover promising drug repurposing opportunities.
SemantEco Annotator
Principal Investigator: Deborah L. McGuinness
Co Investigator: Patrice Seyed
Description: Generating useful RDF linked data is not a straightforward process for scientists using today's tools. In this project we introduce the SemantEco Annotator, a semantic web application that leverages community-based vocabularies and ontologies during the translation process itself to ease the process of drawing out implicit relationships in tabular data so that they may be immediately available for use within the LOD cloud. Our goal for the SemantEco Annotator is to make advanced RDF translation techniques available to the layperson.
Semantic Numeric Exploration Technology (SemNExT)
Principal Investigator: Kristin Bennett and Deborah L. McGuinness
Description: SemNExT combines numeric analysis of data with semantic understanding and explanation technologies to provide a holistic means of exploring robust datasets.
Semantic Sea Ice Interoperability Initiative (SSIII)
Principal Investigator: Ruth Duerr, Siri Jodha Singh Khalsa, and Mark Parsons
Co Investigator: Peter Fox and Deborah L. McGuinness
Description: SSIII is a National Science Foundation (NSF) funded effort to enhance the interoperability of sea ice data and to establish a network of practitioners working to enhance semantic interoperability of all Arctic data. SSIII is a collaborative project between NSIDC and the Rensselaer Polytechnic Institute (RPI) Tetherless World Constellation project. We seek to build on the work initiated under the International Polar Year (IPY) and create a community of practice working to improve interoperability within the Polar Information Commons (PIC), the Sustained Arctic Observing Network (SAON), and broader global systems.
Semantic Vernaculars for Fungi (SVF)
Principal Investigator: Deborah L. McGuinness
Co Investigator: Nathan Wilson
Description: Fungi are typically referred to by either scientific or common names. Neither of these terminologies meets the need for well-defined, persistent definitions of groups of fungi that exhibit similar macroscopic qualities but may be dissimilar genetically. We propose a community-developed vocabulary that can be used to identify mushrooms based on properties that can be observed in the field (without microscopic or genomic examination). We show how an ontology can be used to develop and organize the terms and definitions and to enable applications based on the vocabulary.
Semantic Water Quality Portal (SemantAQUA)
Principal Investigator: Deborah L. McGuinness
Co Investigator: Joanne S. Luciano
Description: We present a semantic technology-based approach to emerging environmental information systems. We used our linked data approach in the Tetherless World Constellation Semantic Water Quality Portal (TWC-SWQP). Our integration scheme uses a core domain ontology and integrates water data from different authoritative sources along with multiple regulation ontologies to enable pollution detection and monitoring. An OWL-based reasoning scheme identifies pollution events relative to user-chosen regulations. Our approach also captures and leverages provenance to improve transparency. In addition, the portal features provenance-based facet generation, query answering, and data validation over the integrated data via SPARQL. We introduce the approach and the water portal, and highlight some of its potential impacts for the future of environmental monitoring systems.
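The idea of flagging pollution events relative to a user-chosen regulation can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it stands in for the portal's OWL-based reasoning with a simple threshold comparison, and every name in it (`Measurement`, `REGULATIONS`, `find_pollution_events`, the site and provider labels) and every limit value is hypothetical, not taken from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    site: str
    characteristic: str  # what was measured, e.g. "arsenic"
    value: float         # assumed to be in the unit the regulation uses
    source: str          # provenance: which provider reported the value


# Hypothetical regulations: maximum allowed value per characteristic.
# The numbers are illustrative placeholders, not real regulatory limits.
REGULATIONS = {
    "regulation_a": {"arsenic": 0.010, "nitrate": 10.0},
    "regulation_b": {"arsenic": 0.005, "nitrate": 10.0},
}


def find_pollution_events(measurements, regulation_name):
    """Return the measurements that exceed the chosen regulation's limits."""
    limits = REGULATIONS[regulation_name]
    return [
        m for m in measurements
        if m.characteristic in limits and m.value > limits[m.characteristic]
    ]


measurements = [
    Measurement("site-1", "arsenic", 0.008, "provider-x"),
    Measurement("site-2", "arsenic", 0.012, "provider-y"),
    Measurement("site-3", "nitrate", 4.0, "provider-x"),
]

# The same data yields different events depending on the regulation chosen.
for name in REGULATIONS:
    events = find_pollution_events(measurements, name)
    print(name, [(e.site, e.source) for e in events])
```

Keeping the `source` field on each measurement is a small nod to the provenance aspect: every flagged event can be traced back to the provider that reported it.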
Semantically Enabled Faceted Browser (S2S)
Principal Investigator: Peter Fox
Co Investigator: Stephan Zednik
Description: S2S is a user interface framework that leverages the machine-readable semantics of data, services, and user interface components, or "widgets". S2S automates various tasks in UI development for search interfaces.
Semantically Enabled Modeling of Major Depressive Disorder (SEMMDD)
Principal Investigator: Joanne S. Luciano
Description: In this project, we study how different antidepressant treatments, including non-pharmacological treatments, affect the underlying brain regions, clinical symptoms, and behaviors. We use mathematical modeling and computer simulation to combine clinical research with neuroscience research.
Semantically-Enabled Science Data Integration (SESDI)
Principal Investigator: Peter Fox
Co Investigator: Deborah L. McGuinness
Description: The vast majority of explorations of the Earth system are limited in their ability to effectively explore the most important (often most difficult) problems because they are forced to interconnect at the data-element, or syntactic, level rather than at a higher scientific, or semantic, level. In many cases, syntax-only interoperability IS the state-of-the-art. In order for scientists and non-scientists to discover, access, and use data from unfamiliar sources, they are forced to learn details of the data schema and other people's naming schemes and syntax decisions. Our work is aimed at providing scientists with the option of describing what they are looking for in terms that are meaningful and natural to them, instead of in a syntax that is not. The missing element in enabling the higher-level interconnections is the technology of ontologies, ontology-equipped tools, and semantically aware interfaces between science components. Ontologies fill a major technology gap in machine-to-machine communication across multiple disciplines to advance Earth system science by enabling data integration without the need for human intervention. This project, the Semantically-Enabled Science Data Integration (SESDI), will demonstrate how ontologies implemented within existing distributed technology frameworks will provide essential, reusable, and robust support for an evolution to science measurement processing systems (or frameworks) as well as for data and information systems (or frameworks) support for NASA Science Focus Areas and Applications.
Social Practices (SPP)
Principal Investigator: Jim Hendler
Description: The overall goal of this project is to explore and establish a better understanding of privacy in this highly-networked world. This page features the tools and workflow needed to accomplish such a task. We argue that while much has been written and discussed about privacy in various domains (e.g., law, psychology, economic behavior, security, etc.), it remains unclear what exactly the privacy problem is. Our aim is to reframe our own understanding of privacy by moving away from these traditional disjointed compartments of knowledge. Moreover, given the complexity, we advocate this research question as an exemplar for the value of combining efforts between human and machine. This project features tools, workflow(s) and best practices we've developed and implemented to accomplish such a task. This is and will be a work in progress. Comments and feedback are welcome. Please contact Kristine Gloria for more information.
Streaming Data Characterization (SDC)
Principal Investigator: Deborah L. McGuinness and Mark Greaves
Description: This project aims to develop flexible window management strategies and algorithms for stream reasoning. We have proposed a stack of technologies including a sequential stream reasoning architecture and the notion of semantic importance.
TW Website Project
Description: A semantically-powered Tetherless World Website running in the Drupal CMS. This combines many standard web technologies, including RDF, SPARQL, XSLT, and XHTML.
TWC Vocabulary Development (TWC_Schemas)
Principal Investigator: Jim Hendler
Co Investigator: Joshua Shinavier
Description: provides a collection of schemas (HTML tags) that webmasters can use to mark up their pages in ways recognized by major search providers. Search engines including Bing, Google, Yahoo! and Yandex rely on this markup to improve the display of search results, making it easier for people to find the right web pages. Since early 2012, researchers at TWC RPI have been working with government and research data providers to define vocabularies for expressing the structured data that powers their web sites, using on-page markup based on vocabularies. In particular, we developed the extension, a concise vocabulary that extends for describing datasets and data catalogs. Current work includes applying Dataset to scientific datasets and developing new extensions for use by Web Observatories.
TWC Web Observatory (WebObservatory)
Principal Investigator: Deborah L. McGuinness
Co Investigator: Jim Hendler
Description: The Web Science Research Center at TWC RPI is working with other members of the Web Science Trust to create a global "Web Observatory". The global movement toward Open Data and transparency has successfully motivated the release of very large institutional and commercial data sets describing social phenomena, economic indicators and geographic trends. This proliferation of data represents a great opportunity for researchers and industry, but this data abundance also threatens to make it ever more difficult to locate, analyse, compare and interpret useful information in a consistent and reliable way; a situation which can only get worse unless we can help stakeholders perform useful analysis rather than drowning in a sea of data. A global Web Observatory will offer an institutional framework to promote the use of W3C and other standards in the development of Semantic Catalogues to globally locate existing data sets, Collection Systems to gather new global data sets, and Analytics Tools and methodologies to analyse these data sets.
Tea Ontology (ROBOT)
Description: Class project for Ontology Engineering Spring 2016, by Cara Reedy and Katie Chastain
The Asthma Files (TAF)
Principal Investigator: Michael Fortun
Co Investigator: Kim Fortun and Peter Fox
Description: The Asthma Files is an electronic archive of text, still images, video and audio that illustrate multiple perspectives on asthma, from the vantage point of affected people in different locales and communities, health care providers, and scientists from many different disciplines.
The Human-Aware Data Acquisition Framework (HADatAc)
Principal Investigator: Paulo Pinheiro
Co Investigator: Deborah L. McGuinness
ToolMatch (ToolMatch)
Description: For a given dataset, it is difficult to find the tools that can be used to work with the dataset. In many cases, the information that Tool A works with Dataset B is somewhere on the Web, but not in a readily identifiable or discoverable form. In other cases, particularly for more generalized tools, the information does not exist at all until somebody tries to use the tool on a given dataset. Thus, the simplest, most prevalent use case is for a user to search for the tools that can be used with a given dataset. A further refinement would be to specify what the tool can do with the dataset, e.g., read, visualize, map, analyze, reformat.
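The basic use case and its refinement can be illustrated with a minimal Python sketch. It assumes the tool-to-dataset information has already been collected into simple records; the record layout, the names `TOOL_DATASET_LINKS` and `tools_for`, and the sample tools and datasets are all hypothetical and are not part of the ToolMatch project itself.

```python
# Hypothetical records: (tool, dataset, what the tool can do with that dataset).
TOOL_DATASET_LINKS = [
    ("tool_a", "dataset_b", {"read", "visualize"}),
    ("tool_c", "dataset_b", {"read", "reformat"}),
    ("tool_a", "dataset_d", {"read"}),
]


def tools_for(dataset, capability=None):
    """List tools known to work with a dataset.

    If a capability such as "visualize" is given, keep only the tools
    recorded as offering that capability for the dataset.
    """
    return sorted(
        tool
        for tool, ds, capabilities in TOOL_DATASET_LINKS
        if ds == dataset and (capability is None or capability in capabilities)
    )


# Basic use case: which tools work with this dataset?
print(tools_for("dataset_b"))
# Refinement: which of them can visualize it?
print(tools_for("dataset_b", "visualize"))
```

A dataset with no recorded links simply returns an empty list, which mirrors the situation described above where the information does not exist until somebody tries a tool on that dataset.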
Web Science Research Center (WSRC)
Principal Investigator: Deborah L. McGuinness
Co Investigator: John S. Erickson
Description: Web Science is the study of the World Wide Web and its impact on both society and technology, positioning the Web as an object of scientific study unto itself. Web Science recognizes the Web as a transformational, disruptive technology; its practitioners study the Web, its components, facets and characteristics. Ultimately, Web Science is about understanding the Web and anticipating how it might evolve in the future.
Wine Agent
Principal Investigator: Deborah L. McGuinness
Description: The Wine Agent represents knowledge of wines and foods and is a demonstration platform for a large variety of Semantic Web technologies in a rich domain and is derived from previous work in the field of reasoning systems.