News and Announcements

TWed Discussion: Using Semantic Data Dictionaries for Semantic Data Conversion in SETLr
March 27, 2017
TWed Discussion, Wednesday, March 29, 2017, 6:00pm ET, Winslow Building on the RPI Campus

Please join us as TWC PhD student Katie Chastain leads us in a tour-de-force discussion of the challenges presented when including data dictionaries and codebooks in knowledge graphs, and the use of SETLr to slay such dragons. Katie's talk will include a very brief review of SETLr for noobies...
TWed Discussion: Magellan - An ontology-driven in-browser faceted data explorer
March 7, 2017
TWed Discussion, Wednesday, March 08, 2017, 6:00pm ET, Winslow Building on the RPI Campus

Please join us as Alex Schwartzberg leads us in a discussion of a new lightweight ontology-driven JSON-LD explorer that has emerged from his recent DARPA-related work.
TWed Discussion: Constructing and Maintaining CHEAR - A Community-Built and Evolved Ontology
February 27, 2017
TWed Discussion, Wednesday, March 01, 2017, 6:00pm ET, Winslow Building on the RPI Campus

Please join us as TWC grad student Sabbir Rashid leads us in a discussion of work related to an upcoming paper, "A Community-Built and Evolved Ontology and Data Standard for Childhood Health." Sabbir will also cover HADatAc and give a quick demo on one or two CHEAR related use cases using the system.
TWed Discussion: Using Information Centrality for detecting systemic anomalies in large homogeneous networks
February 21, 2017
TWed Discussion, Wednesday, February 22, 2017, 6:00pm ET, Winslow Building on the RPI Campus

Please join us as TWC PhD student Nidhi Rastogi leads us in a discussion of her recent progress in performing anomaly detection in large networks using graph analytics.
TWed Discussion: Data science and the future of the built environment: Applying the work of Tetherless World to CASE's Active Modular Phytoremediation System (AMPS)
February 8, 2017
TWed Discussion, Wednesday, February 08, 2017, 6:00pm ET, Winslow Building on the RPI Campus

Please join us as we welcome a special guest speaker, Josh Draper from RPI's Center for Architecture Science and Ecology (CASE). Josh will discuss CASE's innovative collaboration with TWC centering on data science and the built environment.
TWed Discussion: Semantic Markdown: Embedding Workflow Semantics via R Markdown
January 30, 2017
TWed Discussion, Wednesday, February 01, 2017, 6:00pm ET, Winslow Building on the RPI Campus

Please join us as John Erickson discusses recent thoughts on using R Markdown in particular to extend the RStudio environment, enabling data analysts to directly generate and publish RDF that richly describes the semantics of their scripts. This is a possible next step towards best practices for "in situ" embedding of appropriate concepts and vocabulary from established ontologies (including ProvONE and domain ontologies) into practical workflows.
TWed Discussion: Better Searching through Reformulated Queries
January 23, 2017
TWed Discussion, Wednesday, January 25, 2017, 6:00pm ET, Winslow Building on the RPI Campus

Please join us as TWC PhD Student Amar Viswanathan talks about how query reformulation can bridge the gap between systems and humans.
Ph.D. Thesis Defense Announcement: Han Wang: Knowledge Base Construction from Scientific Literature
November 14, 2016
Ph.D. Thesis Defense Announcement

Knowledge Base Construction from Scientific Literature

Han Wang

Wednesday, November 16, 2016
Winslow Building Room 1140

Knowledge Bases (KBs) have become a functional utility as a repository of information for both humans and software agents to seek confirmed facts about the world. With the wide-ranging application of KBs, automatically constructing either generic KBs or domain-specific KBs using information extracted from multiple sources such as web pages, reports, and research papers has grown into an interesting task for both academia and industry.

This dissertation presents SciKB, an end-to-end Knowledge Base Construction system, which takes in a collection of research articles within a certain scientific domain and outputs a domain-specific KB. The resultant KB contains fact triples extracted from the input documents as well as hierarchical clusters of the entities and relations involved in the facts. Each cluster aggregates entities or relations with similar semantic meanings, and the hierarchies serve as an implicit schema of the KB.

SciKB adopts an open information extraction approach to extract fact triples from the input documents, then jointly learns the distributed representations of the involved entities and relations in an unsupervised fashion, and finally utilizes the obtained representations to organize the entities and relations into hierarchical clusters. Experiments are conducted to evaluate each component of the SciKB pipeline and the results demonstrate its effectiveness in two scientific domains: Biomedical Science and Earth Science.
TWed Discussion: How the Semantic Web Was Won and Never One
November 8, 2016
TWed Discussion, Wednesday, November 09, 2016, 6:00pm ET, Winslow Building on the RPI Campus

Please join us as RPI STS PhD candidate Lindsay Poirier leads us in a fascinating re-telling of the history of the Semantic Web, considering how cultural values and styles of thinking have influenced the design of the Web, and how historical and ethnographic studies of Web architecture can contribute to Web Science.
Ph.D. Thesis Defense Announcement: Kristine Gloria: Imprudence of Reason: An Examination of Privacy Expectations
November 1, 2016
Ph.D. Thesis Defense Announcement

Imprudence of Reason: An Examination of Privacy Expectations

Kristine Gloria

Tuesday, Nov 01, 2016
Winslow Building Room 1140

Modern day digital technologies and privacy are at odds. For over a decade, this debate over privacy rights has revolved around the divergence between our collective understanding of its value in society and our individual ability to protect it. From recent massive data breaches (e.g., Anthem, Home Depot, etc.) to unauthorized government surveillance, consumer privacy protection is plagued by violations. Yet, the amount of personal data online continues to increase. This mismatch motivates the following dissertation work. To address this, I begin by critically reviewing privacy's framing problem, which I contend places too much confidence on the notion of expectations and individualized control. I explore this further in a qualitative study comprised of an online survey and semi-structured interviews. Findings were then used to derive a proposed two-mode consumer model: blind faith and survivalists. With these modes, I consider alternative interpretations of privacy, including an extension to boyd's theory of networked privacy. I conclude with a discussion for a reflexive practice to evaluate methodological approaches within privacy research that may impact future public policymaking.
TWed Discussion: Jupyter Notebook: A collaborative data science environment
October 2, 2016
TWed Discussion, Wednesday, October 05, 2016, 6pm ET, Beta Classroom, Folsom Library on the RPI Campus

Please join us as TWC PhD students Ahmed Eleish and Anirudh Prabhu guide us on an "excellent adventure": a deep-dive into the practical application of Jupyter notebooks for conducting collaborative data science in various scientific domains.
TWed Discussion: Demonstrating Wireless Sensor Networks using NS2
September 24, 2016
TWed Discussion, Wednesday, September 28, 2016, 5:30pm ET, Winslow Building on the RPI Campus

Please join us as TWC PhD student Nidhi Rastogi leads us in a discussion of the use of NS2 [1] as a means for simulating wireless sensor networks, and its potential application to her anomaly detection research.
TWed Discussion: Tricks of the Trade: An Introduction to Deep Learning
September 18, 2016
TWed Discussion, Wednesday, September 21, 2016, 7:00pm ET, Winslow Building on the RPI Campus

Please join us as TWC PhD student Matt Klawonn leads us in what promises to be a fascinating, practical introduction to deep learning using a number of different frameworks.
TWed Discussion: Remember the important things: semantic importance in stream reasoning
August 26, 2016
TWed Discussion, Wednesday, August 31, 2016, 6:00pm ET, Winslow Building on the RPI Campus

Please join us for our first TWed Talk of the Fall 2016 season as sixth-year TWC PhD student Rui Yan provides an update on his stream reasoning research, with a particular focus on his recent progress in stream window management.
Ph.D. Thesis Defense Announcement: Simon Ellis: Cognitive Gameplaying: Playing Games with Cognitive Computing
July 18, 2016
Ph.D. Thesis Defense Announcement

Cognitive Gameplaying: Playing Games with Cognitive Computing

Simon Ellis

Monday, July 18th, 2016
Winslow 1140

The analysis of games and game-playing has long been a mainstay of research in the field of artificial intelligence (AI). From the first development of game theory by Morgenstern and von Neumann and Samuel's creation of a program to play the game of draughts at a level sufficient to challenge a human to the present day, much work has been undertaken to improve the ability of a computer to comprehend, analyse and play a game at a human level. Much of this work has been in a class of games readily tractable to game-theoretical approaches, such as chequers, Othello and chess; many other games exist which by nature or design resist standard AI techniques. However, in 2011 came a demonstration of a computer agent not only playing but winning - and winning resoundingly - at such a game: the game was the renowned quiz show Jeopardy!, and the computer was IBM's 'cognitive computer', Watson.

The general method of game-playing in AI is to generate a move tree, representing each move and its possible subsequents in the computer's memory, and score this as the tree is developed; the child node of the root with the highest overall score thus represents the 'best' or 'optimal' next move. In contrast to this, Watson uses a set of small analytics systems, which IBM calls 'annotators', to add separate pieces of metadata to a common data structure as it passes down a process pipeline. Each annotator may be called as required if and when new data are presented, and as the analysis continues further sets of data are added. At the end of the pipeline there may additionally be a machine learning (ML) system, neural net or other subsystem which implements a form of 'artificial intuition'. This modular approach, in which multiple sub-components add information to the dataset for further analysis later, was fundamental to Watson winning Jeopardy! in 2011 and also provides an excellent opportunity to create much more varied and subtle computer opponents.

I present herein an alternative approach from 'traditional' game AI, derived from the concepts which underpin Watson's technology, and demonstrate its usefulness in creating agents which can both play more complex, less game-theoretical, games and use a simple strategy to augment their move selection. This is the first tabletop (board, card, or role-playing) game AI to use the concepts and methods of cognitive computing in general and Watson in particular. The system will be demonstrated using the game Infinite City, published by Alderac Entertainment Group, a tile-based game in which players attempt to control sections of a 'city' represented by those tiles. Infinite City is highly mutable and can rapidly become combinatorially explosive, rendering it significantly less amenable to traditional game search methods.
Patrick West accepts Senior Software Engineer position at Entangled Media in Boulder, Colorado
July 1, 2016

The Tetherless World Constellation announces the departure of Principal Software Engineer Patrick West. Patrick has accepted a position with a startup company in Boulder, Colorado, Entangled Media, Inc. Patrick will be working on their application, younity.

Patrick has participated in many projects with the Tetherless World Constellation, most recently the Deep Carbon Observatory project, working closely with faculty, staff and students.

Xiaogang (Marshall) Ma Elected to IAMG Council
June 30, 2016
Xiaogang (Marshall) Ma, associate research scientist at Tetherless World Constellation, was elected to the Council of the International Association for Mathematical Geosciences (IAMG) to serve for the 2016-2020 term. The mission of the IAMG is to promote, worldwide, the advancement of mathematics, statistics and informatics in the Geosciences. Marshall is a life member of IAMG and has been active in the IAMG community since 2005. In 2015 he received the Vistelius Research Award from IAMG and was invited to give a keynote talk, "Geoinformatics in the Semantic Web," at the 2015 IAMG Annual Conference in Freiberg, Germany. As a council member his aim is to increase the visibility of geomathematics, geoinformatics and IAMG on the Web, promote geomathematics and geoinformatics among young researchers across the world, and broaden communication and advance collaboration between IAMG and other academic societies.
Ph.D. Thesis Defense Announcement: Evan Patton: Toward Energy-Aware Mobile Reasoning Agents for the Mobile Semantic Web
June 15, 2016
Ph.D. Thesis Defense Announcement

Toward Energy-Aware Mobile Reasoning Agents for the Mobile Semantic Web

Evan W. Patton
Department of Computer Science
Thesis Adviser: Professor Deborah L. McGuinness

Monday, June 20th, 2016
Winslow 1140 - 1:00 p.m.

Over the past decade there has been an uptake of semantic technologies on mobile devices. The hardness of semantic representation languages, such as OWL 2 DL's 2NEXPTIME upper bound, coupled with device and user constraints requires means of controlling expectation with respect to time, energy, and power use. In this talk, I present a hardware-based methodology for measuring, on an Android smartphone, the energy and power costs associated with the task of instance realization in OWL 2 knowledge bases across a number of OWL 2 reasoners of differing complexity. These findings are used to develop knowledge base metrics and predictive models that can be used to decide whether local or remote reasoning is a more efficient use of resources based on the available hardware. This work culminates in a framework called MEAR, the Mobile Energy-Aware Reasoner framework, and I show how predictive models for an OWL 2 RL reasoner built on this framework significantly decrease runtime, energy, and power consumption in the median case.
TWed Lightning Talks Spring 2016
May 9, 2016
TWed Lightning Talks, Wednesday, May 11, 2016, 6:00pm ET, Winslow Building on the RPI Campus

Plan to join us for a very special TWed as the Tetherless World Constellation holds its end-of-term Graduate Research "Lightning Talks" TWed session. This special TWed is a great way for the TWC community to learn of the wide range of amazing research happening at the Tetherless World, and "a good time is had by all!"
TWed Discussion: Dissecting Datathons: Past, Present, Future
May 2, 2016
TWed Discussion, Wednesday, May 04, 2016, 6:00pm ET, Winslow Building on the RPI Campus

Please join us as Rensselaer IDEA operatives Tom Morgan and John Erickson host a discussion of datathons past and future, including a retrospective on the 2016 RPI IDEA Datathon held this past weekend (30 Apr - 01 May).
TWed Discussion: Implementing Data-driven Bioinformatics in SemNExt
April 19, 2016
TWed Discussion, Wednesday, April 20, 2016, 6:00pm ET, Winslow Building on the RPI Campus

Please join us as TWC undergrad superstar Spencer Norris leads us in an overview of SemNExT from the perspective of his contributions to the software infrastructure.
Using Information Centrality for anomaly detection in large networks
April 4, 2016
TWed Discussion, Wednesday, April 06, 2016, 7:00pm ET, Winslow Building on the RPI Campus

Please join us as PhD student Nidhi Rastogi leads us in an update on her interesting research applying the concept of "information centrality" to the problem of detecting cyber attacks in large networks.
TWed Discussion: Towards Liberal Information Extraction: A Study on Event Extraction
March 26, 2016
TWed Discussion, Wednesday, March 30, 2016, 6:00pm ET, Winslow Building on the RPI Campus

Please join us as PhD student Lifu Huang leads us in what is sure to be an interesting discussion of part of his PhD research, extracting events and discovering event schemas from arbitrary input corpora.
TWed Discussion: Schema- and Data- Aware Querying in Heterogeneous Knowledge Graphs
March 23, 2016
TWed Discussion, Wednesday, March 23, 2016, 6:00pm ET, Winslow Building on the RPI Campus

Please join us as TWC PhD student Amar Viswanathan leads us in what will be an interesting discussion of his PhD research exploring query failure and his unique solution applying the Gricean maxim of "cooperative answering."
CARGO Brings Rensselaer Expertise to Cancer Research
March 23, 2016

New Cancer Research Group aims to accelerate development of new ways to detect and treat cancer

Through its new Cancer Research Group (CARGO), Rensselaer is drawing on its trademark interdisciplinary approach to research and discovery to help battle a disease that kills nearly 600,000 Americans per year and affects countless more.

CARGO includes 12 of the Institute’s leading researchers in disciplines as diverse as mechanical engineering, biology, biomedical engineering, computer science, and cognitive science, including Tetherless World Constellation Professor Deborah L. McGuinness. The group was established last fall, just months before President Barack Obama announced the National Cancer Moonshot initiative to accelerate the development of new ways to detect and treat cancer.

For more information see RPI Announcement

TWed Discussion: Semantic Workflows: Capturing and Using Provenance from Scientific Workflows
March 8, 2016
TWed Discussion, Wednesday, March 09, 2016, 6:00pm ET, Winslow Building on the RPI Campus

Please join us as TWC PhD student John Sheehan leads us in a discussion of semantic workflows, the intersection of scientific workflows and semantic technologies including provenance. This work is becoming increasingly important as funding organizations such as the NIH focus on "rigor and reproducibility" in grant applications.
TWed Discussion: Ontology and LIMS-based Laboratory Data Integration
February 28, 2016
TWed Discussion, Wednesday, March 02, 2016, 6:00pm ET, Winslow Building on the RPI Campus

Please join us as TWC PhD student Robin Liu leads us in a discussion of his work in using ontologies as a basis for laboratory data integration, including the application of LIMS and HADatAc.
TWed Discussion: From Codebooks to Ontologies: Fun with Spreadsheets
February 20, 2016
TWed Discussion, Wednesday, February 24, 2016, 6:00pm ET, Winslow Building on the RPI Campus

Please join us as TWC PhD student Katie Chastain leads us in a discussion of her recent work on automatically transforming descriptions of domain data, in the form of data dictionaries, codebooks and other artifacts, into formal ontologies and ontology extensions.
TWed Discussion: Semantic importance in cache-enabled stream reasoning systems
February 10, 2016
TWed Discussion, Wednesday, February 10, 2016, 6:00pm ET, Winslow Building on the RPI Campus

Please join us as Rui Yan leads us in a discussion of his interesting recent work on creating a cache-enabled, order-aware, ontology-based stream reasoning framework.
TWed Discussion: Stupid Entity Linker Tricks: Top 10 Reasons You Should Try Linkipedia
January 31, 2016
TWed Discussion, Wednesday, February 03, 2016, 6:00pm ET, Winslow Building on the RPI Campus

Please join us as TWC Data Guru Jim McCusker leads us in an interactive discussion, demo and tutorial about Linkipedia, a powerful entity linking tool developed at TWC by former TWC grad student Jin Zheng. Linkipedia links concept mentions in textual documents to entities on the "Web of Data," informed by ontologies.
Ph.D. Thesis Defense Announcement: Linyun Fu: Automatic Provenance Capturing for Research Publications
December 1, 2015
Ph.D. Thesis Defense Announcement

Automatic Provenance Capturing for Research Publications

Linyun Fu
Department of Computer Science
Tetherless World Constellation
Thesis Advisor: Professor Peter Fox

Tuesday, December 01, 2015
Winslow 1140 – 11:30 a.m.

Provenance is critical for research publication readers to correctly interpret important content and enables them to evaluate the credibility of the reported results by digging into the software in use, source and change of data and responsible agents. It also would enable the reader to reproduce the scientific conclusions by following or adapting the process leading to the reported results. However, the creation of proper provenance for research publications may cost the authors a lot if they lack the necessary knowledge and technical support. First, it requires knowledge of proper logical provenance information to capture for the report creating process, causing extra learning overhead on the authors. Second, it may also require technical knowledge of the physical configurations of the program(s) execution platform such as the operating system or even the computer hardware, in order to obtain useful provenance information for the purpose of reproducibility and validation of the content. This usually entails even more learning overhead. Even if the authors already know what provenance should get recorded and how to record it, the actual recording work is usually distracting to the authors focusing on authoring the research publications and thus insufficiently motivated.

Existing frameworks and systems for capturing provenance for computational experiments are either specifically tailored for scientific workflow systems or based on a model that is not detailed enough for reproduction of the published results. Authors who are not familiar with any workflow system need to learn how to use one of these systems in order to create provenance that is detailed enough for reproducibility with them.

In this thesis, we specify a paradigm of preparing research publications based on invocation of operations to overcome many of the challenges associated with provenance capture mentioned above. The paradigm is to create publications on a portable provenance aware platform that transparently captures the proper provenance information. The PROV-PUB-O ontology was created for capturing proper knowledge of provenance for authoring processes based on invocations of operations, as well as describing and locating the published results in research publications. To evaluate the usability of PROV-PUB-O, we created the Ontology Usability Scale (OUS), which is the first set of metrics for ontology usability evaluation.

The provenance capture framework enabling the paradigm that fulfills the following requirements will be elaborated. First, the provenance captured must be stored in a way that the reproducibility of the reported results can be decided and the "false paths" can be found in the provenance graph that caused a certain result to not be reproducible. Second, the authoring platform must use a front end supporting a variety of programming languages/modes used by real researchers to create results. The objective is to keep the learning overhead to a minimum. Third, it is also required that the capture of provenance needs no or minimal involvement of the users. A prototype platform is implemented to demonstrate the specified framework. Chapter 4 of the 2014 U.S. National Climate Assessment report (NCA2014) is our use case and the reproduction enabling provenance of tables and figures in this chapter is shown to be captured by the prototype.
TWed Discussion: Pragmatic Query Reformulation in Knowledge Graphs
November 25, 2015
TWed Discussion, Tuesday, December 01, 2015, 7:00pm ET, Winslow Building on the RPI Campus

Please join us as TWC grad student Amar Viswanathan leads us in a discussion of his recent work with Prof. Jim Hendler and Geeth De Mel of IBM on query reformulation. Amar recently presented this work as a poster at the IBM Cognitive Computing Symposium and has submitted to WWW 2016 and AAAI-16, and will also present his early results at the AAAI-16 Doctoral Consortium.
TWed Discussion: Biography-Dependent Collaborative Entity Archiving for Slot Filling
November 6, 2015
TWed Discussion, Monday, November 09, 2015, 12:00pm ET, Winslow Building on the RPI Campus

Please join us as postdoc Yu Hong leads us in a discussion of his work with Heng Ji on improving entity-oriented automatic relevant document acquisition. This is particularly interesting and relevant to those of us working to connect knowledge graphs representing domains of research (think: DCO, HBGDki, etc.) with artifacts in digital repositories and other corpora.
TWed Discussion: DCO Data Science and Thermodynamic Data Rescue
October 26, 2015
TWed Discussion, Tuesday, October 27, 2015, 6:00pm ET, Winslow Building on the RPI Campus

Please join us as PhD student Hao Zhong, together with associate research scientist Xiaogang (Marshall) Ma, leads us in a discussion of the current work on thermodynamic data rescue as a boundary activity of the DCO Data Science team.
Ph.D. Thesis Defense Announcement: Dominic DiFranzo: The Semantic eHumanities Methodology: Same but Different
October 22, 2015
Ph.D. Thesis Defense Announcement

The Semantic eHumanities Methodology: Same but Different

Dominic DiFranzo
Department of Computer Science
Tetherless World Constellation
Thesis Advisor: Professor James A. Hendler

Thursday, October 22, 2015
Winslow 1140 – 12:00 p.m.

The Empirical Humanities, which includes the work of history, folklore and cultural anthropology, are facing new challenges. For decades they have been an almost entirely individual-centric enterprise. Field notes, observations, collected artifacts, photos, videos, and other cultural data are very rarely shared, except when reduced or rendered into some form of publication or museum display. As these researchers investigate more complex open systems that span many disciplines and languages, they are increasingly finding the need to collaborate across and within their field of study. This, along with new funding requirements, presents the need to share and archive primary collected cultural data. These requirements present new challenges in citing, revealing, sharing and reusing the often invisible work of Empirical Humanities research (i.e. creating templates, questions, methods, protocols, etc.).

Humanities scholars need a digital platform that will encourage and facilitate collaboration and allow for experimentation with diverse analytic models. The system must also provide a place to store, share and manage the primary data generated by these scholars. This digital collaborative platform could also provide an opportunity to experiment with new forms of peer review for humanities research, and could be used to develop and evaluate new, digitally-enabled genre forms.

To develop such a system, computer scientists and empirical humanities researchers will need to find sustainable ways to plan, design and build together. Past projects in the digital humanities and social sciences have often developed without sufficient involvement of practicing humanities researchers, resulting in systems that aren’t used. With this in mind I turn to the Semantic eScience Methodology. The Semantic eScience Methodology has been developed to help researchers collaboratively build digital infrastructure for the natural sciences by focusing on use cases, formal evaluation, semantic modeling, and rapid prototyping. This methodology has been used successfully in a wide array of science projects to design, build and maintain digital infrastructure and tools.

The main aim of this thesis is to test whether the Semantic eScience Methodology can be used to build digital infrastructure for the empirical humanities, particularly in experimental ethnography. As it currently exists, the Semantic eScience Methodology has only been used in quantitative natural science projects. This thesis explores and explains the philosophical and epistemological assumptions of the Semantic eScience Methodology and highlights the different needs and challenges that experimental ethnography places on digital infrastructure. To test this aim, I used the Semantic eScience Methodology to develop a digital platform for experimental ethnography called PECE (Platform for Experimental and Collaborative Ethnography). With this work, I have shown that not only can the Semantic eScience Methodology be used in the context of the empirical humanities, but that many of the tools and technologies used in past eScience projects can also successfully be reused as well. The key difference was how these technologies were used. One of the main outcomes of this thesis is a proposal for a new Semantic eHumanities Methodology that extends the Semantic eScience Methodology by taking into consideration the needs and challenges of experimental ethnography. I have also produced a completed and shareable PECE distribution that has been used and evaluated by empirical humanities scholars in the field.
TWed Talk: Reading Her Mind: Automatic segmentation, recognition and translation of Nyushu script
October 5, 2015
TWed Discussion, Wednesday, October 07, 2015, 11:00am ET, Winslow Building on the RPI Campus

Please join us as PhD student Tongtao Zhang leads us in a discussion of his work to automate the process of preserving the endangered Nyushu language.
Mount Sinai and Rensselaer Team Up to Earn Prominent Role in New NIH Program in Environmental and Children's Health
September 30, 2015
To support its groundbreaking work in the emergent field of “exposomics,” the National Institutes of Health (NIH), through the National Institute of Environmental Health Sciences, awarded two grants to research teams from the Department of Preventive Medicine at the Icahn School of Medicine at Mount Sinai, led by Professor and Chair Robert O. Wright, MD, MPH, and by Susan Teitelbaum, PhD, in collaboration with Deborah McGuinness, PhD, of Rensselaer Polytechnic Institute (RPI).
TWed Talk: Stream Reasoning: Where Stream Processing and Semantic Reasoning Meet
September 26, 2015
TWed Discussion, Monday, September 28, 2015, 11:00am ET, Winslow Building on the RPI Campus

Please join us as TWC PhD student Rui Yan leads us in a discussion of stream reasoning, a fascinating area of research that combines stream processing and semantic reasoning.
Vistelius Research Award for Marshall X Ma
September 10, 2015
Marshall X Ma received the Andrei B. Vistelius Award from the International Association for Mathematical Geosciences (IAMG). The biennial award, established in 1980, is "presented to a young scientist for promising contributions in research in the application of mathematics or informatics in the earth sciences." Marshall was also invited to give a keynote award presentation at the IAMG Annual Conference held in Freiberg, Germany. The paper for his presentation is accessible at: https://www.researchgate.net/publication/281414025_Geoinformatics_in_the_Semantic_Web

Marshall is an associate research scientist of semantic eScience at Tetherless World Constellation, Rensselaer Polytechnic Institute, USA. He received his Ph.D. degree in Earth Systems Science and GIScience from the University of Twente, Netherlands. His research interests include participatory conceptual modeling, data sharing in the semantic web, crowd-sourcing geoinformation, and spatio-temporal analysis of Big and Little Data. Since 2001, Ma has been an investigator on a range of scientific projects focusing on geoscience data management and their service and processing in the Web.
TWed Talk: HADataC: Human Aware Data Collection Framework
September 7, 2015
"There's always something happening Wednesday evenings in the Tetherless World!"

TWed Discussion, Wednesday, September 09, 2015, 11:00am ET, Winslow Building on the RPI Campus

Please join us as Paulo Pinheiro leads us in an informal discussion and demonstration of the Human Aware Data Collection Framework (HADataC), a compelling data management infrastructure that has emerged from TWC's engagement in the Jefferson Project at Lake George.
TWed Talk: Cool Tools for Research Project Management and Collaboration
August 29, 2015
"There's always something happening Wednesday evenings in the Tetherless World!"

TWed Discussion, Wednesday, September 02, 2015, 11:00am ET, Winslow Building on the RPI Campus

Please join us as recent TWC Ph.D. graduate Jim McCusker kicks off our Fall 2015 season with an informal discussion of "Cool Tools for Research Project Management and Collaboration." Jim will speak from his background as a consultant and more recent experiences bringing novel tools to work on a Bill and Melinda Gates Foundation-sponsored project. Jim will discuss and demonstrate tools such as Trello, Rocket.Chat, GitHub, and similar services that support Agile methods applied to system development supporting research.
Rensselaer Polytechnic Institute Launches Initiative on Healthy Birth, Growth, and Development Knowledge: Semantic and Data Analytic Support
August 13, 2015
Rensselaer Polytechnic Institute has received a grant from the Bill & Melinda Gates Foundation to support the Healthy Birth, Growth and Development (HBGD) initiative's integration of multidisciplinary data, in order to understand more fully the effects of risk factors on growth outcomes and to develop effective solutions that improve worldwide child health.

The Rensselaer technical leader for the project is Deborah McGuinness, Tetherless World Senior Constellation Professor, director of the Rensselaer Web Science Research Center, and a member of The Rensselaer Institute for Data Exploration and Applications (IDEA) senior leadership committee.
Fox named first AGU Fellow in Informatics
July 28, 2015
Prof. Peter Fox has been named an American Geophysical Union Fellow. Today the 2015 American Geophysical Union (AGU) Class of Fellows was announced. Being elected a Union Fellow is a tribute to those AGU members who have made exceptional contributions to Earth and space sciences as valued by their peers and vetted by section and focus group committees. This honor is bestowed on only 0.1% of the membership in any given year. Fox is the first AGU fellow in the Earth and Space Science Informatics section. https://eos.org/agu-news/2015-class-of-agu-fellows-announced RPI news release to follow.
Ph.D. Thesis Defense Announcement: Joshua Shinavier: Light, Sound, and Semantics: The Web of Data as a New Sensory Modality
July 6, 2015
Ph.D. Thesis Defense Announcement

Light, Sound, and Semantics: The Web of Data as a New Sensory Modality

Joshua Shinavier
Department of Computer Science
Tetherless World Constellation
Thesis Advisor: Professor James A. Hendler

Tuesday, July 7, 2015
Winslow 1140 – 10:00 a.m.

This dissertation explores the use of machine-accessible knowledge about the objects in our environment to augment our perception at an immediate, preconscious level.

In everyday life, we combine simultaneous natural stimuli, such as the sound of a voice and the sight of the speaker’s moving lips, into percepts without thinking about them individually. Artificial stimuli may be combined into the same percepts if they are semantically congruent with the perceptual context and if they arrive within brief temporal windows of the natural stimuli, among other conditions. Insofar as a knowledge-based system can recognize and respond to new context quickly and appropriately enough, its feedback may offer an advantage over natural signals alone, as it allows us to draw attention to nonphysical and non-obvious properties of the world, such as abstract relationships.

In order to truly extend a person’s natural senses, we need to understand the psychophysics of semantics and perception as well as the technological challenges of building such a system. The main focus of this dissertation is on the latter set of problems. We will first translate the known perceptual constraints into a set of functional requirements, then introduce a concrete Semantic Web architecture which fulfills them. The architecture combines cooperative activity detection, a SPARQL-based complex event processor, a Linked Data client, and a body area network of sensing and feedback devices. A number of Semantic Wearable applications are provided as proofs of concept, and a simulation-based evaluation of the system is also described, illustrating the performance of the system, for non-trivial scenarios and at a significant scale, within its real-time constraints.

In this architecture, the Web of Data serves as a read-write repository of knowledge about people and objects; it is queried on demand and updated for new context with new knowledge, enabling a feedback loop of perception and interaction which is independent of any single environment.
Ph.D. Thesis Defense Announcement: Yu Chen: Context Modeling with Inertia Mobile Sensor
May 15, 2015
Ph.D. Thesis Defense Announcement

Context Modeling with Inertia Mobile Sensor

Yu Chen
Department of Computer Science
Thesis Advisor: Professor Peter Fox

Monday, May 18, 2015
Winslow 1140 Conference Room – 12:00 p.m.

Mobile sensors have been around for decades, and the number of different kinds is increasing rapidly. With ubiquitous sensors in public facilities, home surveillance equipment and personal mobile devices, there is a great opportunity to leverage those sensors to expand the horizon of human sensitivity and to better understand our surroundings as well as each individual. However, mining the time series data produced by those sensors requires considerable domain knowledge and skill in signal processing, data mining and machine learning techniques, which not everyone has. Therefore, in order to make full use of the sensor data and understand the environment and the people in it, it is vital to have a system that is efficient, scalable and reusable, and that is capable of analyzing and gaining knowledge from time series data produced by various kinds of sensors.

In this dissertation, the first part of the work focuses on developing a motif detection algorithm to extract time series data patterns efficiently and scalably. The second part of the work demonstrates the practicability of the algorithm, along with the time series data analysis system, in a real application: understanding different perspectives of human activity via inertial sensors on mobile devices. A real-time human physical activity recognition web service is developed to understand sensor data produced by mobile phones. The capability of the system has also been demonstrated via a hacking system that is able to detect and recover a user’s virtual keyboard input on a mobile phone by sampling and analyzing data from an accelerometer and gyroscope running in the background, without direct access to the user’s touch screen. The system has also been further evaluated under a more constructive application. “UbiKeyboard” has been developed to detect and predict a user’s intentional input by analyzing patterns in time series data generated by a wearable smart glove that is equipped with an accelerometer and gyroscope. With the help of a web-scale natural language model, the system is able to recognize the user’s intentional input with even higher accuracy.
TWed Lightning Talks Spring 2015
May 11, 2015
"There's always something happening Wednesday evenings in the Tetherless World!"

TWed Lightning Talks, Wednesday, May 13, 2015, 7pm ET, Winslow Building Room 1140 on the RPI Campus

Please join us for a very special TWed as the Tetherless World Constellation holds its end-of-term Graduate Research "Lightning Talks" TWed session. This special TWed is a great way for the TWC community to learn of the wide range of amazing research happening at the Tetherless World, and "a good time is had by all!"
TWed Discussion: Using Graph Centralities for detecting Anomalies
May 3, 2015
"There's always something happening Wednesday evenings in the Tetherless World!"

TWed Discussion, Wednesday, May 06, 2015, 7pm ET, Winslow Building on the RPI Campus

Please join us as TWC Ph.D. student Nidhi Rastogi leads us in a discussion of applying graph analytics, especially identifying node centralities, to the problem of combatting "noise" in large-scale data collection in applications such as detecting cyber attacks.
TWed Discussion: Exploring Scientific Data with Faceted Visualization Featuring Solr
April 14, 2015
"There's always something happening Wednesday evenings in the Tetherless World!"

TWed Discussion, Wednesday, April 01, 2015, 7pm ET, Winslow Building on the RPI Campus

Please join us as TWC visiting scholar Henrique Santos leads us in a hands-on introduction to Apache Solr, using real-life examples from the Jefferson Project!
TWed Discussion: A Hands-on Introduction to Big Data Analysis using Hadoop
March 27, 2015
"There's always something happening Wednesday evenings in the Tetherless World!"

TWed Discussion, Wednesday, April 01, 2015, 7pm ET, Winslow Building on the RPI Campus

Please join us as TWC postdoc Xiaohui Lu leads us in a hands-on introduction to big data analysis using Hadoop!
The Science of Magic: Rensselaer and Walt Disney Imagineering Research & Development, Inc. Advance the Frontiers of Cognitive Computing
March 19, 2015
An interdisciplinary team of researchers at Rensselaer Polytechnic Institute is collaborating with Walt Disney Imagineering Research & Development, Inc., part of the theme park design and development arm of The Walt Disney Company. Together, they are exploring how the cognitive computing technology being developed at Rensselaer can help enhance the experience of visitors to Disney theme parks, cruise ships and other venues.

...

Leading the project for Rensselaer is James Hendler, Tetherless World Senior Constellation Professor and director of The Rensselaer Institute for Data Exploration and Applications (IDEA). An expert in web science, Big Data, and artificial intelligence, Hendler said the collaboration with Walt Disney Imagineering Research & Development, Inc. is an important step forward for all of the data-related research taking place as part of The Rensselaer IDEA. Rensselaer faculty members Mei Si, assistant professor in the Department of Cognitive Science, and Heng Ji, the Edward P. Hamilton Development Chair and associate professor in the Department of Computer Science, will collaborate with Hendler on the project.
TWed Talk: Automating Semantic Metadata Collection in the Field
March 11, 2015
"There's always something happening Wednesday evenings in the Tetherless World!"

TWed Hackathon, Wednesday, March 11, 2015, 7pm ET, Winslow Building on the RPI Campus

Please join us as Laura Kinkead presents her thesis work, discussing important contributions she has made to the Jefferson Project at Lake George. This will include a live demo of an Android app Laura created.
TWed Hackathon: Hands-on Environmental Modelling with openModeller
March 3, 2015
"There's always something happening Wednesday evenings in the Tetherless World!"

TWed Hackathon, Wednesday, March 04, 2015, 7pm ET, Winslow Building on the RPI Campus

This TWed will be a highly interactive, hands-on session in which Matt Klawonn will guide us through modelling with openModeller.
Integrating Relational Databases with the Semantic Web
February 24, 2015
Dr. Juan F. Sequeda will be giving a talk in room 1140 of the Winslow Building on the RPI Campus on Wednesday, February 25, 2015 at 1:00pm.

In this talk, Dr. Sequeda will provide an answer to the following question: Can a Relational Database be mapped to existing Semantic Web ontologies and act as a reasoner? He will present his system UltrawrapOBDA, an Ontology Based Data Access system comprising bidirectional evaluation, that is, a hybridization of query rewriting (backward chaining) and materialization (forward chaining). UltrawrapOBDA supports inheritance and transitivity.
TWed Discussion: Choose Your Own Path: A Journey Down the Rabbit Hole of Privacy
February 16, 2015
"There's always something happening Wednesday evenings in the Tetherless World!"

TWeD Talk, Wednesday, February 18, 2015, 7pm ET, Winslow Building on the RPI Campus

Please join us as TWC Ph.D. student Kristine Gloria leads us down the "rabbit hole" in an open discussion of the present and future of privacy!
TWed Discussion: The Philosophy of Linked Data
February 1, 2015
"There's always something happening Wednesday evenings in the Tetherless World!"

TWeD Talk, Wednesday, February 04, 2015, 7pm ET, Winslow Building on the RPI Campus

Please join us for our second Spring 2015 TWed as TWC Ph.D. student Dominic DiFranzo leads us in an open discussion of "The Philosophy of Linked Data".
TWed Talk: Linked Data for Ocean Science News and Multimedia
January 20, 2015
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, January 21, 2015, 7pm ET, Winslow Building on the RPI Campus

Please join us for our first TWed of 2015 as we welcome a team of colleagues from the Woods Hole Oceanographic Institution (WHOI) for a wide-open discussion about "linking ocean data to science stories."
Rensselaer Professors Hendler, Gray, and Ji Receive 2014 IBM Faculty Awards
December 19, 2014
IBM named Rensselaer professors James Hendler, Wayne Gray, and Heng Ji as recipients of the award. IBM said the competitive program recognizes the quality of a faculty member’s research program with IBM and the importance of that research to industry.
Matt Ferritto Successful Master's Thesis Defense
December 8, 2014
Matt Ferritto, a Master's student working on various projects within the Tetherless World Constellation, successfully defended his Master's Thesis "Semantically Matching Tools and Data Collection Content: A ToolMatch Use Case Extension" on November 24, 2014.

Congratulations Matt!

Abstract: The ToolMatch service was developed with the intent to provide data users with the means to match their data collections with a comprehensive list of useful, appropriate tools, and to provide data tool developers with data collections that will work with their tools. As such, ToolMatch had an initial scope of two use cases, the first of which was the semantic matching of data collections with tools. This would allow data users to find and choose among a list of otherwise separate and potentially hard-to-find tools that could work with their data collections. The second (and more difficult) of these use cases was the converse: given a tool, semantically find which data collections the tool can use. If the first use case is analogous to having nails and looking for a hammer, then the second use case can be compared to having a hammer and looking for nails. It is much more difficult to find data collections that may work with a given tool, since a tool user might not necessarily know what to look for. Using the ToolMatch service, a tool user could easily find a data collection to use with their tool. In both of these use cases, wasted time and effort searching for the correct tool or data collection can be reduced or avoided completely. The focus of this thesis will be on the implementation of these two use cases, as well as an extension of the first use case, where a data user with certain semantics for a given data collection (such as a domain model) can find tools that can be used with the content of that data. This is an important issue because certain data collection content may not be appropriate for a tool within a certain domain model. For example, rainfall or topographic data content that is part of a larger hydrological model can be matched to tools that the model as a whole might not be able to match.
This expands the scope of the initial use case in that data collection content requires stricter matching than just the characteristics of the data collection. The requirements of this use case involve modification and expansion of the ToolMatch conceptual model and ontology to allow for semantic matching between data content and tools. These changes will also be reflected in the ToolMatch web service, which allows users to add, update, or delete instances of the ToolMatch ontology without needing a full understanding of ontologies.
TWed Lightning Talks Fall 2014
December 8, 2014
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Lightning Talks, Wednesday, December 10, 2014, 7pm ET, Winslow Building Room 1140 on the RPI Campus

Please join us for a very special TWed as the Tetherless World Constellation holds its end-of-term Graduate Research "Lightning Talks" TWed session. This special TWed is a great way for the TWC community to learn of the wide range of amazing research happening at the Tetherless World, and "a good time is had by all!"
TWed Discussion: Semantic Web: What's next
December 3, 2014
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, December 03, 2014, 7pm ET, Winslow Building Room 1140 on the RPI Campus

Please join us for a *very* special TWed as Prof. Jim Hendler shares with us his "speculation" on the Semantic Web, schema.org, knowledge graphs and the future of search and data on the Web, with discussion to follow. This is one TWed you DO NOT want to miss!
TWC release version 1.3 of DCO Single Sign-On
December 2, 2014
On Tuesday, December 2, 2014 the Tetherless World Constellation's Deep Carbon Observatory Data Science Team released version 1.3 of the DCO Single Sign-On system.

Patrick West, John Erickson, Marshall Ma, Han Wang, and Yu Chen contributed to the release.

The release contained multiple bug fixes, new functionality and enhancements, and code that will pave the way for future enhancements such as a DCO User Management Dashboard and additional information collected from new users during registration.

The new features build upon linked data principles, tying assertions between systems and between people, organizations, communities, and interests. Information collected at registration is now pushed to the DCO Information Portal, which is based on VIVO, and is displayed within the DCO Community Portal, which is based on Drupal. All of this information is retrievable via web browser and is also machine accessible. The VIVO endpoint is available for querying this new information.
TWed Talk: Exploring Threat modeling using Semantic Web Technologies: A Summer Internship at Raytheon BBN Technologies
November 12, 2014
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, November 10, 2014, 7pm ET, Winslow Building on the RPI Campus

Please join us as TWC Ph.D. student Nidhi Rastogi leads us in a discussion of her Summer 2014 internship at Raytheon BBN Technologies, focussing on threat modelling using Semantic Web technologies.
Inaugural ICSU-WDS Data Stewardship Award for Dr. Xiaogang Ma
November 5, 2014

TWC researcher Xiaogang (Marshall) Ma (http://homepages.rpi.edu/~max7/) received the inaugural ICSU-WDS Data Stewardship Award at SciDataCon2014, New Delhi, India, on Nov-04-2014.

The World Data System (WDS) of the International Council for Science (ICSU) supports long-term stewardship of quality-assured scientific data and data services across a range of disciplines in the natural and social sciences, and the humanities. The WDS Data Stewardship Award highlights exceptional contributions to the improvement of scientific data stewardship by early career researchers through their engagement with the community, academic achievements, and innovations. The award ceremony was held on Nov-04-2014, at SciDataCon2014, New Delhi, India, with coordination of the Committee on Data for Science and Technology (CODATA), the WDS, interdisciplinary committees of ICSU, and the Indian National Science Academy. Marshall did not attend the ceremony in person due to visa issues, and he delivered an award lecture remotely. Slides of his lecture ‘Why Data Science Matters’ are accessible at: http://www.slideshare.net/MarshallXMa/why-data-science-matters.

Marshall is an associate research scientist at Tetherless World Constellation, Rensselaer Polytechnic Institute, working on Semantic eGeoscience. His research highlights a semantic eScience framework, which deploys Semantic Web methodologies and technologies to support data-intensive research, especially in Earth and space sciences. His earlier work on geoscience vocabulary encoding and visualization was applied to enhance the features of online geoscience data services and lower the barrier for lay users. His recent work on provenance of global change information addresses a crucial need for transparent scientific workflows and credible scientific findings, as global change information becomes both more abundant and increasingly important. Moreover, well-curated provenance information also facilitates informed and rational policy and decision-making. Marshall’s work was used in the Global Change Information System (http://data.globalchange.gov) of the U.S. Global Change Research Program to enable provenance tracing.

TWed Talk: Live-coding musical agents: An introduction to the Max visual programming language
November 3, 2014
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, November 03, 2014, 7pm ET, Winslow Building on the RPI Campus

Please join us as RPI EMPAC Sr. Research Engineer and CogSci Ph.D. student Eric Ameres blows us away with a discussion and LIVE demonstration of his work combining live, interactive coding and intelligent agents in live musical performance! Cool visuals and loud music are promised...
Ph.D. Thesis Defense Announcement for James Michaelis
October 21, 2014
The Tetherless World Constellation is proud to announce the successful completion of James Michaelis' Thesis Defense

Title: Towards a Methodology for Evaluation of Provenance-Based User Interfaces

Advisors: Deborah McGuinness, Jim Hendler

Abstract: The Merriam-Webster English Dictionary defines provenance as: (i) the origin or source of something; (ii) the history of ownership of a valued object or work of art or literature. In its earliest usage, the provenance of physical objects - such as pieces of artwork - could be used to make assessments of their value. In more recent times, provenance has become an increasingly critical component for assessment of data in digital systems. From the perspective of digital artifacts, the World Wide Web Consortium's (W3C) Provenance Working Group defines provenance as: a record that describes the people, institutions, entities, and activities involved in producing, influencing, or delivering a piece of data or a thing.

Particularly in the past decade, technologies for acquiring, recording, storing and representing provenance data have grown more sophisticated. Additionally, significant interest has been expressed by stakeholders of many digital systems in expanding their usage of provenance. However, in this time, limited work has been done to develop and rigorously evaluate tools for the exploration and analysis of provenance collections - particularly, for stakeholders with limited background in database querying languages and systems.

This dissertation aims to develop and validate a methodology for gauging usability of provenance-based user interfaces, taking the position that existing tools and techniques from multidimensional data analysis can be applied toward both the design of such a methodology as well as development of novice-friendly interfaces.

To advance this dissertation, three supporting contributions are made:

The ProvAnalytics Framework: This is a novel framework for exploring provenance record collections expressed as directed graphs. The core focus of ProvAnalytics is to provide a set of approaches for converting collections of provenance graphs into multidimensional datasets for subsequent review by interested users. Additionally, functionality is provided for generating synthetic provenance collections, intended to meet the needs of particular interface evaluations.

An Analysis and Classification for Provenance Querying Tasks: Currently, no benchmarks exist for gauging the usability of provenance-based systems for end users. This contribution seeks to build on prior computational benchmarks - particularly from the Provenance Challenge series - to establish a set of user-centric querying tasks oriented toward the multidimensional analysis paradigm.

A Proof-of-Concept Comparison of Querying and Presentation Interfaces: This contribution centers on demonstrating the utility of the ProvAnalytics framework using a case-study-based evaluation. Three interface configurations were compared in a 36-subject study, aimed at gauging performance across three types of established information retrieval tasks. Findings from this study are intended to demonstrate: (i) that novice users of provenance systems can quickly adapt to working with the developed tasks and data representations, and (ii) that statistically meaningful relationships can be obtained in support of routine usability hypotheses.
TWed Talk: UbiKeyboard: Using sensors and muscle memory to eliminate physical keyboards
October 19, 2014
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, October 22, 2014, 7pm ET, Winslow Building on the RPI Campus

Please join us as Tetherless Ph.D. student Yu "Momo" Chen leads us in a discussion and (possible) LIVE DEMO of some extraordinarily cool work he has been involved with in advancing gestural input technology.
Jefferson Project at Lake George Reaches New Milestones and Celebrates Year of Collaboration, Scientific and Technological Advances
October 19, 2014
The Jefferson Project at Lake George today announced new milestones in the multimillion-dollar collaboration that seeks to understand and manage the complex factors impacting Lake George, a pristine natural ecosystem and cornerstone of New York’s tourism industry. Through their groundbreaking partnership, Rensselaer Polytechnic Institute, IBM, and the FUND for Lake George have developed preliminary models of key natural processes within the watershed.
Rensselaer Professor Deborah McGuinness Collaborates on New $15 Million NSF Award for DataONE Environmental Science Project
October 15, 2014
Earth to Data: Making Sense of Environmental Observations

The National Science Foundation (NSF) has awarded $15 million to a team of environmental and earth science data researchers, including researchers at Rensselaer Polytechnic Institute, who are providing tools and infrastructure that improve access to vast amounts of scientific data.
Automated Malware Analysis Through Virtualization: The Malware vs Anti-Virus Arms Race
October 12, 2014
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, October 15, 2014, 7pm ET, Winslow Building on the RPI Campus

Please join us as Alexei Bulazel leads us in what promises to be a fascinating and unusual TWed Talk, in which he will discuss some of his extensive work in malware analysis and detection.
TWed Talk: Automatic Summarization of Customer Log Data
September 22, 2014
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, September 24, 2014, 7pm ET, Winslow Building on the RPI Campus

Please join us as Amar Viswanathan leads us in a review of his internship with IBM this summer, where he applied NLP principles to the practical but challenging problem of customer log data summarization!
TWed Talk: Capturing and Presenting Provenance of Global Change Research
September 17, 2014
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, September 17, 2014, 7pm ET, Winslow Building on the RPI Campus

Please join us as Marshall Ma leads us in what should be an interesting discussion of the importance of enabling data provenance in global change research, and of how it may ultimately play a role in policy and decision-making.
Ph.D. Thesis Defense Announcement for James McCusker
September 8, 2014
The Tetherless World Constellation is proud to announce the successful completion of James McCusker's Thesis Defense.

TITLE: WebSig: A Digital Signature Framework for the Web

ADVISOR: Deborah McGuinness

ABSTRACT: WebSig is a digital signature scheme for the web that uses Resource Description Framework (RDF) graphs to express its documents, document metadata, and signature data in a way that leverages existing trustable digital signature schemes to create signatures that are both computable and trustable. We demonstrate this by showing how digital signature schemes that are attributable, verifiable, linkable, revisable, and portable are also computable and trustable digital signature schemes. We also introduce evaluation criteria for those five qualities and demonstrate how WebSig provides all five. WebSig supports the verifiable signing of any RDF Graphs of Practical Interest (GPI) through the use of another contribution, the Functional Requirements for Information Resources (FRIR) information identity framework. FRIR is a provenance-driven identity framework that can provide interrelated identities for RDF graphs and other information resources. The FRIR Graph Digest Algorithm, a third contribution, provides an algorithm that can create platform-independent, cryptographically secure, reproducible identifiers for GPIs. FRIR and the FRIR Graph Digest Algorithm both supply the means to securely identify the signed document and any supporting RDF graphs, and are essential to supplying all five qualities needed to provide computable and trustable signatures. WebSig builds off of existing technologies and vocabularies from the domains of cryptography, computer security, semantic web services, semantic publishing, library science, and provenance.

This dissertation’s contributions will be presented as follows: 1) Sufficiency proof that attributable, verifiable, portable, linkable, revisable digital signature schemes are trustable and computable; 2) Functional Requirements for Information Resources (FRIR), a provenance-enabled, trustable, computable identity framework for information resources; 3) the FRIR RDF Graph Digest Algorithm, an algorithm that provides reproducible identifiers for Graphs of Practical Interest (GPIs), a class of graphs that we formally define; and 4) WebSig, a framework that lets users create legally-binding electronic documents that are both trustable and computable.
TWed Talk: Complicating the Social and the Technical Sciences of the Web: Provocations from Feminist and Postcolonial Studies
August 23, 2014
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, May 12, 2014, 7pm ET, Winslow Building on the RPI Campus

Please join us for our first TWed of the Fall 2014 term as Lindsay Poirier, an active member of our WSRC team, leads us in a discussion of how political and cultural decisions shape the Web as we know it, and the sorts of design considerations needed to produce a robust and inclusive web.
Congrats to RPI students Linyun Fu and Matt Ferritto for winning FUNDING FRIDAY research grant at the ESIP 2014 Summer Meeting
July 11, 2014
TWC/RPI PhD student Linyun Fu and TWC/RPI Masters student Matt Ferritto won ESIP 2014 Summer Meeting Funding Friday research grants on Friday, July 11, 2014. The ESIP 2014 Summer Meeting was held in Frisco, Colorado, July 8-11, 2014.

Linyun, in a joint proposal with TWC staff member Massimo Di Stefano, proposed a project to automatically capture provenance information during scientific workflows, as follow-up work on the ECOOP project they have both been working on.

Matt proposed an extension to a use case that was developed, and is currently being worked on, as part of the ESIP ToolMatch project. The proposal is to make a more specific match between tools and specific data collection content, rather than the general use case of the data collection as a whole. For example, rainfall or topographic data content that is part of a larger Hydrological model can be matched to tools that the model as a whole might not be able to match. Congratulations to both for their hard work and great presentations.
Faculty, staff, and students representing TWC at ESIP Summer Meeting 2014
July 6, 2014
Peter Fox, Professor; Patrick West, Stephan Zednik, and Massimo Di Stefano, Staff; and James Michaelis, Linyun Fu, and Matt Ferritto, Students, are representing the Tetherless World Constellation at Rensselaer Polytechnic Institute at the ESIP Summer Meeting 2014 and co-located meetings DataONE User's Group and OPeNDAP Developer's Meeting. They will be presenting their work on various projects and research initiatives, co-leading various sessions throughout the week, and meeting with project colleagues.
Deep Carbon Observatory Data Science Day at RPI on June 05
June 2, 2014
The Deep Carbon Observatory (DCO) will be hosting a Data Science Day symposium at RPI on 5 June 2014.

The DCO is midway into a 10-year initiative to intensify global attention and scientific effort in the burgeoning field of deep carbon science, stimulated by funding from the Alfred P. Sloan Foundation. Activities include infrastructure development, scientific workshops, novel technology development, and exploratory research and fieldwork. Funding is also intended to catalyze collaborative scientific efforts around the world, increase public and private sector spending in deep carbon science, and leave a thriving community of international scientists as its legacy. The DCO Data Science activity is led by Rensselaer and reaches into the four DCO science communities (Deep Energy, Deep Life, Extreme Physics and Chemistry, and Reservoirs and Fluxes) to advance research and educational aspects of data science and data management among all DCO participants.

The DCO Data Science Day, taking place at Bruggeman Center / CBIS Building on 5 June 2014, will be an opportunity for the DCO community to discuss data management and data science activities within a broader audience. Leading researchers in the fields of Earth science and data science, including Fran Berman, Mark Ghiorso, Kerstin Lehnert, Bruce Watson and Bob Hazen, will give plenary talks in the morning session of the day. The symposium is open to the RPI community and your attendance is welcome.

You can learn more by visiting the symposium agenda. For further information, please contact Xiaogang (Marshall) Ma.
TWed Talk: WebSig: A Digital Signature Framework for the Web
May 14, 2014
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, May 12, 2014, 7pm ET, Winslow Building on the RPI Campus

Please join us for the final TWed of the Spring 2014 term as TWC PhD student Jim McCusker leads us in a discussion of Web Signature (WebSig), a new digital signature framework Jim is developing as a component of his PhD work.
Stephan Zednik returns to TWC as Senior Software Engineer
April 29, 2014
Peter Fox, Co-Chair of Rensselaer's Tetherless World Constellation, announced today the return of Stephan Zednik to TWC in the role of Senior Software Engineer.

"... it is my significant pleasure and honour to welcome Stephan Zednik back to Rensselaer's Tetherless World. He will return as Senior Software Engineer and will be working on the "kicking-butt" parts of many of our software developments, especially for production environments." - Peter Fox
TWeD Lightning Talks Fall 2014
April 28, 2014
There's always something happening on Wednesday evenings in the Tetherless World!

TWed Lightning Talks, Wednesday, April 30, 2014, 7pm ET, Winslow Building on the RPI Campus

DESCRIPTION: At the end of each term TWC holds a Graduate Research "Lightning Talks" TWed session. This special TWed event is a great way for the TWC community to learn of the wide range of amazing research happening at the Tetherless World, and "a good time is had by all!"

A lightning talk is a VERY short --- under 3 minute! --- summary by the researcher of their current research work, with NO SLIDES and only brief "crib notes!"
Xiaogang Ma promoted to Associate Research Scientist
April 21, 2014
Rensselaer's Tetherless World Constellation is pleased to announce the promotion of Xiaogang (Marshall) Ma to Associate Research Scientist. The announcement was made April 21, 2014 by Professor Peter Fox, Co-Chair of the Constellation.

Marshall joined the Tetherless World in 2012 with a research focus on Semantic e(Geo)Science. Before coming to RPI he was a PhD Candidate at Faculty ITC, University of Twente in the Netherlands, working on the theme 'Ontology Spectrum for Geological Data Interoperability'. He obtained his PhD degree in November 2011. He also has a B.Eng. degree in Land Resources Management and a D.Eng. degree in Geoinformatics Engineering, both awarded by China University of Geosciences, Wuhan.

His research interests include geo-thesauri, geo-ontologies, geodata interoperability, geo-conceptual modeling, data visualization and geodata services with W3C® and OGC® standards. He is a member of the International Association for Mathematical Geosciences (IAMG), the Commission for the Management and Application of Geoscience Information of International Union of Geological Sciences (CGI-IUGS), and the European Geoscience Union (EGU). He is also affiliated with several commissions in the International Cartographic Association (ICA) and the International Society for Photogrammetry and Remote Sensing (ISPRS).

Marshall won the ESIP Funding Friday Competition Award twice (2012 as PI and 2013 as advisor and co-PI). He was a nominee for the IAMG Andrei Borisovich Vistelius Research Award for excellent works in geoinformatics and geomathematics (2013). Marshall was awarded the IAMG Graduate Student Research Grants Award in 2006 and travel grants from several sources for data science events in 2013 and 2014.
TWed Talk: Urban Sprawl Assessment Portal for Tetherless World Constellation
April 21, 2014
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, April 23, 2014, 7pm ET, Winslow Building on the RPI Campus

Please join us for a very special TWed as ITWS Capstone Team One leads us in an interesting discussion AND LIVE DEMO of their Spring 2014 Capstone project, a dynamic web application for reviewing factors contributing to urban sprawl. This project was sponsored by the Web Science Research Center (WSRC) of TWC RPI.

ABSTRACT:
The team was challenged to utilize open government data to create a mobile app that would promote positive social change by helping solve a local, community problem. Based on an evaluation of available datasets and consultation with the WSRC team, ITWS Capstone Team One chose to create a portal to help a variety of stakeholders interactively review factors associated with urban sprawl. The team identified datasets provided through the New York State open data portal [1] from relevant domains such as health, environment or education; applied Semantic Web and other technologies to combine and visualize datasets in compelling ways; used agile development techniques to deliver a mobile, interactive app; and applied Web Science principles to measure the effectiveness of their solution.

To complete this challenge, Team One was expected to demonstrate a practical knowledge of data structures and application development, including web application development. During the development process, specific skills such as mobile app design and development, the architecture of data-driven apps, the implementation of web apps using remote visualization APIs, and knowledge of JavaScript, JSON, and the Semantic Web stack (RDF, SPARQL, principles of Linked Data) were utilized.
Patrick West promoted to Principal Software Engineer
April 21, 2014
Rensselaer's Tetherless World Constellation is pleased to announce the promotion of Patrick West to Principal Software Engineer. The announcement was made April 21, 2014 by Professor Peter Fox, Co-Chair of the Constellation.

Patrick joined the Tetherless World Constellation in 2009 as Senior Software Engineer. Before that, he worked with Peter Fox at the High Altitude Observatory at the National Center for Atmospheric Research in Boulder, Colorado.

His current projects are focused on the semantic expression of data science concepts and relationships in various domains, including solar, upper atmosphere, ocean sciences, and earth science informatics, as well as computer science areas such as knowledge representation, semantic technologies, distributed semantic data frameworks, robust collaboration and content management environments, and agile software engineering. Patrick has also been a long-term contributor to the Hyrax OPeNDAP software project. He has 23 years of post-degree experience (Bachelor of Science from Indiana University), which includes large and small companies, startups, nonprofits, research organizations, and now academia.
TWeD Talk: PROV and OPeNDAP
April 11, 2014
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, April 16, 2014, 7pm ET, Winslow Building on the RPI Campus

Please join us as TWC Ph.D. student Tim Lebo leads us in an interesting discussion of the challenges of adding provenance to deployed systems, using his current work extending OPeNDAP with PROV!

ABSTRACT: Adding provenance to existing systems can benefit users, but comes at an expense that may be difficult for some to justify. This trade-off can be overcome by increasing the value of provenance, by decreasing the cost to add it - or by doing both.

This talk discusses a contribution for each. First, we develop further the W3C PROV pingback technique so that it may reach its potential to interconnect provenance records that would traditionally sit in isolation, thus increasing their value. Second, we reduce the expense to publish the provenance of existing host systems by using minimal coupling to the Prizms Linked Data platform. Using an Earth Sciences scenario and the OPeNDAP data transport architecture as an example host system, we investigate how PROV pingback could work in practice, demonstrate its potential, and identify outstanding issues that must be addressed before it can be widely adopted.
Ph.D. Thesis Defense Announcement for Jin Zheng
April 3, 2014
The Tetherless World Constellation is proud to announce the successful completion of Jin Zheng's Thesis Defense.

TITLE: Semantic Similarity Computation on the Web of Data

ADVISOR: Peter Fox

ABSTRACT: Over the last few decades, many efforts have been devoted to researching and developing effective semantic similarity computation algorithms for different scenarios, such as similarity between free texts, and similarity between objects. As a result of these efforts, there are many semantic similarity computation algorithms that utilize different information sources, for example, information content based algorithms like the vector space model; ontology based edge counting methods, like semantic similarity methods in WordNet; structure or feature based methods, like Tversky's model.

However, none of the existing algorithms aims to solve the similarity computation problem for entities on the Web of Data (WoD). Applying existing similarity computation algorithms for texts or words directly to entities on the WoD would compute an inaccurate similarity score. The reason that these similarity computation algorithms cannot compute the score accurately for entities on the WoD is that they are purely based on text analysis and do not utilize the rich semantic relations and semantic descriptions of the entities during similarity computation. The semantic similarity computation problem for entities on the WoD is important because many applications rely on similarity computation, such as entity matching, entity annotation, or entity ranking.

The primary goal of this study is to investigate how to compute semantic similarity scores among entities on the Web of Data. We design 1) a novel semantic similarity computation model to compute similarity among the entities on the Web of Data and other structured or unstructured data entities. The new similarity computation model leverages the theory of information entropy to determine the amount of meaningful information presented in the entity description and thereafter compute the amount of meaningful information shared by the entities. The model uses machine learning approaches to learn and assign appropriate weights to shared or unique information of the entities in order to highlight important and meaningful information. The model also tackles the scalability issue of the similarity computation, which is a major challenge given the number of entities on the Web of Data. To prove the effectiveness of the proposed semantic similarity computation model, we 2) apply the model to develop systems to solve the entity matching problem and the entity annotation problem on the Web of Data. We show that using our model, we can improve the current state of the art when solving these problems.
TWeD Talk: HTTPA: HTTP with Accountability
March 31, 2014
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, April 02, 2014, 7pm ET, Winslow Building on the RPI Campus

Please join us as MIT CSAIL Ph.D. student Oshani Seneviratne leads us in a discussion of some of her current work investigating protocols for accountability on the Web!

ABSTRACT: We have developed HTTPA, an architecture for the Web to address complex issues arising from data reuse. These issues include privacy violations and intellectual property rights violations. This talk will present the motivation for HTTPA based on results from a policy awareness study and some initial tools such as the License Usage Validator, and the Semantic Clipboard. These tools are limited to a particular type of content reuse, i.e. image reuse. Therefore, we extended our work to include policy awareness on any resource on the Web using HTTPA. HTTPA is built on open Web standards and uses the 'Provenance Tracking Network' (PTN), an open global trusted network of peer servers that logs resource usage data. Websites that conform to the architecture communicate information about transactions for any sensitive data items with the PTN. These logs can later be queried to check compliance with individual, organizational, state or federal policy and usage restrictions that assert no unauthorized data transfer or usage has taken place. We have evaluated this architecture using an electronic healthcare records application called Transparent Health that gives patients a better sense of how their sensitive data has been used.
TWeD Talk: Context Modelling as a Service
March 18, 2014
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, March 19, 2014, 7pm ET, Winslow Building on the RPI Campus

Please join us as Tetherless Ph.D. student Yu "Momo" Chen leads us in a discussion of some of his current work investigating context modeling as a service.

ABSTRACT: In this talk Yu Chen will discuss some of his current research on context modeling as a service. As more and more portable devices, e.g. mobile phones, are equipped with sensors, there is a huge potential in understanding what can be revealed from patterns in the sensor data, as it is expected to be highly correlated to human activities and behaviors. However, this sensor data is not easy or intuitive to analyze, especially as the noisy time series data requires both intensive heuristics and mathematical analysis to reflect the real significance of the raw data. Based on his research, Yu argues that a web service that can be delegated by any application that interacts with sensors will be of great interest. In this talk, Yu will discuss work he has done that helps move this idea forward.
TWeD Talk: Experiences Curating Science Metadata and Recommendations for Publishing Metadata
February 28, 2014
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, March 05, 2014, 7pm ET, Winslow Building on the RPI Campus

Please join us Wed, 05 Mar for the return of Tetherless Ph.D. recipient Jesse Weaver as he leads us in a discussion of some of his current work at the Pacific Northwest National Labs on RDESC, "Experiences Curating Science Metadata and Recommendations for Publishing Metadata"

ABSTRACT: "Experiences Curating Science Metadata and Recommendations for Publishing Metadata"

At present, much science metadata is utterly inaccessible (i.e., not shared), digitally inaccessible (i.e., not on the Web), or machine-incomprehensible (i.e., text). Although standard vocabularies like GCMD keywords and CF standard names are a step in the right direction, much more is needed in order to bridge the semantic gap between the detail of science metadata and the generality of posed questions. As part of the RDESC project, we attempt to demonstrably bridge this gap for a specific atmospheric science use case by incrementally developing an OWL ontology to accommodate the precision of various metadata, and by curating the metadata into semantically rich RDF triples. The ontology and RDF data model enable us to meaningfully relate heterogeneous metadata of varying precision from different sources. In this talk, I will primarily discuss the metadata curation effort that has taken place to date in RDESC and make recommendations for how to improve on publishing science metadata.

RDESC is a DOE/ASCR-funded project in collaboration with RPI that aims to facilitate discovery of science resources at the scale of the scientific community. The project involves the curation of existing science metadata, the development of recommendations for publishing science metadata, and the development of a prototypical web interface for discovering resources described by the curated metadata.
TWeD Discussion: The Jefferson Project at Lake George
February 24, 2014
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, February 26, 2014, 7pm ET, Winslow Building on the RPI Campus

Please join us this week as Tetherless World research scientist Paulo Pinheiro da Silva leads us in a discussion of the Jefferson Project at Lake George, a joint project of Rensselaer, IBM, and the FUND for Lake George.

ABSTRACT: The Jefferson Project at Lake George is building one of the world's most sophisticated environmental monitoring and prediction systems, which will provide scientists and the community with a real-time picture of the health of the lake. Launched in June 2013, the project aims to understand and manage multiple complex factors--including road salt incursion, storm water runoff, and invasive species--all threatening one of the world's most pristine natural ecosystems and an economic cornerstone of the New York tourism industry. In this talk, we will discuss opportunities and challenges for enhancing the management of large scale sensor data with Prizms, and for monitoring sensor data with SemantEco. With the help of simulation models, sensor data are used as predictors in support of environmental decision making. In the context of simulation models, we will discuss the use of provenance and semantic technology for managing simulation results.
TWeD Discussion: Coding Provenance in Software and Matching Tools to Data
February 10, 2014
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, February 12, 2014, 7pm ET, Winslow Building on the RPI Campus

Please join us this week as the Tetherless World's Patrick West leads us in a discussion of his work with OPeNDAP Provenance Project and ESIP's ToolMatch Project.

ABSTRACT: OPeNDAP.org has been providing software solutions for the access, manipulation, transformation, and dissemination of science data for over a decade now. But it's been only recently that we have started thinking about providing information about exactly how and from what that final data product was generated. The OPeNDAP provenance project looks to research the coding of software systems to provide provenance information and implement that in OPeNDAP.

And given those original datasets and that generated data product, what client software can be used to visualize that data, and in what ways? The ToolMatch project looks to formalize the expression of the datasets, and client tools that can visualize them in some way, developing an ontology and set of inference rules that can help the user realize the full potential of their data search and access.
TWeD Talk: Who are the Influencers? New Algorithms for Detecting Key Players in Social Networks
January 27, 2014
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, January 29, 2014, 7pm ET, Winslow Building on the RPI Campus

Please join us this week as new TWC postdoc Xiaohui Lu leads us in what promises to be an interesting discussion of his work developing novel social network analysis algorithms.

ABSTRACT: One of the primary tasks of social network analysis is the identification of the influential actors in a social network. Centrality measures based on one's structural position, such as betweenness, closeness and degree centrality, are widely applied to various social networks for this purpose. However, these measures often suffer from prohibitive computational cost, non-intuitive assumptions, and limited applications. Meanwhile, with the explosive emergence and the widespread accessibility of online social network sites, large scale networks with multiple types of entities, such as author-publication, actor-movie, employee-email networks, are ubiquitous and readily available. However, due to size and multiple modes, classical centrality measures are helpless in such cases.

In this talk, I first present algorithms for pure social networks (actor-actor networks), then an algorithm for multi-mode networks. In pure social networks, centrality algorithms are good candidates. However, these centrality measures suffer from several issues: they either look solely at the structure of the network, disregarding issues like the attention nodes have to give to others, or make a shortest-path interaction assumption that might be impractical in large networks. Algorithms for pure social networks are not able to take advantage of the abundant information hidden in multi-mode (heterogeneous) networks. I developed an algorithm to analyze such heterogeneous networks. The algorithm iterates from one type of object to another, and the importance of objects flows through these different types of edges. This algorithm is based on empirical observations: influential actors are likely to collaborate with influential others; good collaboration products tend to be in good groups.
Fox elected President of Earth Science Information Partners Federation
January 9, 2014
Professor Peter Fox has been elected President of the Earth Science Information Partners Federation (ESIP; esipfed.org). Fox will serve a 2-year term as ESIP celebrates its 15th year with over 100 institutional/project members under the theme of "Making Data Matter". He has previously served several terms as a member representative to the ESIP executive committee and the Foundation for Earth Science (ESIP's parent organization) board. Fox has also chaired the Semantic Web cluster activity since its inception (2005). TWC/RPI has been a member of ESIP since 2008.
TWC at AGU Dec. 9-13
December 7, 2013
The Tetherless World Constellation will be well represented at the 2013 Fall Meeting of the American Geophysical Union (AGU), December 9 - 13, with 6 attendees and over 15 posters, presentations, and invited talks, as well as an Academic Booth in the Exhibit Hall (Booth 1316).

So if you're at AGU next week, stop by our booth, and check out the TWC Calendar of Posters/Presentations/Talks [Download]
Dr. Joanne Luciano to co-teach course on Semantic eHealth in Israel
December 7, 2013
Dr. Joanne S. Luciano will be co-teaching a course with Dr. Eitan Rubin of Ben-Gurion University of the Negev titled "Semantic eHealth: getting more out of biomedical data using Semantic Technology", December 22-25, 2013 at Ben Gurion University of the Negev in Israel.

The course will introduce a set of advanced tools that can be used to integrate biomedical data and use it to answer clinical questions. The course introduces the new field of data science, with an emphasis on how it relates to biomedical research. It provides knowledge of the standards and best practices that enable integration across the web and data mining at web scale.
TWeD Lightning Talks Fall 2013
December 3, 2013
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Lightning Talks: Wednesday, December 4, 2013 7pm

Winslow Building, Room 1140, RPI Campus, Troy, NY

Towards the end of each semester we hold Lightning Talks during TWeD, a chance for researchers, staff, and students to briefly (3 minutes) talk about what they are doing in the lab.

This special TWed event is a great way for the greater TWC Community to learn of the wide range of amazing research happening at TWC, and "a good time (is) had by all!"
Rensselaer Professor Deborah L. McGuinness Named Fellow of the AAAS
November 25, 2013
Web scientist and Rensselaer Polytechnic Institute Tetherless World Research Constellation Professor Deborah L. McGuinness has been selected as a fellow of the American Association for the Advancement of Science (AAAS).
Ph.D. Thesis Defense Announcement for Xian Li
November 18, 2013
The Tetherless World Constellation is proud to announce the successful completion of Xian Li's Thesis Defense.

TITLE: Dynamics of Investor Attention on the Social Web

ADVISOR: Professor James Hendler

ABSTRACT: The World Wide Web has been revolutionizing how investors produce and consume information while participating in financial markets. Both the amount of information and the speed at which it flows have reached unprecedented magnitudes. The most preeminent change is the existence of ever-growing investor communities on the social web, which give rise to multidimensional information channels in real time. On the other hand, for the information consumer, what is immediately impacted is investor attention. Like other valuable resources in the economy, investor attention is limited. Therefore, it is crucial to understand how investors allocate their attention resources and the corresponding impacts on the financial markets.

Leveraging statistical analysis of “big data” related to real investors, this dissertation investigates micro-structures, temporal dynamics, and market impacts of investors’ selective attention by analyzing their tweeting activities on the social web. A hierarchy of complex systems is studied, ranging from individual investors’ cognitive processes at the micro level to economic outcomes at macro scales. The contribution of this thesis is composed of three parts, each of which is summarized as follows.

Contribution I investigates mechanisms of cognitive control in individual investor’s temporal selective attention. We develop formalisms of “cognitive niches”, i.e. interplays between heuristics from adaptive cognitive control, to account for the selectivity of investor attention. Utilization of these cognitive niches is validated by empirical observations of investors’ tweeting activities on assets. Such selective mechanisms are further shown to be contextual, depending on types of assets, investing experience as well as investing approach. Embedded in a highly connected social environment, investor attention is found to employ the “social proof” heuristic, and the drawing power of the crowd in directing investor attention is significant and exceeds that of salient exogenous stimuli, especially when uncertainty in the financial market is high.

Contribution II characterizes the dynamical system of collective investor attention on the social web. We identified stylized facts of collective cognition in terms of fluctuation and memory persistence. Temporal fingerprints left by collective investor attention share several common properties with other complex systems with strong heterogeneity and interactions, such as clustering and memory persistence. We identified scale-invariant fluctuations and long-range correlations from empirical observations; however, these regularities are not uniform across assets but suggest multi-scaling. To explicitly model the feedback mechanisms in collective investor attention, we propose a stochastic branching process as a coarse-grained generative model, which is shown to be a good fit for empirical tweeting behaviors, especially during busy trading hours. Such results not only highlight significant endogeneity, or self-reflexivity, within the system of collective investor attention, but also provide more quantitative and real-time measurements of investor attention on the social web.

Contribution III quantifies interactions between dynamics of investor attention on the social web and price movements in the financial market. First, we show that these two systems are significantly correlated at a variety of timescales, especially at small and intermediate timescales. At the small timescale, we found feedback relationships between investors’ tweeting activity and ranges of price movements, suggesting behavioral causes of “volatility clustering”. Furthermore, we illustrate distinct magnitudes and relaxation patterns of volatilities conditioning on investor attention of different cognitive controls. At intermediate timescales, we identified bidirectional causal relationships between collective investor attention on the social web and trading activities on the market, including volatilities, returns and trading volumes. By disentangling investor attention by nature in terms of cognitive controls, we demonstrate that both the magnitudes and lifespans of such lead-lag relationships vary. A robustness check demonstrates that, as a social tape, the dynamics of investor attention on the social web have their own information content beyond known behavioral biases.
TWeD Talk: Text Analysis of Large Metadata Catalogs
November 18, 2013
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, November 20, 2013, 7pm ET, Winslow Building on the RPI Campus

Please join us as TWC Ph.D. student Amar Viswanathan leads us through a discussion and demo of tools and methods used for analyzing large dataset catalogs!

ABSTRACT: We will demonstrate the application of traditional IR methods including entity extraction, tf-idf and (if time permits) topic modelling on large collections of metadata such as the International Open Government Dataset Catalog (IOGDS), which covers over 1M datasets, and the visualization of these results. The focus of this talk will be on demonstrating how to use certain simple tools to generate results and produce quick visualizations, including word clouds and graphs. We will also discuss how the kinds of analysis performed on IOGDS, including languages, categories, and keywords, may be used as source data for a question answering system like IBM's Watson.
Ph.D. Thesis Defense Announcement for Alvaro Graves
November 17, 2013
The Tetherless World Constellation is proud to announce the successful completion of Alvaro Graves' Thesis Defense.

TITLE: Improving the Use of Open Government Data Using Visualizations

ADVISOR: Professor James Hendler

ABSTRACT: The steady increase in Open Government Data (OGD) initiatives has created both opportunities and challenges for a wide range of users including government employees, journalists, researchers, scientists and engineers. Currently more than one million datasets have been made available by governments around the world, at national, regional and local levels. These datasets cover all activities in which governments are involved, namely: political boundaries, transportation networks, education performance, health related data, budgets and financial reports.

While these initiatives often report anecdotal success regarding improved efficiency and governmental savings, the potential applications of OGD remain a largely uncharted territory. In this work, we claim that there is an important portion of potential users who can benefit from the use of OGD, but who cannot do so because they cannot perform the essential operations needed to collect, process, merge, and make sense of the data. There are multiple reasons behind this, an important one being a fundamental lack of expertise and technical knowledge.

To mitigate these problems we propose the use of visualizations as a medium to consume, share and interact with data. The use of diverse styles of visualization has proven useful for understanding large quantities of data in multiple fields, ranging from military to economics to basic science. The problem with existing visualization tools and techniques is that they treat visualizations as finished artifacts; except in rare situations, current tools do not empower users to explore how data was used, from where it was obtained and how it was displayed. Current tools do not help users create derivative visualizations from existing ones, forcing users who want only a slightly modified version to create it from scratch. We claim that the use of visualizations can greatly enrich the use of open government data if these constraints are overcome.

This thesis presents a study focused on facilitating the use of Open Government Data by people lacking deep technical knowledge of how to use this data. One part of this study required identifying who these people are and discovering some of their most common problems related to the use of data, especially tasks that require the use of visualizations (whether to communicate the data, understand it, or any other function). It was also necessary for this thesis to create a tool that helped these people in different tasks that involve the use of visualizations and Open Government Data. The final part of this thesis presents the results of a user study showing how users can perform better in different tasks that involve the use of data using this tool. We will show that this tool provides a simpler environment for users to manipulate, create and share datasets and visualizations, leading to less effort and time on their part. The scope of this study includes stakeholders with an interest in Open Government Data including government employees, researchers and journalists.

Several people related to Open Government Data were interviewed for the purpose of identifying their needs and the problems they have with the use of Open Government Data. These people are related to or interested in OGD, whether as a producer or as a consumer. Based on these interviews, we defined a set of use cases and Personas (a well-known technique used in HCI and User Design) to characterize the different profiles that represent users in an Open Government Data Ecosystem. Based on these Personas and use cases, a web-based tool, OpenDataVis, was written that allows these users to create, explore and reuse visualizations based on Open Government Data. These visualizations not only expose the data but also allow users to show basic provenance metadata. This provenance metadata keeps account of how each visualization was created, where the data comes from, and when it was obtained.

A user study was defined to compare the performance of people creating, analyzing and reusing visualizations using OpenDataVis versus their current favorite tool. Participants reported their experience of performing each task. We showed that in many cases users with basic technical knowledge can perform tasks in less time and with less effort using OpenDataVis than with conventional tools. We also showed that in other cases users could perform tasks that otherwise would not be possible for them to do with conventional tools.

The main contributions of this thesis are:
  • Evidence of the interest from non-technical experts in the consumption, exploration and creation of visualizations based on Open Government Data.
  • A definition of what an Open Government Data Ecosystem (OGDE) is and of the different stakeholders involved in this ecosystem.
  • A characterization of stakeholders of an Open Government Data Ecosystem, their skills, knowledge and needs as data providers/consumers.
  • The creation of a tool that allows users to create, explore and reuse visualizations based on Open Government Data.
  • An evaluation of users' performance using this framework and comparing it with the current situation.
TWeD Talk: Inside Watson: Exploring the DeepQA Pipeline
November 12, 2013
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, November 13, 2013, 7pm ET, Winslow Building on the RPI Campus

Please join us for what promises to be an interesting TWed as Ph.D. student Simon Ellis provides some "deep" insights into Watson @ RPI!

The DeepQA pipeline is a composite engine made up of numerous sub-components that work together to answer questions put to the system in natural language. These components include NLP question analysis, search and search result processing, result typing, and scoring algorithms. The pipeline runs on the Unstructured Information Management Architecture (UIMA), a software platform designed for the development and deployment of multi-modal analytics for the analysis of unstructured information.
Policy Reasoning: Game-Changing Promise and Present Challenges
November 6, 2013
The Tetherless World Constellation at Rensselaer Polytechnic Institute welcomes K. Krasnow Waterman, LawTechIntersect LLC and MIT, for a talk on policy reasoning.

The talk will take place at 1:30pm ET on Wednesday, November 6, 2013 in the Winslow Building Room 1140 on the RPI campus.
TWeD Talk: The Rensselaer IDEA: A hub for data-intensive scientific discovery and innovation at RPI
October 29, 2013
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, October 30, 2013, 7pm ET, Winslow Building on the RPI Campus

Please join us for a very special TWeD as RPI IDEA Director and Constellation Prof. Jim Hendler leads us in a discussion of RPI IDEA!

The Rensselaer Institute for Data Exploration and Applications (IDEA) is responsible for leveraging the wealth of data science, high performance computing, predictive analytics, data visualization, and cognitive computing research being done at Rensselaer. The Rensselaer IDEA will be the hub for these and other multidisciplinary data-related programs and projects on campus, which range from health care, to business analytics, to smart buildings, to cybersecurity. The Rensselaer IDEA draws upon the power of four unique platforms: the CCNI supercomputing center; the Curtis R. Priem Experimental Media and Performing Arts Center; the Center for Biotechnology and Interdisciplinary Studies; and the IBM Watson cognitive computing system. According to RPI President Shirley Ann Jackson, “The Rensselaer IDEA is our university-wide effort to maximize the capabilities of these tools and technologies for the purpose of expediting scientific discovery and innovation, developing the next generation of these digital enablers, and preparing our students to succeed and lead in this new data-driven world.”
Fox gives ISWC 2013 keynote
October 24, 2013
Prof. Peter Fox delivered one of the keynote talks at the International Semantic Web Conference on Thursday 25 Oct. 2013 in Sydney, Australia.
TWeD Talk: A First Look at the Deep Carbon Observatory Data Portal
October 23, 2013
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, October 23, 2013, 7pm ET, Winslow Building on the RPI Campus

In support of the Deep Carbon Observatory, the DCO-Data Science Team has adapted, extended, and integrated several open source applications and frameworks to create a novel Web-based collaborative research platform well-suited to emerging science networks. In this talk the DCO-DS team will discuss how we have combined platforms including Drupal, VIVO, CKAN, and the Handle System in ways that leverage and reinforce knowledge networks inherent to the distributed research enterprise.

The DCO Data Portal is a Web-based service integrating an object-type repository, collaboration tools, an ability to identify and manage all key entities in the platform, and an integrated portal to manage diverse content and applications, with varied access levels and privacy options. In this informal talk we'll demonstrate how global science networks composed of people, diverse intellectual artifacts produced or consumed in research, organizational and/or outreach activities, as well as the relations among them can be modeled as knowledge networks, documented using formal ontologies and instantiated within platforms including the DCO Data Portal. Nodes within such networks may be people, organizations, datasets, events, presentations, publications, videos, meetings, reports, groups, and more. In such a heterogeneous ecosystem, common informatics approaches are used to co-design and co-evolve the needed research platforms to help ensure they reflect what real people want to use them for.
TWeD Talk: MealQA: A food-oriented intelligent assistant
September 23, 2013
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, September 25, 2013, 7pm ET, Winslow Building on the RPI Campus

During a summer internship at Samsung R&D Center, TWC grad student Yu Chen worked on the MealQA system, a food-oriented intelligent assistant which can answer natural language queries about food and dishes. For example, given a natural language query such as "What's the best restaurant for bibimbap?" the system is able to provide a list of Korean restaurants that offer bibimbap (based on menus and other data), ranked according to sentiment analysis and entity extraction from available reviews. Yu's responsibility was to develop a ranking engine that places the "most consistent" results at the front of the ranked list. The techniques include Bayesian networks and information theory. The dataset Yu trained the model on is based on Freebase, DBpedia and some proprietary datasets.
Professor James A. Hendler, Ph.D., appointed Director of the Rensselaer Institute for Data Exploration and Applications (IDEA)
September 17, 2013
Shirley Ann Jackson, Ph.D., President of Rensselaer Polytechnic Institute, announces the appointment of Professor James A. Hendler, Ph.D., as Director of the Rensselaer Institute for Data Exploration and Applications (IDEA).

James Hendler, Senior Constellation Chair and former Department Head, will now assume the responsibility of leveraging the wealth of data science, high performance computing, predictive analytics, data visualization, and cognitive computing research being done at Rensselaer. The Rensselaer IDEA will be the hub for these and other multidisciplinary data-related programs and projects on campus, which range from healthcare, to business analytics, to smart buildings, to cyber security.

Big data, broad data, high performance computing, data analytics, and web science are creating a significant transformation globally in the way we make connections, make discoveries, make decisions, make products, and ultimately, make progress. The Rensselaer IDEA is our university-wide effort to maximize the capabilities of these tools and technologies for the purpose of expediting scientific discovery and innovation, developing the next generation of these digital enablers, and preparing our students to succeed and lead in this new data-driven world.

The Rensselaer IDEA draws upon the power of four unique platforms: the Computational Center for Nanotechnology Innovations supercomputing center; the Curtis R. Priem Experimental Media and Performing Arts Center; the Center for Biotechnology and Interdisciplinary Studies; and, the IBM Watson cognitive computing system. Finding new and exciting opportunities to connect students and faculty with these powerful platforms will strengthen the position of Rensselaer as a world leader in data-related research.

Dr. Hendler is the Tetherless World Senior Constellation Chair at Rensselaer, and a member of the faculty in the Department of Computer Science and Department of Cognitive Science. Since joining the Institute in 2007, he has also served as head of the Department of Computer Science, and as assistant dean of the Information Technology and Web Science program.

Dr. Hendler received his bachelor's degree in computer science and artificial intelligence from Yale University, his master's degree in cognitive psychology and human factors engineering from Southern Methodist University, and his master's and doctorate degrees in computer science and artificial intelligence from Brown University. He has authored more than 200 technical papers in the areas of artificial intelligence, Semantic Web, agent-based computing, and high-performance processing.

One of the inventors of the Semantic Web, an extension of the World Wide Web that enables computers to better interpret the meaning and context of words, Dr. Hendler was a recipient of a 1995 Fulbright Foundation Fellowship. He is a former member of the U.S. Air Force Science Advisory Board, as well as a Fellow of the American Association for Artificial Intelligence, the British Computer Society, IEEE, and AAAS.

In addition to receiving numerous awards and accolades for his research and contributions to his field, Dr. Hendler serves as an "Internet Web Expert" for the U.S. government, providing guidance to the Data.gov project.
Dr. Joanne Luciano Gives Tutorial at Seventh IEEE International Conference on Semantic Computing
September 15, 2013
The Seventh IEEE International Conference on Semantic Computing will occur September 16th-18th at the Hyatt Regency Irvine, Irvine, California, USA.

Dr. Luciano will give a tutorial and workshop on Semantic Computing in Health Care (TWSCHC 2013).
TWeD Talk: Introduction to Collustra Visual SPARQL Editor
September 11, 2013
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, September 11, 2013, 7pm ET, Winslow Building on the RPI Campus

Ph.D. student Evan Patton will present and demonstrate an alpha version of Collustra, a new query-by-example inspired HTML5 application he has been working on with a group at MIT. Following the presentation there will be a group discussion on how tools such as Collustra could be used within Tetherless World and elsewhere to encourage access to linked data and what improvements are necessary to foster this growth.
TWeD Talk: SOCIAM: The Social Machine Project at SOTON
September 4, 2013
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, September 4, 2013, 7pm ET, Winslow Building on the RPI Campus

Ph.D. student Dominic DiFranzo leads us in a discussion about SOCIAM, the social machine project at the University of Southampton. Dominic's work on SOCIAM will focus on large scale Twitter data, the SOCIAM digital observatory, and building data tools for qualitative and quantitative researchers.
TWC/RPI Ph.D. Student Jin Zheng wins ESIP 2013 Summer Meeting Funding Friday Competition
July 12, 2013
Congrats to Jin Zheng for winning the Funding Friday Competition at the ESIP 2013 Summer Meeting with his proposal “Semantic Similarity Computation and Concept Mapping in the Earth and Environmental Sciences”. He is one of three winners of the competition at this meeting.

A fund of USD 3000.00 will be awarded to him to support his work, plus a travel and registration expense package for the ESIP 2014 Winter Meeting so that he can present his results.

Word is that this is the third ESIP meeting in a row where a member of the Tetherless World Constellation has won a Funding Friday Competition. Eric Rozell and Marshall Ma won previously.
TWC Students, Faculty, and Staff participate in The Jefferson Project at Lake George
June 27, 2013
At Rensselaer Polytechnic Institute, The Jefferson Project at Lake George partners the pioneering experimental methods of student and faculty researchers at the Darrin Fresh Water Institute with students and faculty across the Rensselaer campus, including those conducting leading-edge data and analytics research within the university’s Tetherless World Constellation.

"A three-year, multi-million dollar collaboration with the goal of understanding and managing complex factors—including road salt, storm water runoff and invasive species—threatening one of the world’s most pristine natural ecosystems and an economic cornerstone of the New York tourism industry. The collaboration partners expect that this world-class scientific and technology facility at Lake George will create a new model for predictive preservation and remediation of critical natural systems on Lake George, in New York, and ultimately around the world."
Launch of The Rensselaer Institute for Data Exploration and Applications
June 10, 2013
On Thursday June 13, 10:00am EST at the Experimental Media and Performing Arts Center, Studio 2, The Rensselaer IDEA (Institute for Data Exploration and Applications) will be launched. The Honorable Shirley Ann Jackson, Ph.D., President of Rensselaer Polytechnic Institute, will be hosting, and a reception will follow the launch.

Rensselaer IDEA Evite
Keynote NFDP13: The Now and Now of Dat
May 22, 2013
Keynote talk: Prof. Peter Fox delivered the opening keynote at the Now and Future of Data Publication. More details are available at http://tw.rpi.edu/web/event/NFDP2013
TWeD Graduate Research Lightning Talks
May 1, 2013
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, May 01, 7pm ET

Starting with the Spring 2012 term, TWC has held Graduate Research "Lightning Talks" sessions at the end of the term. This special TWed event is a great way for the greater TWC Community to learn of the wide range of amazing research happening at TWC, and "a good time (is) had by all!"
The PROV Family of Documents are W3C Recommendations
April 30, 2013
The Provenance Working Group was chartered to develop a framework for interchanging provenance on the Web. The Working Group has now published the PROV Family of Documents as W3C Recommendations, along with corresponding supporting notes. You can find a complete list of the documents in the PROV Overview Note. PROV enables one to represent and interchange provenance information using widely available formats such as RDF and XML. In addition, it provides definitions for accessing provenance information, validating it, and mapping to Dublin Core. Learn more about the Semantic Web.

Participating from the Tetherless World Constellation were Tim Lebo, Jim McCusker, and Stephan Zednik.
TWeD Talk: Toward Linked Data Applications via MIT's App Inventor
April 22, 2013
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, April 24, 7pm ET

Are you interested in combining Android app development and the Semantic Web? Join us in Winslow 1140 on Wednesday (7pm) as TWC Ph.D. student Evan Patton leads us through a HANDS-ON tutorial, "Toward Linked Data Applications via MIT's App Inventor."
TWeD Talk: Semantic Web Development Methodology in Practice: The iChoose Ontology as a Use Case
April 3, 2013
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, April 03, 7pm ET

A team of collaborators from the University at Albany and TWC RPI will introduce the NIST-funded Generalized Ontology Evaluation Framework (GOEF) as well as discuss its application in ontology-oriented projects.
Ph.D. Thesis Defense Announcement for Ankesh Khandelwal
March 29, 2013
The Tetherless World Constellation is proud to announce the successful completion of Ankesh Khandelwal's Thesis Defense. Ankesh is the fifth student to complete their Ph.D. within the Tetherless World Constellation.

Title: Furthering the Continuous-Change Event Calculus: Providing for Efficient Description of Additive Effects and an Automated Reasoner

Advisors: Peter Fox, James Hendler

Abstract: The Semantic Web refers to a Web of interconnected data enriched with semantics. It subscribes to logic-based representations of knowledge through W3C standards such as the Resource Description Framework and the Web Ontology Language for encoding clear semantics. To date, however, knowledge representation has been confined to descriptions of artifacts or data. We began the research reported here in pursuit of the inclusion of knowledge about physical processes and natural laws into the Semantic Web. Such knowledge could then be combined with experimental data, for example, in a largely automated fashion, for new inferences. In this pursuit we explored the extensive research in the field of reasoning about actions and changes, and deduced that the (circumscriptive) Event Calculus is the most expressive logic-based formalism available for describing continuous changes.

In this thesis we extend the Event Calculus formalism with new predicates for descriptions of discrete and continuous additive effects whose semantics are given via aggregate formulas in first-order logic. To the best of our knowledge this is the first application of aggregate formulas in first-order logic, even though aggregates have been in use in other logics such as answer-set programming. The frame problem is one of representing the effects of actions without explicitly representing all their non-effects. Nonmonotonic reasoning via circumscription is used in the Event Calculus as a solution to the frame problem. We show, however, that circumscription which was defined for first-order logic without aggregates is inadequate for modeling the frame problem in the extended Event Calculus if used as it is for formulas with aggregates, as it selects anomalous models.

We extend the circumscription transformation to first-order logic with aggregates, named the CIRCA transformation. CIRCA transformation is a generic transformation that addresses a general problem of identifying and not selecting unintended models, classified formally as weak models, that circumscription normally selects in the presence of aggregates. We deploy CIRCA transformation for resolving the frame problem in the extended Event Calculus.

Finally, we devise a method for constructing models for given, numerical, and finite Event Calculus domain descriptions, given an initial state and narratives of external actions. An Event Calculus system evolves through alternating phases of continuous changes and instantaneous discontinuous changes. The devised method involves separation of logic and equations reasoning through syntactic derivations of new axioms from the given domain descriptions, such that discontinuous changes, equations for trajectories of continuous changes, and mathematical conditions for next discontinuous changes are determined from logic reasoning while trajectories of continuous changes and the time for next discontinuous change are determined from equations reasoning. With this separation, the state of the art logic reasoners and equation solvers can be combined to implement an automated model builder for the Event Calculus. We have implemented a prototypical reasoner using the DLVHEX logic reasoner and Mathematica libraries.

The results of this thesis may encourage the use of logic formalisms/systems for descriptions of dynamical systems with quantitative descriptions of continuous changes. Additive effects are very common in concurrent systems, and the extended Event Calculus allows for general, concise and elaboration-tolerant descriptions of them, which among other things makes the descriptions amenable to sharing, reuse, and modular development. The prototypical model-builder for the continuous-change Event Calculus formalism broadens its scope beyond theory, positioning it for use in practice. Finally, we hope that these are some crucial steps towards realizing a process modeling language for the Semantic Web alluded to in the beginning.
TWeD Talk: Building Semantic Web Applications using Java Servlets
March 27, 2013
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, March 27, 7pm ET

TWC Graduate student Evan Patton will lead us through an interactive tutorial on using Java Servlets to create semantically-enabled Web applications.
RDA Keynote: Can it get any more important than this?
March 18, 2013
Prof. Peter Fox delivered the opening keynote at the first Research Data Alliance Plenary in Gothenburg, Sweden, indicating 5 key perspectives for consideration. More details are available on RDA at http://www.rd-alliance.org/
TWeD Talk: Exploring Avenues for Research in Web Science
March 5, 2013
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk and Web Science poster session, Wednesday, March 06, 6pm ET

Beginning at 6pm in the Winslow Building on the RPI Campus, the Tetherless World Constellation will host a Web Science poster session in room 1140 followed at 7pm in the 2nd floor commons area by a discussion of Web Science research topics, organized and led by graduate student Kristine Gloria.
RPI's Tetherless World Constellation Team wins Health Data Platform Challenge
February 20, 2013
The Tetherless World Constellation (TWC) at Rensselaer Polytechnic Institute (RPI) is proud to announce that a team of RPI students has won first place in the HealthIT.org Health Data Platform (HDP) Metadata Challenge. The Office of the National Coordinator for Health Information Technology's HDP challenges were designed to create new functionalities for the U.S. Department of Health and Human Services' repository for open health data, HealthData.gov.

The Metadata Challenge was launched on June 5, 2012 in order to facilitate the application of common metadata standards to all open government data. Additionally, the challenge sought to improve designs for health specific metadata. The challenge entries were judged on the number of metadata and data sets the app was designed to accommodate, the use of open source software and the incorporation of best practices.

For the Metadata challenge, RPI's team of Jim McCusker, Timothy Lebo, Alvaro Graves and Kristine Gloria won the $20,000 first place award with an application that leveraged the healthdata.gov CKAN API (Application Programming Interface) and the complete catalog of datasets on healthdata.gov to create multiple resources for organizing data and automating many of the data processes. The Tetherless World Team presented a set of in-house developed tools enabling the discovery of, access to, and integration of the Health and Human Services’ datasets as Linked Government Data.

TWeD Talk: A Semantic Tour of F#
February 20, 2013
There's always something happening on Wednesday evenings in the Tetherless World!

TWeD Talk, Wednesday, February 20, 7pm ET

TWC graduate student Amar Viswanathan has been exploring Microsoft Research's "TryF#" portal http://www.tryfsharp.org/, an online site for F# learning, code development and deployment, and the development of F# "type accessors" for use with RDF.
Ph.D. Thesis Defense Announcement for Gregory Williams
February 19, 2013
The Tetherless World Constellation is proud to announce the successful completion of Gregory Williams' Thesis Defense. Greg is the fourth student to complete their Ph.D. within the Tetherless World Constellation.

TITLE: Planning and Evaluation of Federated Queries on the Web

ADVISOR: Professor James Hendler

ABSTRACT: The Web of Data continues to increase in size and diversity, providing access to large amounts of structured, linked data. However, existing approaches to querying this data often fail to make use of existing database access points and must resort to web crawling to collect data of interest. Furthermore, in order to provide efficient query answering over this data existing systems are forced to construct centralized database indexes, making it difficult to maintain up-to-date data. For approaches that do utilize existing databases, disregard for fundamental design principles of the Web results in query systems that lack some basic features of their web crawling counterparts. If an efficient query answering system can be provided that does not require centralized indexing, and leverages both existing databases and static web content, users may benefit from up-to-date access to structured, disparate data.

In this dissertation, we develop a federated query planning framework based on the RDF data model and the SPARQL query language. This framework is able to leverage the high performance of existing SPARQL databases while also providing access to linked data available as RDF documents on the web. These two access methods are used to provide a single interface to querying semantic data.

The primary challenge of evaluating queries over both SPARQL databases and linked data is in finding an efficient execution plan. Such a plan must perform better than the naive approach of completely decomposing the query and executing each subquery against each data source or traversing linked data by web crawling. Moreover, it must allow metadata discovered during query execution to be incorporated into the existing plan.

Given this, in this dissertation we develop three techniques to increase performance and flexibility of federated query evaluation: we develop a federated query planning algorithm that prioritizes the execution of subqueries that have high expected value (that is, expected relevant results with low latency); we develop a re-planning algorithm, able to augment an existing query plan with newly discovered data sources and a mechanism for discovering such sources; and we develop a server-side technique to greatly enhance the web cacheability of SPARQL query results.

Finally, the developed framework is designed using a traditional query planner, allowing it to integrate with and benefit from existing work on query planning and optimization.

To demonstrate the practicality of this federated query planning framework, we present results of empirical evaluation of the framework components over a real-world dataset of bibliographic data. These results show that the federated query planning, evaluation, and caching techniques are able to produce query results quickly and efficiently. The effects of several optimizations on the execution of federated queries are discussed, and their impact on performance is evaluated.
Prizms: Better Visualizations Catalyzed by Better Data
February 12, 2013
Timothy Lebo, Ph.D. student with the Tetherless World Constellation, presents at this week's TWeD on Wednesday night, February 13th.

Currently, Prizms is an agglomeration of several tools developed by the Tetherless World Constellation, each of which has been used to create a variety of semantic web applications. Together, they provide the basis for an end-to-end system that can be used for future applications that require knowledge-intensive, broad-data solutions.
Ph.D. Thesis Defense Announcement for Jesse Weaver
February 4, 2013
The Tetherless World Constellation is proud to announce the successful completion of Jesse Weaver's Thesis Defense. Jesse is the third student to complete their Ph.D. within the Tetherless World Constellation.

TITLE: Toward Webscale, Rule-based Inference on the Semantic Web via Data Parallelism

COMMITTEE: James A. Hendler, Christopher Carothers, Peter Fox, David Mizell (external member from YarcData)

FELLOWSHIP: Dr. Shirley Ann Jackson and Dr. Morris A. Washington Patroon Fellowship

ABSTRACT: This thesis considers the problem of scaling rule-based inference to large quantities of RDF data found on the Semantic Web. The general approach is one of data parallelism, that is, dividing data among processors such that the collective results of each processor's individual inference are the same as though inference were performed sequentially. In this way, theoretically speaking, more processors can be added to accommodate more data.

The problem is first considered from the perspective of the operational semantics of inference with production rules. The question is asked, under what conditions is embarrassingly parallel inference guaranteed to be correct? Sufficient conditions are determined and proven at both a fine-grained level close to the basic operational semantics and a more coarse-grained level that applies directly to rules. The conditions are placed on the relationship between rules and distribution schemes, that is, the way in which data is assigned to processors.

Then, a special class of distribution schemes is considered called replication schemes. Replication schemes require that individual data either be replicated to all processors or placed arbitrarily on some processor(s). The aforementioned conditions are then reformulated to consider replication schemes, which reveals that testing the conditions for replication schemes is reducible to satisfiability (SAT), and not only SAT but 2SAT. An augmented version of this reduction, which is a reduction to 3SAT, also accounts for the possibility of eliminating some rules in order to improve parallelization. These reductions, along with a proposed methodology for restricting rules, are used to derive restricted versions of the RDFS and OWL2RL rules that are amenable to parallel inference.

Finally, an evaluation is performed that tests these theoretical findings for restricted versions of RDFS and OWL2RL inference on two large, well-known datasets exceeding a billion triples: LUBM10K and BTC2012. The LUBM10K dataset represents an optimistic case, meaning that if performance is poor with LUBM10K, then it will likely be poor on many datasets. On the other hand, the BTC2012 dataset represents a pessimistic case, meaning that if performance is good with BTC2012, then it is likely that performance will be good with other datasets. While the usual scalability metrics are used (speedup, efficiency, etc.), the Karp-Flatt metric reveals that inference is almost entirely parallel for LUBM10K data, demonstrating the practical feasibility of the theoretical findings. However, for BTC2012, it must be ensured that there is sufficient memory and load-balancing to achieve this high level of scalability on distributed memory architectures. Regardless, for feasible cases, very low times are achieved for LUBM10K (seconds) and BTC2012 (minutes).
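For readers unfamiliar with the Karp-Flatt metric mentioned above, the following is a minimal sketch of its standard textbook definition, e = (1/s - 1/p) / (1 - 1/p), where s is the measured speedup on p processors. The function name and the sample values are illustrative assumptions and are not taken from the thesis; a value of e near zero indicates that the computation is almost entirely parallel.

```python
def karp_flatt(speedup: float, processors: int) -> float:
    """Experimentally determined serial fraction (Karp-Flatt metric).

    speedup    -- measured speedup on `processors` processors
    processors -- number of processors used (must be at least 2)
    """
    if processors < 2:
        raise ValueError("the metric is only defined for two or more processors")
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    inverse_p = 1.0 / processors
    return (1.0 / speedup - inverse_p) / (1.0 - inverse_p)


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    for s, p in [(4.0, 4), (2.0, 4), (1.0, 4)]:
        print(f"speedup={s}, processors={p}, serial fraction={karp_flatt(s, p):.3f}")
```

Perfect linear speedup (s equal to p) yields a serial fraction of 0, while no speedup at all (s equal to 1) yields 1.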
"healthdata.tw.rpi.edu: TWC's Response to the HHS Health Data Platform Metadata Challenge""healthdata.tw.rpi.edu: TWC's Response to the HHS Health Data Platform Metadata Challenge"
January 23, 2013
The Spring 2013 TWed Talks begin with Jim McCusker, Tim Lebo, and Alvaro Graves providing an overview of healthdata.tw.rpi.edu, their submission to the Department of Health and Human Services (HHS) Developer Challenge that demonstrates the use of Linked Data to "establish learning communities that collaboratively evolve and mature the utility and usability of a broad range of health and human service data".
Launch of the thematic series on Semantic Technologies in Healthcare and Life Sciences
December 17, 2012
Joanne S. Luciano, Research Professor at the Tetherless World Constellation, is happy to announce today's launch by the Journal of Biomedical Semantics of the thematic series on Semantic Technologies in Healthcare and Life Sciences.

http://www.jbiomedsem.com/series/SWAT4LSCSHALS

Semantic technologies in healthcare and life sciences
Edited by: Prof Jonas Almeida, Dr Albert Burger, Prof Joanne Luciano, Dr Andrea Splendiani

Collection published: 17 December 2012

This thematic series focuses on the application of web based technologies for knowledge representation and data integration in life sciences, that seek to facilitate biomedical research and healthcare practice. The series originates in research presented at two conferences, SWAT4LS (Semantic Web Application and Tools for Life Sciences) which is held annually in Europe, and CSHALS (Conference on Semantics in Healthcare and Life Sciences), which is held annually in the United States. These two venues foster critical discussions on the limits, challenges, and opportunities in the adoption of semantic web technologies in healthcare and life sciences.

The emergence of the Web as the primary communication medium; the ever increasing amount of biomedical information and the convergence of disciplines in the biomedical spectrum are all phenomena that point at the Web as a promising technology platform to increase the efficiency of biomedical research and healthcare delivery. At the same time, they make evident the need for semantic approaches in order to integrate information that arises from different processes and disciplines in a meaningful way. At this convergence of Web and semantic solutions we are witnessing a front of innovation where various approaches, including Semantic Web and Linked Data solutions, are proposed.

The objective of this thematic series, and of the events that underpin it, is to explore this front of innovation, both in the critical assessment of current technologies and in novel proposals. The series is open for any submission that fits into these objectives.

Research
Semantically enabling a genome-wide association study database
Tim Beck, Robert C Free, Gudmundur A Thorisson, Anthony J Brookes
Journal of Biomedical Semantics 2012, 3:9 (17 December 2012)

Research
Analysing Syntactic Regularities and Irregularities in SNOMED-CT
Eleni Mikroyannidi, Robert Stevens, Luigi Iannone, Alan Rector
Journal of Biomedical Semantics 2012, 3:8 (17 December 2012)

Software
COEUS: "semantic web in a box" for biomedical applications
Pedro Lopes, José Luís Oliveira
Journal of Biomedical Semantics 2012, 3:11 (17 December 2012)

Research
Applying semantic web technologies for phenome-wide scan using an electronic health record linked Biobank
Jyotishman Pathak, Richard C Kiefer, Suzette J Bielinski, Christopher G Chute
Journal of Biomedical Semantics 2012, 3:10 (17 December 2012)
Alan Chartock In Conversation w/ Dr. Nigel Shadbolt, Dr. Jim Hendler, Jeanne Holm, Dr. Theresa Pardo
December 13, 2012
Alan Chartock of WAMC (Northeast Public Radio) is joined by attendees at the International Conference on the Theory and Practice of Electronic Governance, a meeting at the University at Albany that brings together hundreds of technology and government experts from around the world.
TWeD Lightning Talks
December 11, 2012
On Wednesday, December 12, 2012, the Tetherless World Constellation will be hosting Lightning Talks as part of our TWeD series. Students in the lab will talk about the projects and research they are working on.
Joshua Shinavier on Semantic Sensor Networks
November 27, 2012
Joshua Shinavier, a PhD student in Computer Science in the Tetherless World Constellation as well as a very active software developer, presents this Wednesday for TWeD on "Semantic Sensor Networks". This will be a high-level introduction to the emerging domain of semantic sensor networks, illustrated with live demos of tools and techniques developed at TWC. We will explore sensor data as RDF streams, continuous SPARQL and stream reasoning, as well as potential applications in ubiquitous and wearable computing.
TWeD Talk: Active and Social Data Curation: Using Semantic Content Management to Reinvent the Business of Community-scale Lifecycle Data Management
November 5, 2012
Jim Myers, Director of The Sustainable Environments-Actionable Data (SEAD) Project, Director of RPI CCNI and Clinical Professor of Computer Science, will present.

Effective long-term curation and preservation of data for community use has historically been limited to high-value and homogeneous collections produced by mission-oriented organizations. Straightforward extension of current models leads to the call for training, evangelism, formal vocabularies up front, and vastly increased funding as the best means of broadening community-scale data management. Within the Sustainable Environments-Actionable Data (SEAD) project, we question this reasoning and are exploring how alternative approaches focused on the overall data lifecycle, using lightweight semantic content management, and acknowledging the sociological and business realities of distributed multidisciplinary research communities might dramatically lower costs, increase value, and consequently drive dramatic advances in our ability to use and reuse data, and ultimately enable more rapid scientific advance. Specifically, we've introduced the concepts of active and social curation as a means to decrease coordination costs, align costs and values for individual data producers and data consumers, and improve the immediacy of returns for data curation investments. In this presentation, Jim Myers will describe his team's thinking and present a bit of the specific architecture and services - tools you can use! - for active and social curation that are being prototyped within the SEAD project within NSF's DataNet network and discuss how they are motivated by the long-tail dynamics in the cross-disciplinary sustainability research community we're supporting.
Semantic Tech In Service Helping First Responders Stay Safe
October 29, 2012
It’s probably safe to say that people want their firefighters, EMTs, law enforcement officers and other emergency responders to be as well equipped for their jobs as possible, so that they can be successful and well-protected, too.

Semantic technology can have a hand in making sure that happens. Deborah McGuinness, Tetherless World Senior Constellation Professor, and John Erickson, Director of Web Science Operations, both of Rensselaer Polytechnic Institute (RPI), are spearheading an effort, thanks to funding from the National Institute of Standards and Technology (NIST) under program manager William G. Billotte, to use semantic technology and social media to help that organization better understand what the requirements should be for these heroes.
World Wide Web Expert Jim Hendler Receives Inaugural Strata “Big Data” Award
October 26, 2012
Jim Hendler, head of the Department of Computer Science and senior constellation professor in the Tetherless World Constellation at Rensselaer, has been honored with an inaugural Strata Data Innovation Award, given to individuals who have made significant innovations in the data field.
Introducing the Tetherless World Constellation's Web Science Research Center
October 10, 2012
Joanne Luciano will be introducing the Tetherless World Constellation's Web Science Research Center during a Web Science Trust Network (WSTNet) webinar on October 10, 2012 from 11am to 12 noon Eastern Time. Online registration is open, and you can find more information at the TWC WSTNet Site.
ESIP Grant for Xiaogang Ma
August 9, 2012
Xiaogang (Marshall) Ma, a postdoctoral research associate at TWC, won the FUNding Friday Competition at the ESIP Summer Meeting 2012, Madison, WI, USA for his proposal "Exploratory visualization of earth science data in a Semantic Web context".

The Federation of Earth Science Information Partners (ESIP) is an open networked community that brings together science, data and information technology practitioners in the United States. The FUNding Friday Competition takes place at the annual summer meeting of ESIP. It awards three US$5,000 mini-grants to fund small projects that are inspired by ESIP collaboration or participation. Marshall won the grant at the ESIP Summer Meeting 2012 in July. He will carry out a study that chains together semantic web technologies, data visualization and online earth science data to develop exploratory functions for information and knowledge discovery. Marshall was also invited to present his research results at the ESIP Winter Meeting 2013, to be held in Washington, DC in January 2013.
Five Questions for July 9, 2012: Jim Hendler
July 9, 2012
Rensselaer Polytechnic Institute professor Jim Hendler is entering his fifth year at the university, where he was recently named the head of the computer science department. He will continue to teach a web science course once a year in addition to his new administrative duties.
Jim Hendler to Head CS Dept at RPI
June 26, 2012
Professor Jim Hendler has been named the new head of the Department of Computer Science at Rensselaer Polytechnic Institute. Hendler is currently a senior constellation professor in the Tetherless World Constellation and program director of the Information Technology and Web Science (ITWS) program at Rensselaer. He will be stepping down from his leadership of the ITWS Program to assume the department head post.
Deborah McGuinness to co-give tutorial titled Ontology 101: An Introduction to Knowledge Representation and Ontology Development
June 4, 2012
Deborah McGuinness, Tetherless World Senior Constellation Chair and Professor of Computer and Cognitive Science, and the founding director of the Web Science Research Center at Rensselaer Polytechnic Institute, along with Elisa Kendall, who has over 30 years of professional experience in the design, development and deployment of enterprise-scale information management systems, will be co-giving a tutorial at the Semantic Technology and Business Conference in San Francisco, CA, on Monday, June 4, 2012 at 1:15 PM Pacific Time.
Tetherless World Students Set to Participate in Internships Across the Country
June 4, 2012
Every summer students go out into the world to gain more experience and make new connections.
Here are some of the exciting opportunities TWC students are taking part in.



Undergraduate Students
  • Alexei Bulazel - First ever Data.gov intern
  • Alvin Lee - DNS Engineer at Comcast
  • Stephen McAuliffe - Tetherless World Constellation, Semantic eScience Framework
  • Ali Nendick - Tetherless World Constellation, Semantic eScience Framework
Graduate Students
  • Evan Patton - Qualcomm Corporate R&D Systems Engineering Intern
  • Eric Rozell - Microsoft Research Intern
  • Jin Guang Zheng - DC
  • Linyun Fu - GE
  • Ping Wang - Amazon - Software Dev. Engineer
  • Benno Lee - NASA
  • Thiru Manikandakumarasamy - Bank of America
  • Sapan Shah - Barclays
Eric Meyer talk - May 31, 2012 - Digital Transformations of Research
May 29, 2012
Eric Meyer, Research Fellow, Oxford Internet Institute, University of Oxford, will be visiting the Tetherless World Constellation on Thursday, May 31. He will be giving a talk at 2:00pm in Winslow 1140 titled "Digital Transformations of Research".

Mark Musen talk May 10, 2012 4:00 P.M. Troy Rm 2012, When Professionals Codify Their Knowledge: Making Biology and Medicine "Computable"
May 2, 2012
Formal representations of professional knowledge are playing increasingly important roles in biomedicine. From the London Bills of Mortality, created in the seventeenth century, to the Gene Ontology, a resource developed during the past decade as an essential tool for work in many areas of modern biology, the biomedical community has latched onto the idea of encoding scientific and clinical knowledge for use by computers. Workers in health care and in the life sciences now take it for granted that professional societies will develop and promote the use of codified knowledge online. At the same time, the rush to develop formal ontologies in biomedicine has led to some rather questionable decision making. Governments have mandated the use of coding systems and ontologies in health care that are based on flawed models or flawed use of knowledge-representation systems. Biologists have been attracted to the promise of "ontological realism" as a foundation for scientific ontologies—often boxing them into a difficult philosophical corner. In this talk, I will examine the history of formal systems for representing knowledge in biomedicine, and I will discuss some of the technical and political difficulties that now confront workers in clinical medicine and the life sciences.
Professor Peter Fox awarded 2012 Ian McHarg Medal at EGU 2012
April 27, 2012
The 2012 Ian McHarg Medal is awarded to Peter Fox for his contribution to recognizing the fundamental importance of establishing informatics as a genuine discipline within the Earth Sciences.
Deborah L. McGuinness to Give Keynote at the Linked Open Data and Team Science Workshop
April 11, 2012
Deborah L. McGuinness is giving a keynote at the Linked Open Data and Team Science workshop on April 19, co-located with the Science of Team Science Conference at Northwestern University.
Deborah L. McGuinness to Give Keynote at OpenTravel 2012 Advisory Forum
April 10, 2012
Deborah L. McGuinness will be giving the opening keynote entitled "The Potential of Semantic Technologies in Travel" on Wednesday April 11th, 2012 at the OpenTravel 2012 Advisory Forum in Miami, Florida.
Jiao Tao earns her Ph.D.
March 19, 2012
Jiao Tao is now Dr. Tao after successfully defending her thesis, "Integrity Constraints for the Semantic Web: An OWL 2 Description Logic Extension", on March 19, 2012.
Constellation Professor Fox Presented with Lifetime Achievement Award
January 4, 2012
January 4, 2012 -- The Federation for Earth Science Information Partners held its annual awards ceremony during the Winter ESIP Federation meeting. Professor Peter Fox of the Tetherless World Constellation at Rensselaer Polytechnic Institute was the recipient of this year's Martha Maiden Lifetime Achievement Award for Outstanding Service to the Earth Science Information Community. The award was instituted in 2009.

In presenting the award to Professor Fox, Mark Parsons of the National Snow and Ice Data Center offered, "Everyone knows and respects Peter. It took me about five minutes to get a half dozen top experts from around the world to support Peter's nomination. And it's clear why." Mr. Parsons went on to note Dr. Fox's involvement with many of the leading science and data organizations in the world such as AGU, EGU, IUGG, the International Council of Science and CODATA, or through the dozens of informatics committees Professor Fox has led or inspired.

In accepting the award, Professor Fox said, "This award in particular thus is very special because to me, it's about what I have done, and maybe how I've done it rather than who I actually am." In thinking about the ESIP Federation and other community organizations, Dr. Fox noted, "For this community really to go forward it must complement the highly application-based nature of this organization and organizations like it with a very strong academic component to this community. We have some of those people in this room but we certainly don't have enough. We don't only produce things, but we think about how we're producing them so that our sustainable future is based on an understanding of how we do things."
Rensselaer Professors Gilbert and Hendler Selected as 2011 AAAS Fellows
December 22, 2011
Two members of the Rensselaer Polytechnic Institute science faculty have been selected as fellows of the American Association for the Advancement of Science (AAAS). Susan Gilbert, professor and head of biology, and James Hendler, senior constellation professor in the Tetherless World Constellation and head of the information technology and web science program, are two of the 539 newly selected AAAS fellows. They were recognized for their efforts to advance science or its applications that are deemed scientifically or socially distinguished, according to AAAS. The announcement will be made in the Dec. 23, 2011, issue of the journal Science.
Learn Ontology Development at SemTechBiz DC
November 23, 2011
Elisa Kendall, Partner – Thematix LLC, and Deborah McGuinness, Tetherless World Chaired Constellation Professor – Rensselaer Polytechnic Institute (RPI), will be giving the featured presentation at SemTechBiz DC. The conference takes place at the Kellogg Conference Hotel in Washington, DC, November 29-December 1, 2011, and will feature highly anticipated sessions led by top thinkers in the fields of Open Government, Content Management, Enterprise Data Management, Linked Data, and much more.
Shinichiro Nakamura talk 4:00pm 11/16/11, Limits, Losses, and Environmental Impacts of Recycling of Ferrous Materials Embedded in End-of-Life Vehicles: Hybrid Analysis with Explicit Consideration of Scrap Quality
November 14, 2011
A talk by Shinichiro Nakamura, Faculty of Political Science and Economics, Waseda University, Tokyo, Japan and Ecotopia Science Institute, Nagoya University, Nagoya, Japan. The talk will take place at 4:00pm on Wednesday, November 16, 2011 in room 1140 of the Winslow Building, Rensselaer Polytechnic Institute. Read more...
Boyan Brodaric talk 1:30pm 11/15/11, Ontology Design for Theory-driven Semantic e-Science
November 14, 2011
A talk by Boyan Brodaric, Research Scientist, Geological Survey of Canada in Ottawa, Canada. The talk will take place at 1:30pm on Tuesday, November 15, 2011 in room 1140 of the Winslow Building, Rensselaer Polytechnic Institute. Read more...
Vasco Furtado talk 10:30am 11/08/11, The WikiCrimes System: Research and Development in Collaborative Mapping
November 8, 2011
A talk by Vasco Furtado, Professor, Computer Science at the University of Fortaleza, Brazil. The talk will take place at 10:30am on Tuesday, November 8, 2011 in room 1140 of the Winslow Building, Rensselaer Polytechnic Institute. Read more...
TWC Team Wins Triplification Challenge at I-SEMANTICS
September 15, 2011
The annual I-SEMANTICS Conference hosted its fourth Triplification Challenge, an event aimed at stimulating the availability of large quantities of RDF data and showcasing practical applications built on that data. The Challenge consisted of a general open data track and a dedicated open government data track, with a winner selected for each. The prize money of 1000 Euro each was sponsored by Wolters Kluwer Germany. The open government data track award was won by the TWC team of John Erickson, Yongmei Shi, Li Ding, Eric Rozell, Jin Zheng and Jim Hendler for their contribution, the TWC International Open Government Dataset Catalog (IOGDC). Ph.D. student Eric Rozell presented the team's work and accepted the award on behalf of the team. Eric's innovative S2S faceted browser is featured in the IOGDC live demo.
Medha Atre becomes first Tetherless Ph.D.
August 19, 2011
Medha Atre is now Dr. Atre after passing her thesis defense on August 19th. She is the first Tetherless World Constellation doctoral student to receive a Ph.D. since the constellation was formed 4 years ago by Dr. Jackson.
Ph.D. Thesis Defense Announcement for Medha Atre
August 15, 2011
Tetherless World Constellation
Department of Computer Science
Thesis Advisor: Professor James A. Hendler

Bit-by-Bit: Indexing and Querying RDF data using Compressed Bit-Vectors

Friday, August 19, 2011
Winslow 1140 - 2:30 p.m.

The Resource Description Framework (RDF) is being widely adopted as a standard for information representation in various domains, e.g., biotechnology (UniProt), government data (the data.gov project), scholarly resources (DBLP), web resources (DBPedia), connections between people (FOAF) and many more. Due to the increasing use of RDF as a standard for data representation in the past few years, the amount of RDF data available on the web has increased at break-neck speed. This has necessitated addressing two important issues -- efficient ways of (a) storing, and (b) querying the RDF data.

In this thesis, we propose a novel way of storing RDF data and an efficient way of processing SPARQL join queries -- BitMat -- which is specifically aimed at low-selectivity SPARQL join queries. BitMat uses the well-known technique of compressed bit-vectors to store RDF data by representing it as a 3-dimensional bit-cube (subject, predicate, object as each dimension of the bitcube). The key aspect of our join query processing is a novel 2-phase algorithm combining the idea of semi-joins and multi-way joins.
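
The bit-cube idea and the two-phase join described above can be illustrated with a minimal sketch. This is not BitMat's actual implementation: plain Python integers stand in for compressed bit-vectors (no compression is performed), and the names `BitCube`, `bits`, and `join`, the ID scheme, and the sample triples are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def bits(mask):
    """Yield the positions of the set bits in an integer bit-vector."""
    i = 0
    while mask:
        if mask & 1:
            yield i
        mask >>= 1
        i += 1


class BitCube:
    """Toy subject x predicate x object bit-cube (assumption: uncompressed ints)."""

    def __init__(self, triples):
        # Hypothetical ID scheme: one dense integer ID space shared by all terms.
        self.terms = sorted({t for triple in triples for t in triple})
        self.ids = {term: i for i, term in enumerate(self.terms)}
        # rows[p][s] is a bit-vector whose set bits are the object IDs o
        # for which the triple (s, p, o) exists.
        self.rows = defaultdict(lambda: defaultdict(int))
        for s, p, o in triples:
            self.rows[self.ids[p]][self.ids[s]] |= 1 << self.ids[o]

    def subjects_mask(self, p):
        """Bit-vector of all subjects that occur with predicate p."""
        mask = 0
        for s in self.rows[self.ids[p]]:
            mask |= 1 << s
        return mask

    def objects_mask(self, p):
        """Bit-vector of all objects that occur with predicate p."""
        mask = 0
        for row in self.rows[self.ids[p]].values():
            mask |= row
        return mask


def join(cube, p1, p2):
    """Answer the pattern (?x p1 ?y) . (?y p2 ?z) in two phases."""
    # Phase 1 (pruning, in the spirit of a semi-join): keep only ?y values
    # that are objects of p1 AND subjects of p2, using one bitwise AND.
    keep = cube.objects_mask(p1) & cube.subjects_mask(p2)
    # Phase 2: generate results from the pruned candidates only.
    results = []
    for s, row in cube.rows[cube.ids[p1]].items():
        for y in bits(row & keep):
            for z in bits(cube.rows[cube.ids[p2]][y]):
                results.append((cube.terms[s], cube.terms[y], cube.terms[z]))
    return sorted(results)


# Hypothetical sample data.
triples = [
    ("alice", "knows", "bob"),
    ("alice", "knows", "carol"),
    ("bob", "worksAt", "rpi"),
    ("dave", "worksAt", "ibm"),
]
cube = BitCube(triples)
matches = join(cube, "knows", "worksAt")
```

In this sketch the pruning phase discards "carol" and "dave" before any results are built, which is the point of pruning candidate triples ahead of result generation.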

We have also extended BitMat's technique of /pruning/ the candidate RDF triples to process DISTINCT and OPTIONAL clauses in SPARQL. Specifically for the DISTINCT clause, BitMat's pruning phase can be used to omit BitMats that are not needed for final result generation. This further reduces the memory requirements of the query processor.

We have also given a detailed description of how other SPARQL clauses can be handled -- either using the BitMat algorithm, or by working on the results of the BitMat algorithm.

RDF data is primarily graph data, hence we have also proposed to extend the technique of compressed bit-vectors used in BitMat to solve /label order constrained reachability/ queries on RDF graphs. We propose a system -- BitPath -- which uses compressed bit-vectors to build specific indices for each node in the RDF graph and a query processing algorithm based on greedy pruning and a divide-and-conquer approach to solve this problem.

Thus in this thesis we are addressing two important aspects of querying RDF data -- (a) performance intensive SPARQL queries, and (b) label-order-constrained-reachability queries. We show that using characteristics of compressed bit-vectors in both -- BitMat and BitPath algorithms -- we can come up with efficient solutions to both the problems.
To Share Grievances, Microblogging the Frustrations of Flight
August 1, 2011
I’m not the most talkative person on airplanes. I used to be much friendlier, but an inevitable question is, “What do you do?” When I say, “I’m a professor of computer science,” I always wind up doing a lot of free consulting.
Wine Agent Highlighted on CBC's Spark
June 26, 2011
Right now the race is on to develop search that is both more accurate and more customized. What if there was an app that could act as an agent for you, taking all the meta-data from searchable pages, filtering and arranging them to make sense for you, and then giving you the perfect search result? You and your computer would work cooperatively to find exactly what you were looking for. This idea is called the semantic web – the ability to do search with meaning, rather than just syntax. Deborah McGuinness is a Professor of Computer Science and Cognitive Science at Rensselaer Polytechnic Institute, and the Chair of the Tetherless World Constellation. She tells Nora about a wine agent she’s developed as an early semantic web demonstration.
Elsevier and Tetherless World to Host Health and Life Sciences Semantic Web Hackathon (27-28 June 2011)
June 6, 2011
Create apps; Win Prizes!

The Tetherless World Constellation at RPI is excited to announce that TWC and Elsevier's SciVerse Developer Network will be holding a 24-hour Health and Life Sciences Semantic Web Hackathon 27-28 June 2011. The Elsevier-sponsored event will be held at the beautiful Pat's Barn on the campus of the Rensselaer Technology Park in Troy, NY.

Participants will compete with each other to develop apps using linked data from TWC and other sources, web APIs from Elsevier SciVerse, and visualization and other resources from around the Web.

Event website: http://tw.rpi.edu/web/event/TWCElsevierHackathonJune2011
Registration: http://twcsciverse2011.eventbrite.com/

Semantic Technologies Bear Fruit In Spite of Development Challenges
March 4, 2011

Despite the complexities associated with semantic technologies, efforts to adopt the approach for drug development are bearing fruit, according to several presentations at last week's Conference on Semantics in Healthcare and Life Sciences in Cambridge, Mass.

This year's conference began with a series of hands-on tutorials coordinated by Joanne Luciano, a research associate professor at Rensselaer Polytechnic Institute, that were intended to show how the technology can be used to address drug development needs.

During the tutorials, participants used semantic web tools to create mashups using data from the Linked Open Data cloud and semantic data that they created from raw datasets. Participants were shown how to load data into the subject-predicate-object data structure dubbed the "triple store;" query it using the semantic query language SPARQL; use inference to expand experimental knowledge; and build dynamic visualizations from their results.

Semantic Sommelier Press Release
February 23, 2011
In the restaurant of the future, you will always enjoy the perfect meal with that full-bodied 2006 cabernet sauvignon, you will always know your dinner companions’ favorite merlot, and you will be able to check if the sommelier’s cellar contains your favorite pinot grigio before you even check your coat. These feats of classic cuisine will come to the modern dinner through the power of Semantic Web technology. Read more at the source.
Web Experts Ask Scientists To Use the Web To Improve Understanding, Sharing of Their Data in Science Magazine
February 14, 2011
Peter Fox and James Hendler of Rensselaer Polytechnic Institute are calling for scientists to take a few tips from the users of the World Wide Web when presenting their data to the public and other scientists in the Feb. 11 issue of Science magazine. Fox and Hendler, both professors within the Tetherless World Research Constellation at Rensselaer, outline a new vision for the visualization of scientific data in a perspective piece titled “Changing the Equation on Scientific Data Visualization.”
Tetherless World Undergrad Research Program welcomes 10 students for Spring term!
February 11, 2011
The Tetherless World Constellation (TWC) at RPI welcomes ten students to its Undergraduate Research Program for Spring 2011, its largest group yet!
Jeopardy - The IBM Challenge - Watson
February 11, 2011
Joe Donahue speaks with John Kolb (Vice President for Information Services and Technology and Chief Information Officer), Jim Hendler (Tetherless World Research Constellation professor of computer and cognitive science) and Chris Welty (Research staff member, IBM Watson Research Center) about Watson.
Rensselaer Data Scientists Collaborate on $2 Million Grant to Study Oceans
February 8, 2011
Partnership with Woods Hole Oceanographic Institution to Provide Scientists with Important New Tools to Study Ocean Ecosystems

Peter Fox and Charles Stewart, data scientists at Rensselaer Polytechnic Institute, are beginning a large-scale collaboration with the Woods Hole Oceanographic Institution (WHOI), supported by a grant of more than $2 million from the Gordon and Betty Moore Foundation.
Faculty, Staff, and Students to represent TWC at AGU's Fall Meeting 2010
December 11, 2010
Professors Peter Fox and Deborah L. McGuinness, along with staff members Patrick West, Stephan Zednik, and Cynthia Chang and students Eric Rozell and Evan Patton, will be representing the Tetherless World Constellation at the American Geophysical Union's 2010 Fall Meeting. TWC will have seven presentations and three posters, and will also be presenting a poster on the SWaMP project on behalf of their Australian collaborators at the Commonwealth Scientific and Industrial Research Organisation.
New Application Allows Scientists Easy Access to Important Government Data
December 10, 2010
Computer scientists within the Tetherless World Research Constellation at Rensselaer Polytechnic Institute have developed an application to help solve the problem. A collaboration with scientific publisher Elsevier, the application utilizes the U.S. government data warehouse, Data.gov, to provide scientists with easy and direct access to government data sets relevant to their research.
TWC wins 2nd prize at Semantic Web Challenge 2010
December 1, 2010
Li Ding accepts the second prize in the open track of the 2010 Semantic Web Challenge for the development of "TWC LOGD: A Portal for Linking Open Government Data."
Joanne Luciano Joins Renowned Web Science Research Group at Rensselaer
November 24, 2010
Luciano brings expertise in health care and life science research to the Tetherless World Research Constellation.
NY Times blogs about Data.gov
November 18, 2010
The Tetherless World Constellation's own LOGD project, led by James Hendler, was written about in the N.Y. Times technology blog "Bits".
White House Visitors app now available for iPhone and iPad
November 8, 2010
The White House Visitors application is now available as a mobile application for iPhone, iPad, and iPod touch.
Academic Minute - Deborah McGuinness
August 24, 2010
RPI Tetherless World Senior Constellation Professor Dr. Deborah McGuinness discusses ongoing research in the development of smarter phones.
Data.gov to Relaunch Friday
May 19, 2010
The new site will also highlight third-party efforts to build Data.gov-based applications. For example, it will link to work done by a small team led by Prof. James Hendler at Rensselaer Polytechnic Institute to quickly build data visualization apps and mashups from government data sets, including visualizations of the White House visitor list, a map of ozone levels nationwide, and maps and visualizations of international aid levels.
Patrick West joins Tetherless World Constellation Staff
January 1, 2009

The Tetherless World Constellation is proud to welcome Patrick West as Senior Software Engineer.

Patrick has worked as a Software Engineer with Professor Peter Fox at the University Corporation for Atmospheric Research (UCAR) since 2002 in the area of Semantic Representation of data for various solar terrestrial projects. Patrick has assisted in the design and development of the Virtual Solar Terrestrial Observatory with Professors Peter Fox and Deborah L. McGuinness.

Rensselaer Polytechnic Institute announces the launch of the Tetherless World Constellation
June 11, 2008
Rensselaer Polytechnic Institute hosts a round-table discussion with leading visionaries of the World Wide Web on the future of the Web. The panel includes Professor James Hendler, Semantic Web Visionary; Professor Deborah L. McGuinness, Web Language Expert; Sir Tim Berners-Lee, Inventor of the Web and Director of the World Wide Web Consortium; Wendy Hall, President-Elect of the ACM and Sr. V.P. of the Royal Academy of Engineering; Nigel Shadbolt, Former President of the BCS and CTO of Garlik; and Nova Spivack, High-Tech Entrepreneur.

The event takes place on the RPI campus, June 11, 2008 starting at 2:30pm in the Center for Biotechnology and Interdisciplinary Studies Auditorium.

Opening remarks will be given by RPI President Shirley Ann Jackson, Ph.D., and will be streamed live via the web.

This event marks the launch of the Tetherless World Constellation, a research constellation at RPI chaired by Professors James Hendler and Deborah L. McGuinness.