Archive for the ‘Web Science’ Category

Fall 2010 TWC Undergraduate Research Summary

December 20th, 2010

The Fall 2010 semester marked the beginning of the Tetherless World Constellation’s undergraduate research program at Rensselaer Polytechnic Institute (RPI). Although TWC has enjoyed significant contributions from RPI undergrads since its inception, this term we stepped up our game by more “formally” incorporating a group of undergrads into TWC’s research programs, establishing regular meetings for the group, and, with input from the students, beginning to outfit their own space in RPI’s Winslow Building.

Patrick West, my fellow TWC undergrad research coordinator, and I asked the students to blog about their work throughout the semester; at the end of term, we asked them to post summary descriptions of their work and their thoughts about the fledgling TWC undergrad research program itself. We’ve provided short summaries and links to those blogs below…

  • Cameron Helm began the term coming up to speed on SPARQL and RDF, experimented with several of the public TWC endpoints, and then worked with Phillip on basic visualizations. He then slashed his way through the tutorials on TWC’s LOGD Portal, eventually creating impressive visualizations such as this earthquake map. Cameron is very interested in the subject of data visualization and looks to do more work in this area in the future.
  • After a short TWC learning period, Dan Souza began helping doctoral candidate Evan Patton create an Android version of the Mobile Wine Agent application, with all the amazing visualization and data integration required, including Twitter and Facebook integration. Mid-semester, Dan also responded to the call to help with the “crash” development of the Android/iPhone TalkTracker app, in time for ISWC 2010 in early November. Dan continues to work with Evan and others toward early 2011 releases of Android, iPhone/iPod Touch and iPad versions of the Mobile Wine Agent.
  • David Molik reports that he learned web coding skills, ontology creation, and server installation and administration. David contributed to the development and operation of a test site for the new, Semantic Web-savvy website for the Biological and Chemical Oceanography Data Management Office (BCO-DMO) of the Woods Hole Oceanographic Institution.
  • Jay Chamberlin spent much of his time working on the OPeNDAP Project, an open source server for distributing scientific data stored in various formats. His involvement included everything from learning his way around the OPeNDAP server, to working with infrastructure such as TWC’s LDAP services, to helping migrate documentation from the previous Wiki to the new Drupal site, to actually implementing required changes to the OPeNDAP code base.
  • Phillip Ng worked on a wide variety of projects this fall, starting with basic visualizations, helping with ISWC applications, and including iPad development for the Mobile Wine Agent. Phillip’s blog is fascinating to read as he works his way through the challenges of creating applications, including his multi-part series on implementing the social media features.
  • Alexei Bulazel began working with Dominic DiFranzo on a health-related mashup using Data.gov datasets and is now working on a research paper with David on “human flesh search engine” techniques, a topic that top thinkers including Tetherless World Senior Constellation Professor Jim Hendler have explored in recent talks. Note: For more background on this phenomenon, see e.g. China’s Cyberposse, NY Times (03 Mar 2010).

Many of these students will be continuing on with these or other projects at TWC in 2011; we also expect several new students to be joining the group. The entire team at the Tetherless World Constellation thanks them for their efforts and many important contributions this fall, and looks forward to being amazed by their continued great work in the coming year!

John S. Erickson, Ph.D.

The Web as Critical Infrastructure

June 29th, 2010

I was recently asked to talk about Web Science to the US President’s Innovative Technology Advisory Committee. My comments focused on the role of the Web as a critical infrastructure, and on the need for us to better understand it. More on what I said, along with my written remarks, can be found on my Nature.com blog at:

http://blogs.nature.com/jhendler/2010/06/30/the-web-is-a-critical-infrastructure—we-must-understand-it

Categories: personal ramblings, Web Science

Big Data for the Cloud and the Crowd

April 1st, 2010

Researchers have long been starved for big data to improve the quality of their research. Nowadays big data is no longer a dream but something real on the Web: an increasing amount of data is becoming available for public access from research communities, individuals, government agencies and others. So what does such big data mean to Web users, and how can we best use it? The following are some potential benefits of big data.

“Make sense of what is already known”. Scientific research grows progressively, and scientific discoveries are founded on the knowledge we already have. In order to avoid reinventing the wheel, we should preserve what we know as part of big data and make it available to ongoing research. Currently, keyword search services such as Google Scholar have successfully helped researchers retrieve previous research work. Beyond that, well-organized knowledge about past research is needed to give users a systematic and accurate way to access past work. With better knowledge of what has been done, users can better identify promising research directions and approach new discoveries.

“Support hypothesis generation and testing”. With big data in hand (or publicly accessible), not only scientists but also the general public can start thinking more about hypotheses, including theoretical models and pop-science questions. A modest use of big data would be for users to conveniently aggregate distributed big data with an interactive application and then invent or evaluate their hypotheses against that data. One step forward would be to apply powerful AI technology (especially statistical methods) to big data to help users identify similar or unique data and hypotheses, prioritize potentially interesting candidate hypotheses, and even come up with new hypotheses.

“Support persistence and accountability”. If big data are going to be the foundation for massive scientific research and public use, reliable data availability is needed by all applications that depend on the data. Meanwhile, without effective accountability mechanisms over the distributed and shared big data, conclusions derived from the big data may not be trusted.

To realize these benefits, the emerging field of Web Science seems very promising, as it brings many interesting opportunities for dealing with big data:

“Linked Data” [1]. Big data is not merely a massive collection of information islands bounded by their physical locations, and the value of big data can be greatly increased if its parts are effectively linked (or networked). As with hyperlinks on the Web, it is very important to turn implicit inter-data connections into declarative ones and make the links available as part of big data: a person’s medical records can be linked across different clinics and hospitals, demographic state statistics (e.g. livestock and gross income tax) can be linked across different government agencies [2], and information about a disease can be linked to entries at GenBank.
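As a rough illustration of turning an implicit connection into a declarative link, here is a minimal Python sketch. It is not taken from the Data-gov Wiki work: the two agencies, every URI under a.example and b.example, and all figures are invented for illustration, and matching records by an identical name string is a deliberately naive assumption. Only the owl:sameAs predicate URI is a real, standard identifier.

```python
# Hypothetical example: the agencies, URIs and numbers below are made up.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Two "information islands": each agency publishes (subject, predicate, object) triples.
agency_a = [
    ("http://a.example/state/NY", "http://a.example/vocab/name", "New York"),
    ("http://a.example/state/NY", "http://a.example/vocab/livestock", 1400000),
]
agency_b = [
    ("http://b.example/region/36", "http://b.example/vocab/label", "New York"),
    ("http://b.example/region/36", "http://b.example/vocab/grossIncomeTax", 52000000),
]


def names(triples, name_predicate):
    """Map each name literal to the subject URI that carries it."""
    return {o: s for s, p, o in triples if p == name_predicate}


def declare_links(a, a_name_pred, b, b_name_pred):
    """Turn the implicit connection (same name) into explicit sameAs triples."""
    a_names = names(a, a_name_pred)
    b_names = names(b, b_name_pred)
    shared = sorted(a_names.keys() & b_names.keys())
    return [(a_names[n], OWL_SAME_AS, b_names[n]) for n in shared]


def describe(subject, triples):
    """Collect facts about a subject, following sameAs links one hop."""
    aliases = {subject} | {
        o for s, p, o in triples if s == subject and p == OWL_SAME_AS
    }
    return {p: o for s, p, o in triples if s in aliases and p != OWL_SAME_AS}


links = declare_links(
    agency_a, "http://a.example/vocab/name",
    agency_b, "http://b.example/vocab/label",
)
linked_data = agency_a + agency_b + links
ny = describe("http://a.example/state/NY", linked_data)
```

Once the link is itself a piece of data, a question about one agency's record can pull in facts published by the other, which is the sense in which linking adds value. A real deployment would use RDF tooling and far more careful entity matching than a string comparison.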

“Social Machine” [3]. Big data should also interact with human society. Crowd sourcing, as seen in Wikipedia and Web rating systems, has added huge value to the knowledge on the Web. However, that is not yet the ultimate vision. We can combine the power of machines and humans to build the social machine: cloud computing services, such as Google search and Microsoft’s recently announced Web n-gram service, offer great computing power for processing massive data, while crowd sourcing, as in Wikipedia, can distribute the cost of solving hard problems across massive human intelligence on the Web and supply high-quality results. The social machine also supports interactive problem solving: there is a feedback loop between the cloud and the crowd, and consumers can feed comments and enhancements back to the publisher.

“Knowledge Provenance” [4,5]. Big data are often integrated when being used. Declarative knowledge provenance (e.g. an audit trail) is the foundation of transparency in distributed data processing. Computations on provenance data are the key to accountability, e.g. a policy framework to assure proper use of digital information and trust mechanisms to assure the credibility of reused data.
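To make the idea of computing over provenance a little more concrete, here is a small Python sketch. It is not PML and does not follow any particular provenance standard: the Record structure, the dataset names, and the "trusted only if every original source is trusted" policy are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One provenance entry: what was produced, how, and from what."""
    name: str
    process: str = "published"                    # how the data came to exist
    sources: list = field(default_factory=list)   # names of input records


def trace(name, log):
    """Return the audit trail for `name`: every record it depends on, inputs first."""
    trail, seen = [], set()

    def visit(n):
        if n in seen:
            return
        seen.add(n)
        for src in log[n].sources:
            visit(src)
        trail.append(n)

    visit(name)
    return trail


def is_trusted(name, log, trusted_publishers):
    """Illustrative policy: derived data is trusted only if all original sources are."""
    roots = [n for n in trace(name, log) if not log[n].sources]
    return all(r in trusted_publishers for r in roots)


# Hypothetical processing history for an integrated data product.
log = {r.name: r for r in [
    Record("census_table"),
    Record("tax_table"),
    Record("joined", process="join on state", sources=["census_table", "tax_table"]),
    Record("chart", process="aggregate and plot", sources=["joined"]),
]}

trail = trace("chart", log)
```

Here the trail for the chart lists the two published tables, then the join, then the chart itself, and `is_trusted("chart", log, {"census_table"})` is false because one of the original sources is not on the trusted list. Richer policies would also look at the processes applied, not just the sources.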

References

[1] Tim Berners-Lee, Linked Data, 2007 http://www.w3.org/DesignIssues/LinkedData.html

[2] Li Ding, Dominic DiFranzo, Alvaro Graves, James Michaelis, Xian Li, Deborah L. McGuinness, Jim Hendler, Data-gov Wiki: Towards Linking Government Data, in Proceedings of the AAAI Spring Symposium on Linked Data Meets Artificial Intelligence, 2010, http://data-gov.tw.rpi.edu/2010/linkedai-2010-datagov.pdf

[3] J. Hendler, T. Berners-Lee, From the semantic web to social machines: A research challenge for AI on the World Wide Web, Artificial Intelligence (2009), http://dx.doi.org/10.1016/j.artint.2009.11.010

[4] Deborah L. McGuinness, Li Ding, Paulo Pinheiro da Silva, Cynthia Chang, PML 2: A Modular Explanation Interlingua, in Proceedings of the AAAI’07 Workshop on Explanation-Aware Computing, 2007, http://www.ksl.stanford.edu/KSL_Abstracts/KSL-07-07.html

[5] Li Ding, Provenance and Search Issues in RDF Data Warehouse, in Proceedings of SemGrail Workshop, 2007, http://research.microsoft.com/en-us/events/semgrail2007/lid_position.pdf

Li Ding, April 1, 2010

Categories: linked data, Web Science

Semantic Web for the Working Ontologist: Japanese Preface

January 19th, 2010

Dean and I were very pleased to learn that “Semantic Web for the Working Ontologist” is being published in Japanese. We were asked to write a preface for the Japanese version. Since it will only appear in print in Japanese, I thought I’d share it here in English (pre-translation):

We are very pleased to be able to write this new Preface introducing the Japanese translation of our book. Japanese researchers have been involved in Semantic Web technologies since the very early days, and we are honored that our book has been chosen for translation and republication to make it more accessible to the Japanese audience.

In the less than two years since this book was published, we have seen a large growth of interest in the Semantic Web and the new Web applications it makes available. This includes the commercial interest in new enterprise solutions, in new ways to bring data to the Web, and in the large-scale “Web 3.0” applications that can be enabled by combining Semantic Web data with other Web applications. New terms such as “semantic search,” “intelligent match,” and “virtual personal assistant” are starting to make it out of the laboratories and into the world of Web startups. Turning the mass of data available through the Web into useful knowledge increasingly demands new techniques and new technologies to succeed, and the Semantic Web is becoming more recognized as an important player in the growing Web world.

One of the reasons for the increasing interest in these technologies is the lack of success that “folksonomies” and Web 2.0 approaches have had in stemming the growing tide of Web information. In fact, just the opposite has happened: new media such as blogs, social networks and Twitter™ have led to people spending more and more time on the Web, but with less and less ability to find the specific things they need. Without semantics, the Web is turning into a wonderful wonderland for entertainment, but less and less a productive space for solving the real problems being faced by people, companies and governments in today’s increasingly complex world.

As this interest has grown, it has also become clear that an ability to model at some level is critical to the successful application of these technologies. To get a first demo up and running is not hard, but just as a real application of a database must include a data model, so must a real application using semantic technologies include a model of the information of interest – an ontology. In this book, we provide you with the background necessary to begin to understand, and build, Semantic Web ontologies. As our title implies, our goal is to help the “working ontologist” – with our focus on the practice, rather than the theory, of Semantic Web development. We focus on the “how,” rather than the “why,” so as to enable you to better understand how to use these important new technologies.

We appreciate your welcoming us into the Japanese marketplace. We particularly thank the translators who are helping us bring the book into your language and the developers of the use cases added to this Japanese edition of the book so as to better show how these technologies are already having an impact in Japan. We thus hope this translation of our book will further your ability to develop innovative applications both within Japan and in the increasingly global economy.

Dean Allemang and Jim Hendler, 1/1/2010

Categories: Semantic Web, Web Science

My Personal (unofficial) Semantic Web FAQ — a pointer

September 1st, 2009

The joy of multiple blog sites is having to post pointers to one blog entry from another.

My blog at nature.com now has an entry entitled “The Semantic Web: My personal (unofficial) FAQ” which lives at http://network.nature.com/people/jhendler/blog/2009/08/03/the-semantic-web-my-personal-unofficial-faq. Comments, and especially your suggestions for Qs and As, are more than welcome there or here (or anywhere else for that matter).

Cheers,

Jim H.
