Archive for the ‘Blog’ Category

DCO-DS participation at Research Data Alliance Plenary 5 meeting

April 30th, 2015

In early March I attended the Research Data Alliance Fifth Plenary and “Adoption Day” event to present our plans for adopting DataTypes and Persistent Identifier Types in the DCO Data Portal. This was the first plenary following the publication of the data type and persistent identifier type outputs, and the RDA community was interested in seeing how early adopters were faring.

At the Adoption Day event I gave a short presentation on our plan for representing DataTypes in the DCO Data Portal knowledge base. Most of the other adopter presentations were limited to organizational requirements or high-level architecture around data types or persistent identifiers – our presentation stood out because we presented details on ‘how’ we intended to implement RDA outputs rather than just ‘why’. I think our attention to technical details was appreciated; from listening to the presentations, it did not sound like many other groups were very far into their adoption process.

My main takeaways from the conference were the following:
– we are ahead of the curve on adopting the RDA data type and persistent identifier outputs
– we are viewed as leaders on how to implement data types; people are paying attention to what we are doing
– the chair of the DataType WG was very happy that we were thinking about how data types make sense within the context of our existing infrastructure rather than looking to the WG’s reference implementation as the sole way to implement the output
– the DataType WG reference repository is more a proof-of-concept than a production system
– the data type community is interested in the topic of federating repositories but is not ready to do much on that yet

Overall I think we are well positioned to be a leader on data types. Our work to date was very well received, and many members of the DataType WG will be very interested in what more we have to show next September at the Sixth Plenary.

Good work, team, and let’s keep it up!

Categories: Blog, tetherless world

Open Source Software & Science Reproducibility

January 14th, 2014

This year my contribution to the AGU Fall Meeting 2013 was all about the development of open source software to enable the reproducibility of scientific products, with both a poster and an oral presentation. AGU was the perfect opportunity to share my ideas on a topic that is one of my main interests.

This was my 2nd time at AGU, but my first time with an oral presentation, which turned into a real challenge!

The main issue was a combination of two factors. I had decided to generate the slideshow in realtime as HTML from an online IPython Notebook; I thought it would be cool to show this functionality as well as the work itself. That made me dependent on an internet connection at the time of the presentation, and at AGU the presenter computer doesn’t have an internet connection! Definitely not the best conditions for a web-based slideshow generated “on-the-fly” by the execution of an IPython Notebook.

I found out about the lack of connectivity only 2 days before my presentation. I must have misunderstood the AGU oral presentation guidelines, but when I didn’t find an explicit mention of the lack of an internet connection, I took it for granted that that wouldn’t be an issue. Big mistake!

I decided it would be safer to prepare a PowerPoint presentation, and some time later, I had one. Deep breath; I would be safe. But… what a disappointment!

I was so excited about the idea of showing my work running in realtime instead of showing a static (somewhat boring) ppt presentation!!!

I kept thinking about alternative solutions, though, and an idea quickly came to me. If the lack of internet stands in the way of an interactive, realtime demo, there should be no problem in running a static HTML slideshow instead; at least that is what I thought…

I used the IPython “nbconvert” utility and its “convert to slide” option, and I successfully converted my workflow from an interactive IPython notebook running in slideshow mode to a static HTML5 slideshow, yeah! The audience wouldn’t get to see how this was done, but at least they would get to see the result.

Happy with the final HTML presentation, I finally went to AGU’s “Speaker Ready Room” to upload and test it. Unfortunately, my HTML presentation would not run offline. The lack of internet was giving me trouble with missing JavaScript files, missing fonts, image URLs that had to be replaced with paths to static files, broken hyperlinks, etc. It was not as easy as I thought.
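Looking back, a quick check like the one below would have shown me in advance which parts of the exported slideshow still depended on the network. This is only a minimal sketch using the Python standard library, not what I actually ran at AGU: the function name, the list of attributes to inspect, and the sample HTML are all made up for illustration.

```python
from html.parser import HTMLParser

# Attributes that commonly point at scripts, stylesheets, fonts, images, or links.
# (Assumed list for illustration; a real export may use others.)
RESOURCE_ATTRS = {"src", "href", "data-src"}

# Prefixes that mean "fetch this over the network".
REMOTE_PREFIXES = ("http://", "https://", "//")


class RemoteResourceFinder(HTMLParser):
    """Collect (tag, url) pairs for references that need an internet connection."""

    def __init__(self):
        super().__init__()
        self.remote = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in RESOURCE_ATTRS and value and value.startswith(REMOTE_PREFIXES):
                self.remote.append((tag, value))


def find_remote_resources(html_text):
    """Return every remote reference found in the given HTML, in document order."""
    finder = RemoteResourceFinder()
    finder.feed(html_text)
    return finder.remote


if __name__ == "__main__":
    # Hypothetical fragment standing in for an exported slideshow.
    sample = (
        '<html><head>'
        '<script src="https://cdn.example.org/reveal.js"></script>'
        '<link rel="stylesheet" href="css/local.css">'
        '</head><body>'
        '<img src="figures/map.png">'
        '<a href="http://example.org/page">more</a>'
        '</body></html>'
    )
    for tag, url in find_remote_resources(sample):
        print(tag, url)
```

Anything the script reports has to be downloaded and re-pointed to a local path (or dropped) before the slideshow can run on a machine without a connection.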

It took more than 3 hours to fix all the bugs, on account of a really slow internet connection running from my phone, but finally I got my presentation running perfectly offline on the AGU computers!

In the end, my talk ran very smoothly. A complete workflow for “catchments characterization” using exclusively open source software, running online and fully reproducible thanks to the use of open source software and an open dataset! I felt really good, as I think I successfully got my message across, both in words and in actions.

To top it all off, my presentation came just at the right time. Before me, two other presentations during my session had mentioned the use of the IPython Notebook as an open source software tool to enable reproducibility of scientific work. They had highlighted that it shows great potential and that it deserves further investigation. I think my presentation gave them even more proof of that! Even the chairman acknowledged this when he stated: “Before we heard about it, but now we saw it in action!” I felt very proud of what I had done. The effort I put into running the HTML slideshow definitely paid off!!!

 

ESIP Winter Meeting 2013

January 24th, 2013

I presented a poster on how Semantic Web technology can help build an information model and information system for the National Climate Assessment (NCA). The NCA is a report that integrates, evaluates, and interprets findings on climate change and its impacts on affected sectors such as agriculture, the natural environment, and energy production and use. One problem in building an information system for the NCA is that the report draws on a wide range of information sources and covers many climate-related topics, which makes it difficult for users to find and identify the information they need. Using Semantic Web technology, we created a well-structured ontology in which the relationships between NCA-related entities and concepts are well defined. We also use other semantic technologies, such as PROV-O, SPARQL endpoints, and ontology-based facet browsers, to help solve the problem.

Overall, the presentation went well. A few people were particularly interested in how we leverage GCMD keywords and Clean Vocabulary when building the information system for the NCA, and in what benefits they can bring to the NCA report information system. Others found the Facet Search System interesting, especially its application to geo-related data.

I also attended Semantic Web-related sessions. One discussed how Semantic Web technology can help solve the “Tool Match” problem: an OWL ontology encodes the rules and concepts about tools and datasets, and description logic reasoners then perform the Tool Match. Another gave a tutorial on Semantic Web technology to the ESIP community. It is nice to see that Semantic Web technology really helps different communities solve various problems, and that people are becoming more and more interested in this technology.

Categories: Blog, Data Science, Semantic Web, tetherless world

Fall 2010 TWC Undergraduate Research Summary

December 20th, 2010

The Fall 2010 semester marked the beginning of the Tetherless World Constellation’s undergraduate research program at Rensselaer Polytechnic Institute (RPI). Although TWC has enjoyed significant contributions from RPI undergrads since its inception, this term we stepped up our game by more “formally” incorporating a group of undergrads into TWC’s research programs, establishing regular meetings for the group, and, with input from the students, beginning to outfit their own space in RPI’s Winslow Building.

Patrick West, my fellow TWC undergrad research coordinator, and I asked the students to blog about their work throughout the semester; with the end of term, we asked them to post summary descriptions of their work and their thoughts about the fledgling TWC undergrad research program itself. We’ve provided short summaries and links to those blogs below…

  • Cameron Helm began the term coming up to speed on SPARQL and RDF, experimented with several of the public TWC endpoints, and then worked with Phillip on basic visualizations. He then slashed his way through the tutorials on TWC’s LOGD Portal, eventually creating impressive visualizations such as this earthquake map. Cameron is very interested in the subject of data visualization and looks to do more work in this area in the future.
  • After a short TWC learning period, Dan Souza began helping doctoral candidate Evan Patton create an Android version of the Mobile Wine Agent application, with all the amazing visualization and data integration required, including Twitter and Facebook integration. Mid-semester Dan also responded to the call to help with the “crash” development of the Android/iPhone TalkTracker app, in time for ISWC 2010 in early November. Dan continues to work with Evan and others for early 2011 releases of Android, iPhone/iPod Touch and iPad versions of the Mobile Wine Agent.
  • David Molik reports that he learned web coding skills, ontology creation, and server installation and administration. David contributed to the development and operation of a test site for the new, semantic web savvy website for the Biological and Chemical Oceanography Data Management Office (BCO-DMO) of the Woods Hole Oceanographic Institution.
  • Jay Chamberlin spent much of his time working on the OPeNDAP Project, an open source server to distribute scientific data that is stored in various formats. His involvement included everything from learning his way around the OPeNDAP server, to working with infrastructure such as TWC’s LDAP services, to helping migrate documentation from the previous Wiki to the new Drupal site, to actually implementing required changes to the OPeNDAP code base.
  • Phillip Ng worked on a wide variety of projects this fall, starting with basic visualizations, helping with ISWC applications, and including iPad development for the Mobile Wine Agent. Phillip’s blog is fascinating to read as he works his way through the challenges of creating applications, including his multi-part series on implementing the social media features.
  • Alexei Bulazel began working with Dominic DiFranzo on a health-related mashup using Data.gov datasets and is now working on a research paper with David on “human flesh search engine” techniques, a topic that top thinkers including Tetherless World Senior Constellation Professor Jim Hendler have explored in recent talks. Note: For more background on this phenomenon, see e.g. China’s Cyberposse, NY Times (03 Mar 2010)

Many of these students will be continuing on with these or other projects at TWC in 2011; we also expect several new students to be joining the group. The entire team at the Tetherless World Constellation thanks them for their efforts and many important contributions this fall, and looks forward to being amazed by their continued great work in the coming year!

John S. Erickson, Ph.D.

Timeline of ISWC 2010 Main Conference Talks

November 1st, 2010

This is another visualization using Datapress.
It shows talks at the main conference of ISWC 2010.

Categories: Blog, iswc, tetherless world, visualization