July 27th, 2015

The ESIP 2015 Summer Meeting was held in Pacific Grove, CA, in the week of July 14-17. Pacific Grove is a beautiful place, with its coastline, sandy beaches and sunsets. What excited me more were the science and technical topics covered in the meeting sessions, as well as the opportunity to catch up with friends in the ESIP community. Excellent topics + a scenic place + friends = a wonderful meeting. Thanks a lot to the meeting organizers!

The theme of this summer meeting was “The Federation of Earth Science Information Partners & Community Resilience: Coming Together.” Though my focus was on sessions relevant to the Semantic Web and data stewardship, I was able to see the topic of ‘resilience’ in various presented works. It was nice to see that the ESIP community has an ontology portal. It is built on the BioPortal infrastructure and focuses on collecting ontologies and vocabularies in the field of Earth sciences. With more submissions from the community in the future, the portal has great potential for geo-semantics research, similar to what BioPortal does for bioinformatics. An important topic was reviewing progress and discussing directions for the future. Prof. Peter Fox from RPI offered a short overview. The ESIP Semantic Web cluster is nine years old, and it is nice to see that the cluster has helped improve the visibility of Semantic Web methods and technologies in the broad field of geoinformatics. A key feature supporting the success of the Semantic Web is that it is an open world that evolves and updates.

There were several topics or projects of interest that I recorded during the meeting:

(1) schema.org: It recently released version 2.0 and introduced a new mechanism for extension. There are now two types of extensions: reviewed/hosted extensions and external extensions. A reviewed/hosted extension (call it e1) gets its own chunk of the schema.org namespace: e1.schema.org. All items in that extension are created and maintained by their own creators. An external extension is created by a third party for a specific application. Extensions to location and time might be a topic for the Earth science community in the near future.

(2) GCIS Ontology: GCIS is a nice project that incorporates several state-of-the-art Semantic Web methods and technologies. The provenance representation in GCIS means it is not just a static knowledge representation. It is more about what the facts are, what people believe and why. In the ontology engineering for GCIS we also see the collaboration between geoscientists and computer scientists. That is, the conceptual model came first, as a product that geoscientists can understand, before it was bound to logic and an ontology encoding grammar. The process can be seen as falling within the scope of semiology. We can do a good job with syntax and semantics, but very often we struggle with pragmatics.

(3) PROV-ES: Provenance of scientific findings is receiving increasing attention. The Earth science community has taken a lead in the work of capturing provenance. The World Wide Web Consortium (W3C) PROV standard provides a platform for the Earth science community to adopt and extend. The Provenance – Earth Science (PROV-ES) Working Group was initiated in 2013. It has primarily focused on extending the PROV standard and has tested the outputs with sample projects. In the PROV-ES hackathon at the summer meeting, Hook Hua and Gerald Manipon showed more technical details of PROV-ES, especially about its encodings, discovery, and visualization.
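
To make the idea of a provenance record a bit more concrete, here is a minimal sketch in Python. It only borrows the core W3C PROV concepts (entity, activity, agent, and the relations used, wasGeneratedBy and wasAssociatedWith). It is not the PROV-ES encoding shown at the hackathon, nor a conforming PROV serialization. The function names and the "ex:" identifiers are my own assumptions for illustration.

```python
def make_provenance(dataset_id, activity_id, source_id, agent_id):
    """Build a small PROV-style record describing how a dataset was produced.

    The layout is loosely inspired by W3C PROV concepts; it is a
    simplified, hypothetical structure, not an official encoding.
    """
    return {
        "entity": {dataset_id: {}, source_id: {}},
        "activity": {activity_id: {}},
        "agent": {agent_id: {}},
        # The activity consumed the source entity ...
        "used": [{"activity": activity_id, "entity": source_id}],
        # ... and produced the dataset ...
        "wasGeneratedBy": [{"entity": dataset_id, "activity": activity_id}],
        # ... under the responsibility of the agent.
        "wasAssociatedWith": [{"activity": activity_id, "agent": agent_id}],
    }


def sources_of(record, dataset_id):
    """Return the entities used by the activities that generated dataset_id."""
    activities = {
        g["activity"]
        for g in record["wasGeneratedBy"]
        if g["entity"] == dataset_id
    }
    return sorted(
        u["entity"] for u in record["used"] if u["activity"] in activities
    )


# Hypothetical identifiers, for illustration only.
record = make_provenance(
    "ex:gridded_product", "ex:regridding_run", "ex:raw_observations", "ex:analyst"
)
print(sources_of(record, "ex:gridded_product"))
```

Even with such a toy structure, a question like "what was this product derived from?" becomes a simple traversal, which is the kind of discovery that a shared provenance standard is meant to support.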

(4) Entity linking: Jin Guang Zheng and I had a poster about our ESIP 2014 Testbed project. The topic is linking entity mentions in documents and datasets to entities in the Web of Data. Entity recognition and linking are valuable when working with datasets collected from multiple sources. Detecting and linking entity mentions in datasets can be facilitated by using knowledge bases on the Web, such as ontologies and vocabularies. In this work we built a web-based entity linking and wikification service for datasets. Our current demo system uses DBpedia as the knowledge base, and we have been collecting geoscience ontologies and vocabularies. A potential future collaboration is to use the ESIP ontology portal as the knowledge base. Discussions with colleagues during the poster session showed that this work may also be beneficial to work on dark data, such as pattern recognition and knowledge discovery from legacy literature.
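
The basic idea of entity linking can be sketched with a simple dictionary lookup. The Python snippet below is only an illustration and is not the implementation of our service. The tiny knowledge base, its labels and the function name link_entities are assumptions made for this example, and the URIs merely follow the DBpedia resource naming pattern.

```python
import re

# Toy knowledge base: lowercase label -> entity URI.
# Labels and URIs are illustrative assumptions, not our real data.
KNOWLEDGE_BASE = {
    "basalt": "http://dbpedia.org/resource/Basalt",
    "jurassic": "http://dbpedia.org/resource/Jurassic",
    "pacific ocean": "http://dbpedia.org/resource/Pacific_Ocean",
}


def link_entities(text, kb=KNOWLEDGE_BASE):
    """Return (mention, start, end, uri) tuples for KB labels found in text."""
    links = []
    taken = []  # character spans already claimed by a match
    # Try longer labels first so multi-word labels win over shorter ones.
    for label in sorted(kb, key=len, reverse=True):
        pattern = r"\b" + re.escape(label) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            # Skip matches that overlap a span already linked.
            if any(m.start() < end and start < m.end() for start, end in taken):
                continue
            taken.append((m.start(), m.end()))
            links.append((m.group(0), m.start(), m.end(), kb[label]))
    # Report mentions in the order they appear in the text.
    return sorted(links, key=lambda link: link[1])


for mention, start, end, uri in link_entities(
    "Jurassic basalt near the Pacific Ocean"
):
    print(mention, "->", uri)
```

A real system also has to disambiguate between candidate entities that share a label, which is where richer knowledge bases such as domain ontologies and vocabularies help.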

(5) Big Earth Data Initiative: This is an inter-agency coordination effort for geo-data interoperability in the US. Below I quote a part of the original session description to show the relationships among a few entities and organizations that were mentioned: ‘The US Group on Earth Observations (USGEO) Data Management Working Group (DMWG) is an inter-agency body established under the auspices of the White House National Science and Technology Council (NSTC). DMWG members have been drafting an “Earth Observations Common Framework” (EOCF) with recommended approaches for supporting and improving discoverability, accessibility, and usability for federally held earth observation data. The recommendations will guide work done under the Big Earth Data Initiative (BEDI), which provided funding to some agencies for improving those data attributes.’ It will be nice to see more outputs from this effort and to compare the work with similar efforts in Europe, such as INSPIRE, as well as the global initiative GEOSS.

