A few weeks ago I attended the 2014 Geodata Workshop. Like the previous Geodata workshop in 2011, this one focused on policies and techniques for improving inter-agency geographic data integration and data citation. While recommendations for data citation and geodata integration have advanced since the last Geodata workshop, I felt the mood of the attendees indicated that we are in much the same place we were in 2011. There was strong consensus on the importance of data citation and integration, but also a feeling that no one is really doing it at scale, that the tools aren’t where we need them to be, and that agency policies are not yet in a state to drive widespread adoption. Despite these hurdles, this is a community that is clearly excited and willing to take the first steps towards making widespread data integration and data citation a reality in the geodata community.
Meanwhile, in the trenches…
I had several conversations with attendees who represent publishers of oceanographic vocabularies. Many of these vocabularies have been publicly available for several years, but have traditionally been 3-star open data (publicly available in a non-proprietary machine-readable format, no links to external vocabularies). These publishers are excited about upgrading their vocabulary services to 5-star open data (use open W3C standards such as RDF/SPARQL, identify things with resolvable URIs, link to other people’s data) because they see a major benefit in being able to refer to the authoritative source for a term or identified resource that is related to their vocabulary but for which they are not the authoritative source. This is a great example of a group that has already identified a specific real-world need for, and benefit from, integration and that is actively laying the groundwork to make that integration successful. This group was enthusiastic about cross-linking their vocabularies, and I have no doubt their efforts will be viewed as a data integration success at the next Geodata workshop.
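To make the jump from 3-star to 5-star a little more concrete, here is a minimal sketch in Python of what one cross-linked term might look like when written out as SKOS in Turtle. Everything specific in it is an assumption for illustration only: the `Concept` class and `to_turtle` helper are hypothetical, and the `example.org` URIs stand in for a publisher's own term and for another group's authoritative term. The only real identifiers are the SKOS namespace and the `skos:Concept`, `skos:prefLabel`, and `skos:exactMatch` terms.

```python
from dataclasses import dataclass, field
from typing import List

# The W3C SKOS core namespace (real); all other URIs below are placeholders.
SKOS = "http://www.w3.org/2004/02/skos/core#"


@dataclass
class Concept:
    """Hypothetical, minimal model of one vocabulary term."""
    uri: str                      # resolvable URI that identifies the term
    pref_label: str               # human-readable English label
    exact_matches: List[str] = field(default_factory=list)  # external URIs


def to_turtle(concept: Concept) -> str:
    """Serialize a concept as Turtle, one skos:exactMatch per external link."""
    # Escape backslashes and double quotes so the label is a valid literal.
    label = concept.pref_label.replace("\\", "\\\\").replace('"', '\\"')
    lines = [
        f"@prefix skos: <{SKOS}> .",
        "",
        f"<{concept.uri}> a skos:Concept ;",
        f'    skos:prefLabel "{label}"@en',
    ]
    for target in concept.exact_matches:
        lines[-1] += " ;"         # another predicate follows the previous one
        lines.append(f"    skos:exactMatch <{target}>")
    lines[-1] += " ."             # terminate the statement
    return "\n".join(lines)


# Placeholder term linked to a placeholder "authoritative" term elsewhere.
concept = Concept(
    uri="http://example.org/our-vocab/seaSurfaceTemperature",
    pref_label="sea surface temperature",
    exact_matches=["http://example.org/authority/SST"],
)
print(to_turtle(concept))
```

The `skos:exactMatch` line is the part that turns a standalone vocabulary into a linked one: instead of redefining the term, the publisher points at the source that is authoritative for it. A real vocabulary service would more likely use an RDF library and a triple store than string formatting; this sketch only shows the shape of the data.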
Where we can help…
As a result of these discussions, our lab is starting a Linked Vocabulary API effort to provide a Linked Data API configuration specialized for publishing SKOS vocabularies. Our goal is to develop a configuration that makes bootstrapping a RESTful linked data API for a SKOS vocabulary simple and accessible to the broad scientific community. This effort is based on work we previously did for the CMSPV project.
What I will remember most from Geodata 2014 is the excitement members of the community showed for adopting new technologies and techniques and making widespread data integration and citation a reality. Where conventions have yet to be established, the community is willing to take the first steps and establish best practices. Where policies have yet to be formalized, the community is ready to work with policy makers to ensure clear and helpful policies are established. Whenever the next Geodata workshop is held, I am confident that its narrative will be full of success stories that began at the 2014 workshop.