Data Science Day experience
The two-day workshop was short but really impressive: I learned a lot more about how domain scientists work with their data and what their requirements are.
The first day started with a series of presentations from the secretaries of the DCO steering team. Most of the talks were about how they manage their data and the rationale behind it. Coming from a computer science background, we have seen lots of paradigms for managing datasets, so their approaches were nothing new to us. However, it was still interesting to see that a large proportion of the audience was impressed by these ways of doing science. I would say part of the purpose of this workshop was to persuade the domain scientists to use the information platform instead of being satisfied with their current working style. Among the big-name presentations, Kerstin Lehnert presented their work on a geochemical data management platform, which is one of the most widely used. They also promote a global identifier, IGSN, to give every data sample a unique, dereferenceable address. In this sense, what they have done is very similar to our goal, just in a slightly different domain. I guess part of the reason for this talk was to signal to domain scientists that what we have done for the DCO will significantly improve productivity in the realm of the Deep Carbon Observatory.
The group discussion was the most interesting part. I was assigned to the Deep Energy and Deep Life group, and the domain scientists really tried to challenge us with all sorts of requirements. Some of the requirements were easy to handle, while others were more like asking for a Google, Facebook, and Amazon for their domain of research. They simply want to find datasets for their research through intuitive queries, publish their new findings to their peers, and have the system recommend new or updated datasets that match their needs based on their lookup history. These are all good suggestions, definitely. But the first step is still to have people upload some datasets. Han and I tried to explain this preliminary requirement to them and demoed a dataset upload. Suddenly, they realized that it is not difficult at all, and that there is a lot of fun in branding themselves in a place that is also very visible in their research community.
The second day was really nerve-racking for me, as a number of new users would be coming to our DCO-VIVO platform and trying it out. Most parts worked fine, although we did have to make some quick fixes to keep the website functioning properly. Currently, we don’t have many active users outside the data science team, but the demo day was a good start: it gave us a chance to help our potential users contribute to the network by uploading some of their research theses or datasets.
Even though the VIVO platform is the state of the art for semantic social networks, that doesn’t mean its user interface is optimally customized for different groups of users. There is still a lot of work to be done to attract more users, especially on the user interface. That’s why we are still working on customizing it for the DCO. Let’s keep shipping!