Author Archive

Another AGU and we all get wet from the rain in San Fran…

January 10th, 2015

The 2014 Meeting of the American Geophysical Union in the wet city of San Francisco has not yet faded from memory. Unfortunately, it may be remembered as the “year of the RFID mess” rather than for the great scientific progress. However, let’s start with the positive. Rensselaer’s Tetherless World was well represented by Patrick, Stephan, Marshall, Evan and Paulo (representing others including Linyun and Han) in talks, posters covering both research and project progress, and the academic booth (go RPI!). See http://tw.rpi.edu/web/event/AGU/FM/2014/Participation for what we did. This year, we presented in Informatics (IN) and Education (ED) sessions with talks and many posters.

Just on a logistics note, I was very pleased to have the exhibit hall adjoining one of the poster halls this year. This made moving between them, without missing one or the other, much easier. I hope that continues.

It was another excellent year for Informatics; I’ve misplaced the stats but suffice to say there were increasing numbers of abstracts, great student contributions and a sea of new faces. A continuing treat is the Leptoukh Lecture (honouring Greg L, whom I still miss very much). This year, Dr. Bryan Lawrence (working in the UK, but actually a Kiwi) gave a tour de force lecture on computation and data aspects of climate science. The attendance was excellent, clearly pulling in a wide cross-section of attendees from well beyond the IN folks. Thanks Bryan.

This year was the changeover for Informatics leadership, with Kerstin Lehnert taking over from Michael Piasecki as President; thanks Michael for your leadership and efforts over the last two years. Ruth Duerr (NSIDC) came in as President-Elect and Anne Wilson (CU/LASP) as Secretary. Diversity rules in Informatics!!!

In regard to IN poster sessions, we saw an increase in the flash mob approach. What is that, you ask? It is where, at an appointed time during the poster session, the session convener arranges for all poster presenters to be present. Having also advertised by Twitter, email and general coercion, they gather poster attendees around each poster (in order, down the row). The presenter has 5 minutes to present their poster and then the mob moves on. It has proven to be a very effective way of engaging both attendees and presenters. If the session organiser has pre-planned it, the sequencing of the posters can also be very effective. After each poster has been presented, many attendees stay to quiz the presenters of the specific posters they were interested in. The one aspect that makes this style hard is the general noise level in the poster hall. Poster presenters need to “speak up” and project their voice: not all are prepared for that, but it is very good practice!

I am author or co-author on quite a few presentations each year. This year I had two posters (both invited) as lead. You can see them via the link above. “Sixth generation of data and information architectures” and “Anatomy and Physiology of Data Science” drew quite a lot of interest. But I must say, I did enjoy getting to stand with Mark Parsons at our poster “Why Data Citation Misses the Point” (I will add that to the website) and elaborate on our premise. Interestingly, we had a lot of agreement with the work, when we’d hoped to provoke arguments (!! as usual !!). Now to find time to write that up.

I want to acknowledge the excellent presentation of the other works I was co-author on. The TWCers noted above are indeed skilled and knowledgeable researchers and practitioners. I know that, but it is always excellent to have peers approach me to say so, and to tell me how impressed they are with both the work and the people!

And the RFID issue – just go here and see for yourselves: http://petitions.moveon.org/sign/say-no-to-rfid-tracking.fb47

See all of you next December.

 

Categories: Data Science, Semantic Web, tetherless world

American Geophysical Union Informatics

December 22nd, 2014



Revised article posted: Is Data Publication the Right Metaphor

September 19th, 2012

After a few gyrations and reading 70 pages of reviews, Mark (taking the lead) and I have now re-submitted our paper to the Data Science Journal. We’d like the conversation to continue and I’ll be posting follow-up material here. See http://mp-datamatters.blogspot.com/2012/09/revised-metaphor-paper-submitted.html

Categories: tetherless world

Is Data Publication the right metaphor?

December 15th, 2011

http://mp-datamatters.blogspot.com/2011/12/seeking-open-review-of-provocative-data.html

Categories: tetherless world

Why the term ‘data publication’?

December 14th, 2010

Over the last 6 months I have been present in at least 10 distinct discussions around topics such as data publication, data citation and data attribution. At first I was engaged in the topics, but very quickly I kept pausing and asking myself: what’s the use case (duh!)? What I was hearing was coming from ‘data people’ (yes, I am one of them). What I wanted to hear was: “I want to be cited for the datasets I spend a lot of time and intellectual effort collecting, calibrating and analyzing”, or “… really I want to get credit for that as much as the one or two publications I might get”. I’ve heard this; in fact I’ve said it myself many times.

So what’s the problem? Well, when a researcher wants credit and citation for a piece of work, they prepare and publish a paper, yes, a body of intellectual work. Our communities and disciplines have spent many centuries developing this approach. So, if what I really want is credit and citation for my data, why do I need to publish it? At present, many people are getting such credit, but in an informal way, such as a narrative-level acknowledgement in the text of the paper, and not a formal one (Parsons, Duerr and Minster 2010 EOS). That’s as good as no acknowledgement unless someone sees it and records it somewhere. The mechanism for paper citation is now well established: I cite your paper in my paper and your citation count increases and gets reported. If you are up for promotion or tenure or review and that count is taken into account, you get credit. It’s the identification of the artifact that counts, not the fact that it is published. In short, the capability that is needed is a way to identify your data contribution and a way to record it (and thus count it). Identification and reference, that’s it.

Now, I am not writing about publishing ‘publication data’, i.e. the data that is the foundation for figures, tables, and other descriptions in a published paper. I am all for that data being made available as a part of the publication. That is also another story. I am addressing just regular data (collections/sets).
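
To make “identification and reference” a little more concrete, here is a minimal sketch in Python of what I mean by a way to identify a data contribution and a way to record (and thus count) references to it. Everything in it is hypothetical: the CitationRegistry class, the identifier strings and the paper names are made up for illustration and do not describe any existing service.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class CitationRegistry:
    """Hypothetical registry mapping a dataset identifier to the works that reference it."""

    # identifier -> set of citing works (a set, so a repeated reference counts once)
    references: dict = field(default_factory=lambda: defaultdict(set))

    def record(self, dataset_id: str, citing_work: str) -> None:
        """Record that citing_work references the dataset identified by dataset_id."""
        self.references[dataset_id].add(citing_work)

    def count(self, dataset_id: str) -> int:
        """Return how many distinct works reference the dataset (0 if none recorded)."""
        return len(self.references.get(dataset_id, ()))


registry = CitationRegistry()
registry.record("dataset:example-0001", "paper-A")  # made-up identifiers
registry.record("dataset:example-0001", "paper-B")
registry.record("dataset:example-0001", "paper-A")  # same paper again, not double counted
print(registry.count("dataset:example-0001"))
```

Note that nothing here requires the dataset to be “published”; all that is needed is a stable identifier and somewhere to record the references.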

For now I am suggesting that there are other models for making data available to start with, and one of them is the software release cycle/process. Alpha, pre-beta, beta, release candidate, release, revision, documentation, feedback, bug fixes… that is more like the process for data that I know. Now, this may not be the right approach, but I think we should explore it, and others. I’m no longer in favour of just adopting a model (marriage) of convenience (publishing). We are savvy enough to take a step back and implement a model that meets the needs of the data scientists who deserve it most. Yes, there’s more to be said. Tag, you’re it.
