{{#smartset:start date=April 9, 2009}}
{{#smartset:end date=April 10, 2009}}
{{#smartset:sponsor=MBLWHOI Library,Jewett Foundation|list}}

==Login==
location: http://tw.rpi.edu/portal/Jewett_Meeting_at_MBL

'''Shared wiki login account''': Jewett (password: please contact baojie@cs.rpi.edu and dingl@cs.rpi.edu)

'''Create your own account''':
* Please go to http://tw.rpi.edu/proj/portal.wiki/index.php?title=Special:UserLogin&type=signup
* Fill out the form and set "Your domain" to "'''local'''"
* Go back to the [[Jewett Meeting at MBL]] page

For a brief wiki editing tutorial, see [http://www.youtube.com/watch?v=6gbMNhnl1SU] (YouTube, 4 minutes)

==Attendees==
{| border=1 class="sortable"
! name
! email
! affiliation
|-
| Alice Orton || aorton@usgs.gov || usgs
|-
| Andy Maffei || amaffei@whoi.edu || whoi
|-
| Anthony Goddard || agoddard@mbl.edu || mbl
|-
| Arcot Rajasekar || rajaseka@email.unc.edu || unc
|-
| Art Gaylord || agaylord@whoi.edu || whoi
|-
| Arthur Newhall || anewhall@whoi.edu || whoi
|-
| Bob Groman || rgroman@whoi.edu || whoi
|-
| Cathy Norton || cnorton@mbl.edu || mbl
|-
| Cyndy Chandler || cchandler@whoi.edu || whoi
|-
| Deborah McGuinness || dlm@cs.rpi.edu || rpi
|-
| Diane Rielinger || drielinger@mbl.edu || mbl
|-
| Ed Urban || ed.urban@scor-int.org || scor
|-
| Gary Miller || gmiller@usgs.gov || usgs
|-
| Holly Miller || hmiller@mbl.edu || mbl
|-
| Jennifer Schopf || jms@nsf.gov || whoi/nsf
|-
| Kerstin Lehnert || lehnert@ldeo.columbia.edu || columbia
|-
| Li Ding || dingl@cs.rpi.edu || rpi
|-
| Lisa Raymond || lraymond@whoi.edu || whoi
|-
| Patrick West || westp@rpi.edu || rpi
|-
| Peter Fox || pfox@cs.rpi.edu || rpi
|-
| Peter Wiebe || pwiebe@whoi.edu || whoi
|-
| Ryan Schenk || rschenk@mbl.edu || mbl
|-
| Stephen Miller || spmiller@ucsd.edu || ucsd
|-
| Stephan Zednik || zednis@rpi.edu || rpi
|-
| Tom Moritz || moritz@archive.org || internet archive
|-
| Vicki Ferrini || ferrini@ldeo.columbia.edu || columbia
|}

==Agenda==
===Thursday, April 9===
Campfire chat room: <s>https://mblwhoilibrary.campfirenow.com/37e51</s>

Raw Campfire transcripts: [[Media:09April2009chat.doc | April 9th Chat]] [[Media:10April2009chat.doc | April 10th Chat]]

2:00 pm Pre-Conference: Practicum Team meets and goes over their experience / Carriage House (team members only)

3:30 pm Shuttle Service from Inn on the Square to Jonsson Center

3:45 pm Shuttle Service from Inn on the Square to Jonsson Center

===Conference Starts===
4:00 pm Coffee/Tea, Workshop begins: Carriage House

[[image:mbl20090409seat.png|400px]]

4:00-5:00 pm Keynote by Deborah McGuinness (RPI)

5:00-5:30 pm [[Media:Cchandler DPW 090409.ppt|Challenges]] by Cyndy Chandler (WHOI)

5:30-6:00 pm Goals - discussion Andy Maffei (WHOI)/Cathy Norton (MBLWHOI Library)
* focus is only on data behind a published journal article
* examine the attribution stream for this data: how is it cited?
* examine where the metadata about this data is stored
* where do you store the data?
* what metadata is required around the metadata?
6:00 pm Cocktails and Dinner - Main House

===Friday, April 10th===
7:30 am Shuttle Service from Inn on the Square to Jonsson Center

7:45 am Shuttle Service from Inn on the Square to Jonsson Center

7:30-8:30 am Breakfast at Jonsson Center / Main House

'''Jonsson Center / Carriage House'''

8:30-9:00 am [[Media:Jewett_powerpoint.ppt|Data Library]] by Lisa Raymond (MBLWHOI Library)

9:00-9:30 am [[Media:WHOI-provenance.ppt|Persistent Archives: Long Term Sustainability of data based on policy and data virtualization]] by Arcot Rajasekar (UNC)

9:30-10:00 am [[Media:Schopf_data_April_2009.ppt|NSF Office of CyberInfrastructure : What Are We Thinking About Data]] by Jennifer Schopf (NSF)

10:00-10:30 am Break

10:30-Noon Practicum - Use Cases

Noon Lunch - Jonsson Center / Main House

1:00-1:30 pm [[Media:Jewett_Fox20090409_Standards.ppt|Data Standards, Better Practices: US and others]] by [[Peter Fox]] (RPI)

1:30-3:00 pm Use cases continued, followed by breakouts if necessary

3:00-3:30 pm Break

3:30-6:00 pm Consensus on Best Practices, and work on white paper resulting from discussions

6:00 pm CLAMBAKE at Jonsson Center / Main House

===Meeting Notes and Slides===
[[Data Library by Lisa Raymond (MBLWHOI Library)]]

[[Persistent Archives: Long Term Sustainability of data based on policy and data virtualization by Arcot Rajasekar (UNC)]]

===Post-meeting Documents===
[[Consensus on Best Practices|Transcribed Easel Sheets]] from Best Practices discussion

[[Media:Draft_World_Data_Center_Certification_Criteria.pdf| Draft World Data Center Certification Criteria]]

[[Media:Newhall_jewett_data_workshop.pdf|Arthur Newhall's Observations]]

==Use Cases==
===Use Cases===
* UseCase #1 for Group Discussion: A scientist wants to find all tables and figures in papers published on the SW06 dataset that have sound speed profiles in them.
* UseCase #2 for Practicum Exercise: A scientist wants to publish the data associated with the article he is submitting on Acoustic Properties of Salpa thompsoni to a journal. What steps does he need to take, and what information does he have to collect about this data, in order to submit this information to the publisher?
<blockquote>
NOTE: This is a real use case. We have an example of the steps Peter Wiebe took to do this, and the products will be available for workshop participants: Peter Wiebe's article, Acoustic properties of Salpa thompsoni, for which Neil Sarkar and Holly created Dublin Core metadata, with separate metadata for the text and each figure and table.
</blockquote>

===template===
[[media:Template for data review 2 Cases 11March2009.pdf|Template for data review]]

===Generic Data Pipeline===
[[Media:dataflow20080603.pdf|An example of a general data pipeline]]

===data===
[[media:CTD085.txt|A link to backbone data for Table 2]]

[[media:CTD087.txt|A link to backbone data for Table 2]]

[[media:Salp38_1-selection_inner part_2.xls|A link to backbone data for Table 6]]

[[media:DSC02225.JPG|A link to backbone data for Figure 3]]

[[media:Salp200-1_selection_inner part_2.xls|A link to backbone data for Figure 7]]

'''summary'''
{{#ask: [[relation::{{PAGENAME}}]] |?category |?description |?download}}
*[[JMBL20090410 Example Table 2]]
*[[JMBL20090410 Example Figure 3]]

===Supplementary Use Cases===
'''UseCase A''' A paper is to be published in DSR II and the author needs to know how to reference the data that are available online. As the data manager, I need to know whether I need to do anything differently in how the source data are documented and served (additional metadata? persistent identifiers?).

The paper (published 2008 in DSR II):

Qian P. Li, Dennis A. Hansell, Nutrient distributions in baroclinic eddies of the oligotrophic North Atlantic and inferred impacts on biology, Deep Sea Research Part II: Topical Studies in Oceanography, Volume 55, Issues 10-13, Mesoscale Physical-Biological-Biogeochemical Linkages in the Open Ocean: Results from the E-FLUX and EDDIES Programs, May-June 2008, Pages 1291-1299, ISSN 0967-0645

DOI: http://dx.doi.org/10.1016/j.dsr2.2008.01.009

URL: http://www.sciencedirect.com/science/article/B6VGC-4SFR7MF-5/2/b08137059737fef3a654b2fd7897d4fb

that references data that are available online from BCO-DMO: http://osprey.bco-dmo.org/project.cfm?flag=viewd&id=13

The likely source data objects for the paper are listed below: http://ocb.whoi.edu/jg/serv/OCB/EDDIES/INVENTORY.html1

{| border=1
! Measurement
! PI_name
! Data object URL
|-
| colspan=3 | '''OC404-1'''
|-
| bottle (merged) || OCB_DMO || http://ocb.whoi.edu/jg/serv/OCB/EDDIES/OC404_S1/bottle_OC404_S1.html0
|-
| bottle oxygen || Bates || http://ocb.whoi.edu/jg/serv/OCB/EDDIES/OC404_S1/oxygen.html1
|-
| nM NO3/PO4 || Hansell || http://ocb.whoi.edu/jg/serv/OCB/EDDIES/OC404_S1/nuts_low.html0
|-
| DOC; DON; DOP || Hansell || http://ocb.whoi.edu/jg/serv/OCB/EDDIES/OC404_S1/organic_matter.html0
|-
| del15N (PON) || Hansell || http://ocb.whoi.edu/jg/serv/OCB/EDDIES/OC404_S1/del15N-PON.html0
|-
| colspan=3 | '''WB0409'''
|-
| Niskin bottle samples || Bates || http://ocb.whoi.edu/jg/serv/OCB/EDDIES/WB0409/bottle.html0
|-
| bottle oxygen || Bates || http://ocb.whoi.edu/jg/serv/OCB/EDDIES/WB0409/bottle.html0
|-
| DOC; DON; DOP || Hansell || http://ocb.whoi.edu/jg/serv/OCB/EDDIES/WB0409/organic_matter.html0
|-
| del15N (PON) || Hansell || data not contributed
|-
| colspan=3 | '''OC404-4'''
|-
| bottle file (base) || McGillicuddy || http://ocb.whoi.edu/jg/serv/OCB/EDDIES/OC404_S2/bottle.html0
|-
| bottle oxygen || Bates || http://ocb.whoi.edu/jg/serv/OCB/EDDIES/OC404_S2/oxygen.html1
|-
| nM NO3/PO4 || Hansell || http://ocb.whoi.edu/jg/serv/OCB/EDDIES/OC404_S2/nuts_low.html0
|-
| DOC; DON; DOP || Hansell || data not contributed
|-
| del15N (PON) || Hansell || http://ocb.whoi.edu/jg/serv/OCB/EDDIES/OC404_S2/del15N-PON.html0
|-
| colspan=3 | '''WB0413'''
|-
| Niskin bottle samples || Bates || http://ocb.whoi.edu/jg/serv/OCB/EDDIES/WB0413/bottle.html0
|-
| bottle oxygen || Bates || http://ocb.whoi.edu/jg/serv/OCB/EDDIES/WB0413/bottle.html0
|-
| DOC; DON; DOP || Hansell || data not contributed
|-
| del15N (PON) || Hansell || data not contributed
|-
| colspan=3 | '''OC415-1'''
|-
| bottle file (base) || McGillicuddy || http://ocb.whoi.edu/jg/serv/OCB/EDDIES/OC415_S1/bottle.html1
|-
| nM NO3/PO4 || Hansell || http://ocb.whoi.edu/jg/serv/OCB/EDDIES/OC415_S1/nanoNutrients.html0
|-
| DOC; DON; DOP || Hansell || data not contributed
|-
| del15N (PON) || Hansell || data not contributed
|-
| colspan=3 | '''WB0506'''
|-
| Niskin bottle samples || Bates || http://ocb.whoi.edu/jg/serv/OCB/EDDIES/WB0506/bottle.html0
|-
| bottle oxygen || Bates || http://ocb.whoi.edu/jg/serv/OCB/EDDIES/WB0506/bottle.html0
|-
| DOC; DON; DOP || Hansell || data not contributed
|-
| del15N (PON) || Hansell || data not contributed
|-
| colspan=3 | '''OC415-2'''
|-
| bottle file (base) || Ledwell || http://ocb.whoi.edu/jg/serv/OCB/EDDIES/OC415_T1/bottle.html1
|-
| colspan=3 | '''OC415-3'''
|-
| bottle file (base) || McGillicuddy || http://ocb.whoi.edu/jg/serv/OCB/EDDIES/OC415_S2/bottle.html1
|-
| nM NO3/PO4 || Hansell || http://ocb.whoi.edu/jg/serv/OCB/EDDIES/OC415_S2/nanoNutrients.html0
|-
| DOC; DON; DOP || Hansell || data not contributed
|-
| del15N (PON) || Hansell || data not contributed
|-
| colspan=3 | '''WB0508'''
|-
| Niskin bottle samples || Bates || http://ocb.whoi.edu/jg/serv/OCB/EDDIES/WB0508/bottle.html0
|-
| DOC; DON; DOP || Hansell || data not contributed
|-
| del15N (PON) || Hansell || data not contributed
|-
| colspan=3 | '''OC415-4'''
|-
| bottle file (base) || Ledwell || http://ocb.whoi.edu/jg/serv/OCB/EDDIES/OC415_T2/bottle.html1
|}

'''UseCase B''' A scientist has found a sound profile represented as a graph in a paper that he feels justifies a hypothesis he has put forward. He wants to get access to the original sensor data related to that sound profile. How does he do this?

'''UseCase C''' A scientist has 10,000 images on slides sitting on his shelf that represent 10 years of work and that he wants to digitize. How does he get the metadata for data collected in the past, before best practices for metadata were considered? Is it even worth the effort?

'''UseCase D''' A scientist has written a paper with data that s/he would like to publish, but access to the data or use of the data is restricted for some period of time. The publisher has requested that all data represented as figures or tables in this journal be "properly cited" with repository access.

'''UseCase E''' A scientist wants to find all the data associated with a specific harmful algal bloom. He is interested both in original data and in derived data that has been published in articles and deposited. He wants to be able to determine who collected the original data and who analyzed and processed the data. He will then publish a review article that will contain a synthesis of this information. How will he find everything (what metadata, connections, organization will be needed in an 'ideal world')? How will he know who should receive attribution? How will he publish and maintain attribution on his data synthesis product once he publishes it?

==Suggested Preparation Materials for the Meeting==
* National Science and Technology Council Releases Strategy for Digital Scientific Data. The National Science and Technology Council (NSTC) released a report describing a strategy to promote preservation and access to digital scientific data. The report, Harnessing the Power of Digital Data for Science and Society, was produced by the NSTC's Committee on Science under the auspices of the Office of Science and Technology Policy (OSTP) in the Executive Office of the President.
* The open and timely publication of digital scientific data called for in the report will ... More at http://www.nsf.gov/news/news_summ.jsp?cntn_id=114448&govDel=USNSF_51
* Survey of data provenance techniques. Technical Report IUB-CS-TR618 http://www.cs.usask.ca/faculty/sal426/Provenance/docs/Literature%20Review/TR618.pdf
* ICSU Ad Hoc Strategic Committee on Information and Data http://www.icsu.org/Gestion/img/ICSU_DOC_DOWNLOAD/2123_DD_FILE_SCID_Report.pdf
* Sudha Ram, Jun Liu. Understanding the Semantics of Data Provenance to Support Active Conceptual Modeling http://en.scientificcommons.org/41046974
* Fox, McGuinness, Pinheiro da Silva. Knowledge Provenance in Virtual Observatories: Applications to Image Data Pipelines, 2008. http://data.semanticweb.org/conference/iswc/2008/paper/poster_demo/70/html
* Pinheiro da Silva, McGuinness, McCool. Knowledge Provenance Infrastructure. http://en.scientificcommons.org/685801
* Clifford Lynch. "The Shape of the Scientific Article in the Developing Cyberinfrastructure," CTWatch Quarterly (August 2007) http://www.ctwatch.org/quarterly/articles/2007/08/the-shape-of-the-scientific-article-in-the-developing-cyberinfrastructure/
* Skills, Role & Career Structure of Data Scientists & Curators: Assessment of Current Practices & Future Needs. JISC Report 2008. http://www.jisc.ac.uk/publications/publications/dataskillscareersfinalreport.aspx
* Baker, Barton, Peterson, Fox. Informatics and the 2007-2008 Electronic Geophysical Year. EOS, Transactions, American Geophysical Union 89(48) 2008. http://www.agu.org/pubs/crossref/2008/2008EO480001.shtml (subscription)
* Gomes, Graybeal and O'Reilly. Data Management Issues in Operational Ocean Observatories. Sea Technology 48(5) p.17-20, 2007 http://www.highbeam.com/doc/1P3-1284688471.html (subscription)
* Altman and King. A Proposed Standard for the Scholarly Citation of Quantitative Data http://gking.harvard.edu/files/cite.pdf
* Trustworthy Repositories Audit & Certification: Criteria and Checklist http://www.crl.edu/PDF/trac.pdf
* SCOR/IODE Workshop on Data Publishing, Oostende, Belgium, 17-19 June 2008. UNESCO, 2008. IOC Workshop Report No. 207. http://www.iode.org/index.php?option=com_oe&task=viewDocumentRecord&docID=2457
* Standards for DATA
** A Proposed Standard for the Scholarly Citation of Quantitative Data http://gking.harvard.edu/files/cite.pdf
** ISO 8000 (under development) http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=50801
*** ISO 8000 - A Standard for Data Quality, by Grantner, Emily
*** Solving Data Quality Problems Using Data Standards, by de Jager, Salomon
** ISO 19115 http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=26020
*** ISO 19115:2003 defines the conceptual model required for describing geographic information and services. It provides information about the identification, the extent, the quality, the spatial and temporal schema, spatial reference, and distribution of digital geographic data.
** ISO/IEC 11179 http://metadata-standards.org/11179/
*** This standard addresses the semantics and representation of data, and the registration of descriptions of that data. The standard has strong international backing and is freely available. ISO/IEC 11179 specifies the kind and quality of metadata necessary to describe data, and it specifies the management and administration of that metadata in a metadata registry (MDR). It applies to the formulation of data representations, concepts, meanings, and relationships between them to be shared among people and machines, independent of the organization that produces the data. It does not apply to the physical representation of data as bits and bytes at the machine level. In ISO/IEC 11179, metadata refers to descriptions of data. ISO/IEC 11179 does not contain a general treatment of metadata. ISO/IEC 11179-1:2004 provides the means for understanding and associating the individual parts of ISO/IEC 11179 and is the foundation for a conceptual understanding of metadata and metadata registries.