MBVL CyberSEES Kick-Off Meeting

10 December 2015, All-Hands Project Kick-Off WebEx

Attending: Peter Fox, Heidi Sosik, David Mark Welch, Stace Beaulieu, Joe Futrelle, Stephan Zednik, Benno Lee

Communication Plan

  • Monthly All-Hands telecon; request to set up for mid-Jan. through mid-May.
  • Stace proposes a technical small-group Hackpad every 2 weeks (after 4 Jan.), noting that Stephan is in the Denver time zone, and including Andy V. Stace ACTION initial meeting with Andy V. on 17 Dec.; start with Skype to help Stephan and Benno get up to speed (after 4 Jan.). Stace will also invite Han Wang from RPI, who worked with MBL before (including on VAMPS in 2014), to a tech discussion.
  • Stace set up webpage for Meeting Notes on TWC MBVL website; Stace ACTION show Joe, Heidi, David, Andy how to navigate/use MBVL Drupal site
  • In-person meeting at Woods Hole: Peter could come down to WH on Tues. afternoon and meet Weds./Thurs. (he teaches Tues. and Fri.) [NOT Feb 24/25 because of Ocean Sciences] [aim for March 2016]

Work Plan

Peter ACTION incorporate today’s discussion into the draft Project Plan and then send it out to co-PIs


  • Decision for 1st yr project plan: focus on objectives 1,2,4 right away, keep 6 in mind, 3 and 5 later
  • Obj 1 Data access; Joe is Point of Contact (POC); Stephan will be tech lead on the RPI end.

Joe and Stephan to connect on computational resources (e.g., Vagrant was mentioned).
Need to start working with VAMPS data. Heidi: Kristin Hunter-Cevara has some high-throughput data for prokaryotes in VAMPS [collaboration with Anton Post], plus Emily Brownlee’s high-throughput eukaryote data that could be in VAMPS.

  • Obj 2 Data products; Stace is POC, and POC for DMP; Benno at RPI.

Data products are a superset that contains the indicators.
Benno has experience with derived data products

  • Obj 3: POC not yet identified

Start with use cases for the indicators themselves, then use cases for models. Keep modeling in mind as we decide on the use cases that will determine which biodiversity indicators to focus on.

  • Obj 4 Traceable workflows; Peter is POC; Stace at WHOI;

Everyone needs to contribute, e.g., knowledge of dependencies.
Joe: the current IFCB workflow does not have standardized provenance because there is no end-user case (yet) for dealing with that provenance. Heidi has an end-user need to better track the versioning of data processing in her workflow. Stace gave a visual example of the impact of provenance ("fireworks" PROV-ES diagram). Heidi wants provenance in order to create new workflows based on previous, traceable workflows. Peter asks what question you would ask of that provenance. Example question: if I have an image with a particular classification, which classifier was applied, and which features were input into that classifier? It would help if she could run more than one classifier at the same time; she can't run both live right now. Without provenance, we might construct composite indicators incorrectly (mixing apples and oranges).
Peter commented on internal vs. external provenance.
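
The provenance question above (for a given image, which classifier was applied and which features went into it) can be sketched in Python. This is only an illustration under stated assumptions, not the IFCB workflow or a PROV-ES implementation: the record fields, the classifier names (clf-A, clf-B), the feature names, and the helper functions are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class ClassificationRecord:
    """One classification of one image, with enough provenance to audit it."""
    image_id: str
    label: str
    classifier: str            # hypothetical classifier name
    classifier_version: str    # hypothetical version tag
    features: Tuple[str, ...]  # names of the features fed to the classifier


def provenance_for(records: List[ClassificationRecord],
                   image_id: str) -> List[ClassificationRecord]:
    """Which classifiers labeled this image, and from which features?"""
    return [r for r in records if r.image_id == image_id]


def classifiers_used(records: List[ClassificationRecord]) -> Set[Tuple[str, str]]:
    """Distinct (classifier, version) pairs behind a set of classifications."""
    return {(r.classifier, r.classifier_version) for r in records}


def safe_to_combine(records: List[ClassificationRecord]) -> bool:
    """True only if every record came from the same classifier version."""
    return len(classifiers_used(records)) <= 1


# Made-up records: img-001 was labeled by two different classifiers.
records = [
    ClassificationRecord("img-001", "diatom", "clf-A", "v1", ("area", "perimeter")),
    ClassificationRecord("img-001", "ciliate", "clf-B", "v2", ("area", "texture")),
    ClassificationRecord("img-002", "diatom", "clf-A", "v1", ("area", "perimeter")),
]

for rec in provenance_for(records, "img-001"):
    print(rec.label, rec.classifier, rec.classifier_version, rec.features)

# Guard against "apples and oranges" before building a composite indicator.
print(safe_to_combine(records))
```

Keeping one record per (image, classifier) pair lets two classifiers be compared on the same image, and the safe_to_combine check is one simple way to flag a composite indicator that would mix outputs from different classifier versions.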

  • Obj 5: POC not yet identified

Can't start explicitly yet, but keep in mind, as with Obj 3.
E.g., GCIS (Stace ACTION show GCIS to Joe, Heidi, David)
Peter ACTION write up his vision of what he would like in a knowledge base for this project. Joe linked this to the external provenance of concepts developed in this project (Stace ACTION check into this again, existing vocabularies for biodiversity indicators)

  • Obj 6 Broader impacts; POC All Hands

Peter will record some classes of his Data Science course at RPI this semester. Joe and Stace ACTION ‘sit in’ on this semester’s course as possible. Heidi says 3 million annotated IFCB images are available for student projects.
Sounds like Benno’s thesis involves data and software versioning, and this could be a case study in the last year of his PhD research; Heidi suggests carving out a component that could be part of his thesis.

At Woods Hole
Stace ACTION write up a research opportunity to help Peter/Benno recruit an RPI undergrad to apply to SSF; can Peter advertise in his Data Science course? Stace ACTION check into PEP deadlines; could share the blurb that we write for RPI. Stace ACTION share with David, too, for MBL undergrad opportunities. Heidi/Stace can also consider David’s STAMPS because it has ~1 hr lectures.

Stace put up a list of opportunities for dissemination (click on MBVL Calendar link), e.g., MBON, GEOBON, IMBER, LTER, possibly Coastal Zone (had an ECO-OP session a few years ago); perhaps consider a session around the theme of a project.


Peter and Joe will work together to scope innovations in response to PM’s; can incorporate some of Peter’s work with data analytics in his Data Science course.

Agenda from email 3 December 2015

1. Revisit project objectives and innovations, especially to determine priorities for Year 1.
2. Discuss aspects of the project plan in relation to how many tasks there are, who is nominal lead and who participates in the task, and then what resources are needed.
3. Review our Collaboration Plan and decide how often to "meet" (whole group, and task groups), including all-hands.
4. Obtain preliminary agreement on work in the next 3-6 months.
5. Identify dissemination opportunities (conferences, workshops) where we can begin to engage communities.
6. Logistics: email list, project pages, computational infrastructure, software/tools, places for prototypes/demos, etc.
7. Review action items.