Provenance Meeting 100901

From Tetherless World Wiki

Provenance Agenda

previous meeting

  • AeroStat - update

next meeting

Attendance

Past Action Items

  • James to post current slides (pdf), or link to (UNKNOWN)

New Action Items

  • SZ: Add new use cases for SPCDIS to the Use Case page on the wiki
  • SZ: Also check whether the use cases on that page still make sense or need to be removed
  • SZ & PW: new vsto sparql use cases (or use cases that otherwise require a persistent triple-store)
  • SZ & DLM: observer log comment hierarchy discussion with Deborah
  • SZ: organize new observer comment query use cases and requirements for a student activity
  • SZ: put all my new observer comment PML on eScience
  • James Michaelis: to post current slides (pdf), or link to
  • SZ, DLM, James: Review comment hierarchy
  • Patrick needs to push new version of VSTO to incorporate Cynthia's changes
  • Patrick and Cynthia will work to open up the port on tw1 for at least access within VPN

Meeting Notes

Fun with technology

  • MediaWiki for meeting information
  • Dimdim for telecon
  • titanpad

Introductions

  • Mandeep - Masters Student
  • Arun - Working with Professor Peter
  • Joanne - Research Associate Professor - http://tw.rpi.edu/wiki/Joanne_S._Luciano
  • Deborah McGuinness
  • Peter Fox - Lead on 4 science provenance projects (on wiki page)
  • Jim McCusker - 2nd year, bioinformatics, doing cancer research and National Cancer Institute work; trying to get a consistent representation of provenance, experimental artifacts, and the history of data
  • Cynthia Chang - Research Staff, Inference Web, PML, Provenance

This is our bi-weekly meeting going over science provenance activity

Transparency and Trust

  • Also fitness of use
  • expose assumptions, caveats, etc. in processing

Fragmentation, disconnection, and encapsulation are bad for transparency

information/data models describe a focused domain

  • used in isolation, a model's expressivity is limited
  • information models often have some overlap
  • construct an integrated information model that utilizes multiple specialized domain models
  • the integrated information model is more expressive than the sum of its parts

Spectrum of a provenance ecosystem

  • explanation
  • justification
  • verifiability
  • proof
  • trust

All can be considered part of the all-encompassing notion of 'Transparency'

SPCDIS Update

  • Stephan and James met with MLSO scientists
  • James went over work on role based presentation of information
    • visualization is based on the expertise of the user and how much they want to see
  • Stephan has scripts that generate PML from 5 years of observer logs
  • Two different justifications for each comment
    • this information was taken from the observer log
    • the user made this comment at a certain time
  • Mockup at the bottom of http://tw.rpi.edu/portal/SPCDIS_Workgroup_-_Visualization_and_User_Interaction
  • Stephan has comments encoded in PML.
  • Looking at putting (the PMLJ) in a triple store
  • A few new use cases:
    • find all comments that mention a particular text string
    • give me all the PICS comments that mention a particular text string
  • To do: Deborah might review the comment hierarchy with Stephan / James
  • Two use cases
    • "Give me all the MkIV problem comments in 2001 which mention 'tophat'"
    • "Give me all the PICS comments in 2001 that mention 'occulter'"
  • Couple more use cases to think about:
    • When did the occulter change? (Stephan calls this a "base" use case)
    • What occulter was used during this image's observation? (When was the last occulter change event? What was the occulter changed to?)
    • Nice example in the comment log that shows a disconnected correction comment: "The two reports above should be north WEST not north east as reported"
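
The comment-query use cases above could be prototyped in memory before anything goes into a triple store. The following is only a rough sketch under stated assumptions: the ObserverComment record, its field names, the category labels ("problem", "change"), and the sample entries are all invented for illustration and do not reflect the actual PML encoding or the real observer logs.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class ObserverComment:
    """Hypothetical flattened view of one observer-log comment."""
    timestamp: datetime
    instrument: str   # e.g. "MkIV" or "PICS"
    category: str     # assumed labels such as "problem" or "change"
    text: str


def find_comments(comments: Iterable[ObserverComment], year: int,
                  instrument: str, mention: str,
                  category: Optional[str] = None) -> List[ObserverComment]:
    """Comments for one instrument and year whose text mentions a string."""
    needle = mention.lower()
    return [
        c for c in comments
        if c.timestamp.year == year
        and c.instrument == instrument
        and (category is None or c.category == category)
        and needle in c.text.lower()  # case-insensitive substring match
    ]


def last_mention_before(comments: Iterable[ObserverComment], when: datetime,
                        mention: str,
                        category: Optional[str] = None) -> Optional[ObserverComment]:
    """Most recent matching comment at or before `when`, or None."""
    needle = mention.lower()
    candidates = [
        c for c in comments
        if c.timestamp <= when
        and needle in c.text.lower()
        and (category is None or c.category == category)
    ]
    return max(candidates, key=lambda c: c.timestamp, default=None)


# Invented sample entries, for illustration only.
SAMPLE_LOG = [
    ObserverComment(datetime(2001, 3, 4, 18, 30), "MkIV", "problem",
                    "Tophat misaligned, restarted scan"),
    ObserverComment(datetime(2001, 5, 10, 17, 0), "PICS", "change",
                    "Occulter changed to spare unit"),
    ObserverComment(datetime(2001, 8, 2, 19, 15), "PICS", "problem",
                    "Stray light near occulter edge"),
    ObserverComment(datetime(2002, 1, 9, 18, 0), "MkIV", "problem",
                    "tophat issue again"),
]

if __name__ == "__main__":
    # "Give me all the MkIV problem comments in 2001 which mention 'tophat'"
    for c in find_comments(SAMPLE_LOG, 2001, "MkIV", "tophat", category="problem"):
        print(c.timestamp, c.text)
    # "When was the last occulter change event?" relative to an image time
    print(last_mention_before(SAMPLE_LOG, datetime(2001, 9, 1), "occulter",
                              category="change"))
```

The "what occulter was used during this image's observation" use case then reduces to calling last_mention_before with the image's observation time. A disconnected correction comment like the one quoted above would not be picked up by this kind of text matching, which is one reason the comment hierarchy may need to link corrections to the entries they amend.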

CSIRO

  • W3C Sensor ontology (http://www.w3.org/2005/Incubator/ssn/)
  • Developing a water sensor network in Tasmania
  • 6-7 people
  • 1 year intensive project
  • Patrick and Stephan to go there for 2 weeks sometime in October
  • Currently missing the provenance information
  • They have WaterML and SensorML

New work with MDSA

  • Introducing more science terminology
  • New types of information that can be exposed to the user
  • More information that can be expressed in the information model

DQSS

  • Went through a release of the data quality screening mechanism
  • Last couple weeks we've been trying out the web interface
  • Users want data screened to a particular level
  • Want to bring more domain knowledge into the provenance information, e.g. grouping information by certain concepts or parameters in the science domain
  • Looking into using SWEET
  • http://mirador-ts1.gsfc.nasa.gov/
    • Only a few data products utilize the semantic information (AIRX2RET, AIRH2RET and AIRS2RET), and only for Version 5 of those products (not Version 3).
  • Will be expanding to include more data products
    • Not yet using PML
    • Chance to generate a ton of provenance information

Triple Stores

  • Provenance
    • PMLp triple stores upgraded to new version of ???, loaded in, and can do a search just fine
    • Looking to get PMLp that Stephan is generating to load into triple store
    • Modifying the PML API to work against the triple stores
    • Yes ... eventually
      • And will be able to
    • Might want to get a student to work on that, to put all of the PML into the triple store
  • eScience
    • Putting VSTO knowledge base into triple store as well and testing
    • Action Item: SZ - generate research VSTO use cases

AeroStat

  • Sharing mindmap from Greg L. of GSFC
  • Categorize the information and match it up with elements in the ontology
  • Some might be internal provenance, other information external provenance

AGU

  • Go ahead and work on the abstract presented at IPAW
  • James can write abstracts for both of his ideas and submit both; we'll find another first author
  • Abstract information should go to http://tw.rpi.edu/portal/AGUFall10_Abstracts