Semantic eScience Meeting April 6, 2012



General Meeting Information


  • Scribe - Katie!
  • Past Action Items
  • Update
  • discussion topic
    • UIMA discussion
    • ontology annotations - documenting the ontology
    • Nature Linked Data?


  • Eric
  • Patrick
  • Katie
  • Thiru
  • Han
  • Le
  • Peter
  • Jay
  • Joanne
  • Deborah
  • Massimo
  • Amruta

Past Action Items

  • SZ: Send link to distributed Jena queries to Eric (done)
  • ER: UIMA presentation in 2 weeks (or find someone to present) (next time?)
  • Austin: presentation into SeSF working group? Somewhere on Drupal - done - sent to Patrick after meeting

Action Items

  • PW: Post Austin's slides on OODT (DONE)
  • PW: Organize tasks around OODT and Ramada with Thiru and Austin
    • what is the underlying schema, if there is one
    • are there extensions to be able to generate provenance
    • grab some data from HAO (COMP, MLSO, CEDAR) and run it through the systems, compare capabilities, etc...


  • Eric: No discussion of UIMA this week - do we have an idea of how we'd like to use this before we dig into the technology?


  • same old, same old
  • provenance
  • MetPetDB, BCODMO, VSTO



  • coursework
    • study polymaps for visualizing water bodies by lat/lon
    • develop ontology for citizen scientist extension of the semantic water quality portal
  • ontology review, documentation, annotations
  • looked briefly at CMAPs, not done yet (VSTO)
  • get S2S working on my laptop


  • met with Linyun yesterday about what he's working on and how it could fit into work
  • independent study - informatics and social media
    • DMcG: NIST is interested in a social-media-based infrastructure for projects - keep her updated on this. (has a call this pm)
  • scientific portal generator


  • Looking into OODT - how we can get involved
  • data input, mgt.
  • High Altitude Observatory looking into a new data management system?
  • PW: OODT compared to RAMADDA? Lance Jones is interested in this.
    • maybe Thiru can work on this
    • RAMADDA is a content management system for scientific data (came from UNIDATA).
      • Lead developer is interested in semantics, so we are interested in it from this perspective.


  • VSTO - presentation for opensearch docs using s2s
  • working with Massimo on DPSIR modeling
  • attended S2S interest group.
  • advanced sem tech class - incorporating water's effect on wildlife on semantic water quality portal


  • Looking into UIMA for query parsing. Wrote and tested a few sample annotators.
  • Eric: looked into something more lightweight than UIMA? (Apache Tika)
  • T: yes, but no analysis framework there.
  • Evaluating tools for generating HTML documentation for ontology.
    • Eric: was Evan looking into this? (No, Patrick was)
      • LODE?
  • Deborah: might want to consider same thing for VSTO
  • Thiru has looked at 4 tools, incl. Protege plugin. The results of the analysis of the tools are at . The trac ticket has been updated.


  • updates
    • Talked about testbeds, import/export
    • Patrick set up XML UI on aquarius (which is what we have on live site)
  • Needs to talk to Heidi to get data


  • VSTO ontology review - Eric, Patrick, Stephan will take next stab, then we should have a meeting to review
    • organization is good, documentation needs work
    • a few things need discussion w/ Peter
  • Connecting water quality portal to USGS, so resource managers can use it to make informed decisions.
    • (Peter: data release is a totally different use case for USGS.)
  • Continuing to work on fungal ontology w/ Nathan (Katie, Han). Should have a tech report up in the next week.


  • 0.7 BCO-DMO is released, just need to tag in subversion
  • 0.8 BCO-DMO will be this month for EGU, mid April, Eric has more
  • 1.0 BCO-DMO by June/July. Pushed live via "Advanced Search" tab on their production site, linked data, pages generated by triple store (like our web site)
  • VSTO moving along:
    • Eric has more info (see below)
    • Mockup: (still need to get some of the OpenDAP stuff working on this)
  • Will use same (?) as S2S for guided search.
  • Better documentation, then will start generating pages in VSTO like we do for our website
  • DLM: for documentation?
    • PW: Josh submitted some addtl terms to about datasets.
    • Search engines are starting to use stuff, so we may see positive search results if we use these on our website
    • still need to investigate how the terms Josh submitted compare to what we have in VSTO.
  • By end of this month, will have review done, tie into STOM, ....?....


  • S2S weekly interest group meetings (screencasts will be online soon) - great - does that mean the sessions were captured and just not up yet? yes - yeah!
  • S2S 3.0 is in alpha (BCO-DMO will be first test bed)
  • BCO-DMO 0.7 tagged/released, 0.8 coming by EGU?
    • Hierarchical widgets for SeaVoX vocabularies
    • "Context widgets" for BCO-DMO
  • Making push for IOGDS v 1.2 (removed dependence on legacy S2S 1.0)


  • GOEF - framework for ontology evaluation within the context of a specific use case. Working simultaneously on (a) a SADI interface to fit into DataFAQs, started to work out the interface and stubs, and (b) development of a formal description for a use case document, which would need to be submitted with the ontology (this is being informed by FUSE requirements as well). Amruta is working on this; it is uncharted territory. Proposed a planning meeting and will be seeking funding to use this meeting as a planning meeting for a workshop.
  • Contributing to the Water Portal Extensions for USGS Resource Management
  • Note to Amruta: The BioPortal has a text box where it parses text and makes recommendations for ontologies based on the terms in the text. (just something to look at and think about / try).


  • Reviewed ontology documentation
  • Almost completed ontology review for vsto_all.owl
  • Attending the S2S interest group. This might be helpful for the UI for GOEF
  • Checking the wiki pages for (reviewing them).
  • GOEF
    • Working on formalizing use case documents
      • Broke down the VSTO-CEDAR use case into Function, Standard and Component (problem faced determining the Standard component). Joanne suggested we can refer to the Minimum Information model vocabulary for getting information on the minimum information we need for formalizing use cases, and as an example of the "standard" level
    • Developed UI in HTML
    • SADI Services from DataFAQs: Will touch the topic once the use cases are formalized
  • Working on finalizing use cases for X-Informatics project. Joanne suggested using the pharmacogenic dataset
  • X-Informatics Project
  • Will be meeting Bhardwaj today for knowledge sharing on my current tasks

Next Steps for Next Meeting

  • Make a presentation for GOEF:
  • Points to cover in the presentation
    • Go over framework
    • Current status
    • Approach followed
    • Problems faced


  • Workforce development for data science panel at Research Data Access and Preservation (RDAP) (New Orleans)
  • Planet Under Pressure (PUP) conf and workshop on Future Oceans (Marine Ecosystems)
  • Workshop at London School of Economics on state of oceans


  • Attended Eric's S2S presentation.
  • Possibility to mentor a Google Summer of Code idea on OMAR (OSSIM Mapping Archive) framework for image and video database and online image processing
    • Google Summer of Code project - Adding semantic search to OMAR using S2S engine
  • OMAR - OSSIM mapping archive (Open Source Software Image Mapping)
  • Massimo and Eric will keep in contact.
  • link to OMAR description : -


  • Worked with Simon Cox to resolve a redirection issue with his published O&M ontologies
    • resolved, can now proceed with STOM update to use Simon's ontologies
  • Final push to deploy Quality Label component of MDSA RESTful API
    • implemented and passing tests, but VERY SLOW
    • deployed on aquarius, having issues, will be demoing Monday morning (10-11 AM ET) in project final review meeting


  • Eric will switch IOGDS dependency to gemini
  • Take down Virtuoso
    • Talk to Xian about Orgpedia
    • Talk to Alvaro about his demos (Nature)

Unstructured Information Management Architecture (UIMA)

  • framework with text analytics tools
  • standard natural language processing - entity extraction, etc.
  • Eric: so, we want it just for short query processing? Entity extraction to make a more structured query? We are talking about this in our discovery meetings, don't need to get into too much detail now.
  • DLM: most of these tools were aimed at looking at abstracts, longer docs, not necessarily queries, but they may still work well.
  • Eric: Watson uses UIMA for question (answer) breakdown as well as text corpus analysis?
  • DLM: yes, but answers are structured pretty predictably.


  • NEON non-specialist use case.
  • got feedback from James Wilson, started information model.
  • need to keep working with James, Brian, Heather.
  • Joanne: timeline?
    • PF: project finishes at end of August - want first iteration of dashboard interface by then.
  • Joanne: are we concerned about a time crunch? How long does info modeling take?
    • PF: No, but need to discuss expectations with domain experts. Nobody's working fulltime on info modeling.
      • MDSA extension runs through May? (Stephan thinks so.)
  • Output for information model is concept map (represented in XML)
  • DLM - had talked about using this in semantic e-science in the fall.
    • PF: but it would be done already, so it might be better to use one of their other use cases. But NEON people are pretty slammed - not as much time to serve as domain experts in class projects.

Discussion topics for next week

  • Eric: working discussion topic? we've done 2 tech eval topics.
  • DLM: group discussion on ontology evaluation (?)
  • Joanne: context based use case dev on ontology eval?
  • PF: Amruta's work?