Meeting Notes September 23
From Semantic Portal Wiki
ACTION: Shangguan to look into wireless setup in lab.
Dominic: Wiki page for meetings. Work in progress, any help, questions, design changes would be welcome. ... http://tw.rpi.edu/portal/TW_Group_Meeting ... No set schedule today, so open floor to anyone with announcements.
Jim: Homework assignment for everybody. data.gov has hookup of SPARQL queries to Google visualizer. (Li will post pointer to page.) ... Everyone will find some RDF, "SPARQL it", and create a Google visualization. ... Create a web page for it, and add a link to a TBA wiki page.
Peter: Some people had asked how to get access to TW wiki.
Jim: Everyone should now have an account. Blog and wiki should share password. ... On internal wiki, we need current status of all students. Well defined for CS students (quals, research quals, etc.). ... Need volunteer to set up wiki template.
ACTION: Evan to set up wiki template for student status.
Jim: Other departments (cogsci) have different requirements. ... Please keep your information up to date. Email professors and update wiki.
Dominic: Any other agenda items?
Peter: Building update. Unexpected remodelling of doorway to large downstairs office. ... Told that by beginning of next week all extra desks should be taken out of meeting room. ... By next week, HVAC, power, network issues should be resolved for small downstairs office. ... Questions, come talk to me. ... Request sent to parking office for campus shuttle issues. No response yet.
Shangguan: Anyone who can't connect to wifi, send me the error messages you see.
Jim: Often problems manifest as network not working, but no error message. ... Want us to get 3 "good" routers.
Stephan presents recent provenance work on data science systems. ... Extending into science domain, workflow, provenance. ... NASA Giovanni project. Moved plain XML data into PML. ... "Semantic diff" to highlight interesting differences in inputs to system (like mismatched units of measure).
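To make the "semantic diff" idea concrete, here is a minimal sketch of flagging mismatched units between the inputs of two runs. The input names, units, and the diff_inputs helper are all hypothetical; the notes do not describe how the actual diff is computed, and no tools exist yet.

```python
# Hypothetical sketch: each run is described as {input_name: unit_of_measure}.
# Nothing here comes from PML or Giovanni; it only illustrates the idea.

def diff_inputs(run_a, run_b):
    """Return {input_name: (unit_a, unit_b)} for shared inputs whose units differ."""
    shared = run_a.keys() & run_b.keys()
    return {
        name: (run_a[name], run_b[name])
        for name in sorted(shared)
        if run_a[name] != run_b[name]
    }

run_a = {"temperature": "K", "pressure": "hPa"}
run_b = {"temperature": "degC", "pressure": "hPa"}

mismatches = diff_inputs(run_a, run_b)
for name, (unit_a, unit_b) in mismatches.items():
    print(f"WARNING: {name} is in {unit_a} in one run and {unit_b} in the other")
```

A real semantic diff would presumably work over the provenance graph rather than flat dictionaries; the dictionary form just keeps the example short.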
Li: Recently working on visualizing semantic diffs of PML graphs.
Peter: More interesting examples involve provenance graphs that have very different sub-workflows. How to compare them?
Alvaro: How to compare things that are very similar? (Primary school and high school in Chile vs. elementary, middle, and high schools in US).
Stephan: We express rules on things that are incomparable.
Peter: Example: monthly time series. Two different data products you can start with. Similar procedures, very different results. ... One has been averaged before data is received, the other is averaged in the pipeline. Averaging is not the same, but not obvious to user.
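One hypothetical way the two averages in Peter's example can diverge: if the upstream product averages unevenly sized chunks first, an unweighted mean of those means differs from the mean over all raw values. The numbers below are invented purely for illustration; the notes do not say how the real data products differ.

```python
from statistics import mean

# Invented raw samples, grouped into two chunks of different sizes.
raw_chunks = [[10, 20], [30, 40, 50, 60]]

# Product A: averaged in the pipeline, over every raw value.
all_values = [v for chunk in raw_chunks for v in chunk]
averaged_in_pipeline = mean(all_values)

# Product B: each chunk was averaged before the data was received,
# and the pipeline then takes an unweighted mean of those means.
pre_averaged = [mean(chunk) for chunk in raw_chunks]
averaged_upstream = mean(pre_averaged)

print(averaged_in_pipeline, averaged_upstream)
```

Both values are called "the average" downstream, which is why provenance about where the averaging happened matters to the user.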
Stephan: Warning can be produced on incompatible data, or things that may not make sense.
Li briefly shows wiki page related to SPARQL homework. ... Need account on wiki to see page. Li back to comparing PML graphs. ... Final justification depends on two antecedents. Highlighted node has two alternative justifications. Li shows much larger graph with same pattern of alternatives. ... One way to evaluate is to count number of nodes in alternative branches. ... If after accounting one is smaller, that's an improvement. ... Problem can be abstracted as a graph difference problem. ... In semweb terms, blank nodes can cause problems. How can we match blank nodes in two different graphs (or alternative subgraphs)? ... Similar to variable renaming in logic. ... Currently use assumption that everything has a URI. Then can simply compute difference between triples.
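Under the assumption that every node has a URI, the graph difference reduces to set difference over triples. A minimal sketch, with triples as plain (subject, predicate, object) tuples; the ex: names and the graph_diff helper are made up for illustration and are not PML vocabulary.

```python
# Sketch of triple-level diff for ground graphs (no blank nodes).
# All URIs below are hypothetical placeholders.

def graph_diff(old, new):
    """Return (removed, added): triples only in old, and triples only in new."""
    old_set, new_set = set(old), set(new)
    return old_set - new_set, new_set - old_set

proof_a = {
    ("ex:conclusion", "ex:justifiedBy", "ex:step1"),
    ("ex:step1", "ex:hasAntecedent", "ex:axiom1"),
    ("ex:step1", "ex:hasAntecedent", "ex:axiom2"),
}
proof_b = {
    ("ex:conclusion", "ex:justifiedBy", "ex:step2"),
    ("ex:step2", "ex:hasAntecedent", "ex:axiom1"),
}

removed, added = graph_diff(proof_a, proof_b)
print(len(removed), "triples removed;", len(added), "triples added")
```

This breaks down as soon as blank nodes appear, because two blank nodes with different local labels may denote the same thing, which is the matching problem raised above.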
Peter: What is the use case here? Where does the PML come from?
Li: "Agatha problem". Comparing proofs from different reasoners (each has different approach to generating proof). ... Same FOL language, conclusions, but justifications are different. ... Can stitch fragments together and synthesize a smaller proof than any of the reasoners by itself.
Peter: Proof tree, then? Different than a provenance/lineage graph.
Li: Maybe not different, but a special case.
Peter: But you have inference rules for each step. ... Who consumes this graph?
Li: Logicians will consume the data. Really hard to read. Requires understanding of FOL syntax. ... Proof generator can understand more fully the whole graph.
Patrick: This isn't the same as a provenance chain?
Peter: We don't have inference rules for each node in the chain. ... Part of explanation, some of these nodes have a URI, right?
Li: Each represents an instance of data or process/event.
Josh: When matching nodes between graphs, you're matching data nodes, not event nodes.
Li: Another example, data.gov RSS feed. We want to see what part of the data graph has been changed and emit RSS. ... RSS item details include what instances have been created, deleted, updated. ... Similar to synchronizing RDF graphs. ... Research question remaining: How can I compare two nodes if they don't have URIs? Speculate that they are the same?
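The created/deleted/updated breakdown for the RSS items can be sketched by grouping triples by subject and comparing the two versions. This again assumes every instance has a URI; the ex: names and both helper functions are hypothetical, and this is not the data.gov feed's actual implementation.

```python
from collections import defaultdict

def by_subject(triples):
    """Index triples as {subject: {(predicate, object), ...}}."""
    index = defaultdict(set)
    for s, p, o in triples:
        index[s].add((p, o))
    return index

def classify_changes(old, new):
    """Classify subjects as created, deleted, or updated between two graphs."""
    old_idx, new_idx = by_subject(old), by_subject(new)
    old_subjects, new_subjects = set(old_idx), set(new_idx)
    return {
        "created": sorted(new_subjects - old_subjects),
        "deleted": sorted(old_subjects - new_subjects),
        "updated": sorted(
            s for s in old_subjects & new_subjects if old_idx[s] != new_idx[s]
        ),
    }

old_graph = {
    ("ex:ds1", "ex:title", "Budget 2008"),
    ("ex:ds2", "ex:title", "Census"),
}
new_graph = {
    ("ex:ds1", "ex:title", "Budget 2009"),
    ("ex:ds3", "ex:title", "Weather"),
}

changes = classify_changes(old_graph, new_graph)
print(changes)
```

Each entry in the result could then become one RSS item. Instances without URIs cannot be keyed this way, which is the open research question.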
Josh: rdfdelta project?
Li: Yes, 2004 project from timbl, Jeremy Carroll. 90% of RDF graphs can be normalized and signed. ... Recently updated. ... Additional work from DERI. P2P system RDFSync. How can only a delta be sent across network?
Stephan: Any questions?
Josh: What tools?
Stephan: Currently design phase. No tools yet for diffs. Generation using PML tools, Jena API, IWBrowse.
Ankesh: Have I understood? (question about comparing two artifacts in a system)
Stephan: Want advisor to alert you about issues with comparison. Provenance, processing, domain knowledge. ... Current tools can't scale.
Li: Can be brought back to graph isomorphism problem.
Jim: Current work OPM vs PML. James?
James: Did work in May on provenance challenge. Expressivity of OPM. 15 different research groups would emulate system, emit OPM. ... Others take OPM and try to establish that OPM could capture provenance from different tools. ... Task was to find if OPM could capture things PML couldn't. ... Working on how to map PML to OPM. Only able to answer part of relevant queries.
Jim: Are you working to extend it? Is it a solved problem?
James: Still working on it, but only part time as a side project. Not my main research focus.
Jim: Not good that we use a minority language compared to the much larger OPM. Are there benefits to PML?
Stephan: Working on a comparison of the two. James and Stephan will meet about this work.
Peter: In putting together demos of provenance examples, good for CS people, but visualization is a non-starter for scientists. ... PML has language constructs and extensibility (compared to OPM) to handle scalability and proper UI.
James: OPM is considering extending model with Dublin Core. That could be similar to expressivity of PML.
Stephan: Both are general templates to model concepts. Working with system that builds on top of these things with extra knowledge that would never be in either.
Jim: Before adjourning... Jim shows new logo for website.
Peter: Everything must go from upstairs. Move it today.

