Advanced Semantic Technologies



Professors: Deborah L. McGuinness
Teaching Assistants: Justin Karpenski
Topics: Semantic Web Services, Semantic Web
Course Numbers:
  • CSCI 4967-01
  • CSCI 6965-01
  • ITWS 4963-01
  • ITWS 6962-01

Co-Instructor: Patrice Seyed

Meeting times: Tuesdays 1-3:50, January 22, 2013 - May 17, 2013
TA Office Hours: Thursdays 2-4, Winslow 1140

Class Location: Winslow 1140

Description:
This course presents cutting-edge research on the Semantic Web and aims to build research capability in advanced students. Students should expect to read, present, and evaluate important Semantic Web research papers, and to identify and survey interesting Semantic Web research areas.
Academic Integrity:
Student-teacher relationships are built on trust. For example, students must trust that teachers have made appropriate decisions about the structure and content of the courses they teach, and teachers must trust that the assignments that students turn in are their own. Acts that violate this trust undermine the educational process. The Rensselaer Handbook of Student Rights and Responsibilities defines various forms of Academic Dishonesty and you should make yourself familiar with these. In this class, all assignments that are turned in for a grade must represent the student’s own work. In cases where help was received, or teamwork was allowed, a notation on the assignment should indicate your collaboration. Submission of any assignment that is in violation of this policy will result in a penalty. If found in violation of the academic dishonesty policy, students may be subject to two types of penalties. The instructor administers an academic (grade) penalty, and the student may also enter the Institute judicial process and be subject to such additional sanctions as: warning, probation, suspension, expulsion, and alternative actions as defined in the current Handbook of Student Rights and Responsibilities. If you have any questions concerning this policy before submitting an assignment, please ask for clarification.

==Attendance Policy==

Enrolled students may miss at most one class without the instructor's permission.
Once one class has been missed (with or without permission) no additional classes may be missed without permission.

==Grading Policy==
Grades will be determined based on homework assignments along with class participation. Late assignments will drop 10% of the possible value of the assignment for each day late.
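
As a worked illustration of the late policy, the sketch below computes a score after the 10%-per-day deduction. It assumes that the deduction is linear (10% of the possible points for each whole day late), that it is subtracted from the points earned, and that a score cannot drop below zero. These details are assumptions for illustration; the instructors' actual handling may differ. The function name `late_score` is hypothetical.

```python
def late_score(earned: float, possible: float, days_late: int) -> float:
    """Score after deducting 10% of the possible points per day late.

    Assumptions (not stated in the policy): whole days only,
    linear deduction, and a floor of zero.
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    penalty = 0.10 * possible * days_late
    return max(0.0, earned - penalty)


# A 10-point assignment that earned 9 points, turned in 2 days late:
# the penalty is 2 points, leaving 7.
print(late_score(9, 10, 2))
```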

==Schedule==

* Week 1: January 22, 2013
* Week 2: January 29, 2013
* Week 3: February 5, 2013
* Week 4: February 12, 2013
* Week 5: February 19, 2013 (no class; Tuesday follows a Monday schedule)
* Week 6: February 26, 2013 (Patrice)
* Week 7: March 5, 2013
* March 12 - Spring break
* Week 8: March 19, 2013
* Week 9: March 26, 2013 (Patrice)
* Week 10: April 2, 2013
* Week 11: April 9, 2013
* April 16 - student class
* Week 12: April 23, 2013
* Week 13: April 30, 2013 (Patrice?)
* Week 14: May 7, 2013

==Weekly detail==

===Week 1 Jan 22, 2013===
Class 1 - Advanced Semantic Technologies 2012 Lecture 1 [Download]
Notes - Jan-22 Google Document
* Introduction to course
* Brainstorming session about what knowledge is needed to "power" a semantic application

Assignment 1 - 5 points:
Read the social machine and wine agent papers.
Social machine paper to read:
Jim Hendler, Tim Berners-Lee: From the Semantic Web to social machines: A research challenge for AI on the World Wide Web. Artif. Intell. 174(2): 156-161 (2010).
http://www.stanford.edu/class/cs227/Readings/hendler-berners-lee-semanti...

Wine agent project page:
http://tw.rpi.edu/web/project/Wineagent/Publications
Wine Agent Paper to Read:
Evan W. Patton and Deborah L. McGuinness. "The Mobile Wine Agent: Pairing Wine with the Social Semantic Web." the 2nd Social Data on the Web workshop, Washington, DC, USA. 2009.
http://wineagent.tw.rpi.edu/papers/sdow-2009-wineagent.pdf

After reading the first paper, describe a new social machine example (approx. 1 page).
After reading the second paper, how was the domain characterized differently from our class session? (approx. 1 page)
Follow the naming convention below on all assignments, and email each assignment to the TA and the professor.

Come prepared with a 1-page description of a research topic that you could imagine using semantic technologies, and be prepared to talk about it for 10 minutes.

Please make sure to name the assignment in the subject line:
AST[assignmentNumber]-[yourLastName]
For example,
AST1-McGuinness would be Deborah's first assignment for Advanced Semantic Technologies.
All assignments are due before class and should be emailed to the professor ( dlm at cs dot rpi dot edu) and the TA ( karpej2 at rpi dot edu )
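
The naming convention can be checked mechanically before sending. The sketch below builds and validates a subject line in the form of the example above (`AST1-McGuinness`). The function names and the exact pattern (no spaces, a single hyphen) are assumptions based on that example, not an official specification.

```python
import re

# Pattern inferred from the example "AST1-McGuinness" (an assumption).
SUBJECT_PATTERN = re.compile(r"^AST(\d+)-(\S+)$")


def assignment_subject(number: int, last_name: str) -> str:
    """Build a subject line such as 'AST1-McGuinness'."""
    name = last_name.strip()
    if number < 1:
        raise ValueError("assignment number must be positive")
    if not name:
        raise ValueError("last name is required")
    return f"AST{number}-{name}"


def is_valid_subject(subject: str) -> bool:
    """Return True if the subject matches the assumed pattern."""
    return SUBJECT_PATTERN.match(subject.strip()) is not None


print(assignment_subject(1, "McGuinness"))
print(is_valid_subject("Homework 1"))
```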

===Week 2: January 29, 2013===
Notes - Jan 29 Google Document
* Students will present their completed assignment on the social machines paper and wine agent paper, using PowerPoint slides.
* Introduction to SemantEco.
SemantEco3.ppt [Download]

Assignment 2 (5 points):
Read SemantEco IEEE paper.
Ping Wang, Linyun Fu, Evan W. Patton, Deborah L. McGuinness, Joshua Dein, and Sky Bristol. "Towards Semantically-enabled Exploration and Analysis of Environmental Ecosystems." E-Science, 2012 IEEE 8th International Conference.
(http://tw.rpi.edu/media/2012/11/21/a8ce/SemantEcoEscience2012.pdf)
After reading the paper, describe how SemantEco might be improved using semantic technologies (approximately 1-2 pages). Be prepared to present your proposed improvement to SemantEco using one to two PowerPoint slides in class. Remember to use the naming conventions when turning in homework.

===Week 3: February 5, 2013===
Class 3 - Advanced Semantic Technologies Class Lecture
Notes - Feb 05 Google Document
* Students will present their completed assignment on improving SemantEco.
Foundations of Knowledge Representation
Readings:
R. Davis, H. Shrobe, and P. Szolovits. What is a Knowledge Representation? AI Magazine, 14(1):17-33, 1993.
http://groups.csail.mit.edu/medg/ftp/psz/k-rep.html
Introduction to KRR [Download]

"What Are Ontologies, and Why Do We Need Them?", B. Chandrasekaran and John R. Josephson, Ohio State University; V. Richard Benjamins, University of Amsterdam
http://www.csee.umbc.edu/courses/771/papers/chandrasekaranetal99.pdf
"Ontology Development 101: A Guide to Creating Your First Ontology", Natalya F. Noy and Deborah L. McGuinness
http://bmir.stanford.edu/file_asset/index.php/108/SMI-2001-0880.pdf

Vinay K. Chaudhri, Bert Bredeweg, Richard Fikes, Sheila A. McIlraith, Michael P. Wellman: A Categorization of KR&R Methods for Requirement Analysis of a Query Answering Knowledge Base. FOIS 2010: 158-17;
http://web.eecs.umich.edu/srg/?page_id=716 (see the link at the bottom of this page)

WHER resources:
www.wher.org

The link to the information page with the fact sheet, video demo, etc. is:
http://www.whmn.org/wher/pages/about

Assignment 3 (10 points)
Consider a simplified version of the question in Section 2.4 of the Chaudhri et al. paper. What would need to be modeled to answer this question?
Lead us through a chain of reasoning in English, using whatever background knowledge you need, in order to answer this question.
What similar questions can you answer with the same chain of reasoning and background knowledge?
Remember to turn in homework before class and to use the naming convention on all homework assignments.

Also make a proposal about how to leverage WHER in SemantEco.

===Week 4 - February 12, 2013===
* Begin with Student presentations
* Discussion of ontology modeling and reasoning/querying
Assignment: Based on the class discussion, extend the ontology provided (will be discussed in class).
* Demonstration of Wildlife Health Event Reporter
Notes - Feb 12 Google Document

Homework Assignment 4
10 points
Due February 26 by noon eastern time

Given the discussion of modeling relationships between health effects
and chemicals in class, including modeling issues and
competency questions, pick some questions you want to be able to
answer, and model the information required to answer those questions using Protege.

Pick at least two questions and make sure your ontology includes at
least 3 classes and 3 properties. Describe how the information in the
ontology is used to answer the two questions. For each class and
property you model in the ontology, include a comment describing the
members of the class, and the relationships, respectively. Identify where inference is
being used. Please also choose questions that require some inference - do not just
choose two questions that can be answered by straight lookup.

Also describe the provenance, for example, where a definition of a
chemical or health effect was taken from. Additionally, describe how
you would model a hazard vs. risk, and how you can model risk of
chemical-caused health effects with respect to a causal chain of
events. For extra credit, attempt to model this hazard vs risk issue
and the causal chains in OWL.

Show and describe how you model ternary relationships, inspired by
the approaches given at:
http://www.w3.org/TR/swbp-n-aryRelations/
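
As a rough illustration of one pattern from that W3C note, a ternary relationship can be represented by introducing an individual that stands for the relation instance and linking it to each participant with a binary property. The sketch below uses plain Python tuples in place of an RDF library. The namespace, the class `ChemicalExposure`, the three properties, and the individuals `ChemicalX` and `EffectY` are all made-up placeholders, not real toxicology data or a required design.

```python
EX = "http://example.org/tox#"  # hypothetical namespace

# One relation instance (Exposure_1) ties together a chemical,
# a health effect, and a dose level via three binary properties.
triples = {
    (EX + "Exposure_1", "rdf:type", EX + "ChemicalExposure"),
    (EX + "Exposure_1", EX + "hasChemical", EX + "ChemicalX"),
    (EX + "Exposure_1", EX + "hasHealthEffect", EX + "EffectY"),
    (EX + "Exposure_1", EX + "hasDoseLevel", "high"),
}


def objects(subject, predicate):
    """All objects of triples with the given subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def effects_of(chemical):
    """(effect, dose) pairs reachable through exposure nodes for a chemical."""
    results = []
    for s, p, o in triples:
        if p == EX + "hasChemical" and o == chemical:
            for effect in objects(s, EX + "hasHealthEffect"):
                for dose in objects(s, EX + "hasDoseLevel"):
                    results.append((effect, dose))
    return sorted(results)


print(effects_of(EX + "ChemicalX"))
```

The same structure carries over to OWL in Protege: the relation instance becomes an individual of a class, and each link becomes an object or data property.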

Show how the inferred hierarchy has additional knowledge that the
asserted class hierarchy does not have.
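
To make the difference between asserted and inferred knowledge concrete, the sketch below computes the transitive closure of a small subclass hierarchy and uses it to infer class membership that was never stated directly. The class and individual names are hypothetical. This covers only simple subclass reasoning; a description logic reasoner in Protege can also reclassify classes based on their definitions, which this sketch does not attempt.

```python
# Asserted (directly stated) subclass links: (subclass, superclass).
asserted_subclass = {
    ("Neurotoxin", "Toxin"),
    ("Toxin", "Chemical"),
}

# Asserted membership: (individual, class).
asserted_type = {("substanceA", "Neurotoxin")}


def superclasses(cls):
    """All direct and indirect superclasses of cls."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in asserted_subclass:
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found


def inferred_types(individual):
    """Asserted types plus every superclass of those types."""
    types = set()
    for ind, cls in asserted_type:
        if ind == individual:
            types.add(cls)
            types |= superclasses(cls)
    return types


# Only "Neurotoxin" was asserted; "Toxin" and "Chemical" are inferred.
print(sorted(inferred_types("substanceA")))
```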

In the following we include sources you can consult for this work,
although you can apply other sources:

1. There are vast storehouses of chemical effects at, e.g.,
http://toxnet.nlm.nih.gov . You can download all of ToxNet's info, but it
will be in text files that need to be parsed.

2. A list of ATSDR (Agency for Toxic Substances and Disease Registry)
Minimal Risk Levels (MRLs) at http://www.atsdr.cdc.gov/mrls/index.asp
(Minimal Risk Levels). This will show you which endpoints are most
sensitive for the chemicals there, but it doesn't show all the organs
affected. EPA has similar data available through their download tool
at www.epa.gov/iris.

3. SPL Resource Page at http://www.atsdr.cdc.gov/spl/resources is a
link for the "SPL Toxicity Values". These don't show organs, but do
show how much certain types of endpoints are affected by chemicals
(Aquatic Toxicity, Mammalian Toxicity, etc.). If you are interested in
that, you'll also want to see the "Methodology for
Toxicity/Environmental Scores" on that page.

4. There is a lot of tox data in the Tox Profiles
(http://www.atsdr.cdc.gov/toxprofiles/index.asp), but it is not in an
easily transportable data format.

===Week 5 - February 19, 2013===
No class - Tuesday follows a Monday Schedule
Please continue your modeling work. If you have not used Protege before, please make sure to explore using it.

=== Week 6 February 26, 2013===
In class modeling session and introduction to the foundations of semantic tools
Notes - Feb 26 Google Document

Assignment 5 (10 points)
1. Write-up results/summary of dialogue in class from your presentation and any next steps.

2. Report the results of your meetings with Patrice, with proposed next steps, timelines,
and progress. Show one way, with an example, that you are using or hope to use semantics.

=== Week 7 March 5, 2013===
Project Use Case Discussions
Notes - Mar 5 Google Document

=== Spring Break - week of March 12 ===

=== Week 8: March 19, 2013 ===
Class readings determined by projects and presentations
Notes - Mar 19 Google Document

From now until the end of the semester, the weekly project status
reports we have been asking for are due at 5:00pm every Friday. This is
part of your grade; points will be deducted if the deadline is missed.

The weekly status report should be posted in the wiki of your
respective github project space, and the content should also be pasted into
an email to me, Deborah, and Justin. At the top of the email,
provide the direct hyperlink to the wiki page you posted
to. If you have any issues posting to your wiki please let me know.
Both posting your report to the wiki and placing this content in the
email are part of this grade.

As we indicated in class, this report includes the feedback you
received in your most recent class presentation, the next steps for
your project based on that feedback, and the next steps and feedback
covered in one-on-one meetings with me.

(Keep this in mind as you schedule meetings with me: scheduling a
Friday meeting does not give you additional time to submit.)

=== Week 9: March 26, 2013 (Patrice)===
Class readings determined by projects and presentations
Notes - Mar 26 Google Document

=== Week 10: April 2, 2013===
Class readings determined by projects and presentations
Notes - Apr 2 Google Document

=== Week 11: April 9, 2013 ===
Class readings determined by projects and presentations
Notes - Apr 9 Google Document

For Friday, please submit the first draft of your use case using the use case template -
[Download]. Make sure to explain all sources and provide descriptions of
each. It's okay to leave boxes blank if they do not apply, or to add any
fields for information that is important to the use case. Also make sure to
point out how semantics and provenance are being used. The use of semantics
and provenance will be at least 30% of your grade.

By Monday evening of next week, read each other's use case documents and provide
feedback to improve each and increase clarity. Provide feedback by
sending an email to the class that includes at least 3 concrete suggestions for improvement
for each of the other use case documents. A second draft that incorporates the
feedback on your use case is due on Wednesday. Note that this is in parallel with the actual
work you are doing. For Zach, the same deadlines apply, for a draft report instead
of a use case, on the work you have done on your entity recognition project.

=== April 16 - Students meet to present revised use cases to each other. ===
Notes - Apr 16 Google Document

===Week 12 - April 23, 2013===
Project Presentations
Notes - Apr 23 Google Document

===Week 13 - April 30, 2013===
Project Presentations
Notes - Apr 30 Google Document

===Week 14 - May 7, 2013===
Project Presentations
Notes - May 7 Google Document

======================================
Student Demo Pages

SemantEco Health Facet: http://tw.rpi.edu/web/SemantEco-Health-Facet
Entity Disambiguation: http://tw.rpi.edu/web/EntityDisambiguation
SemantEco SemantGeo Extension: http://tw.rpi.edu/web/SemantEco-SemantGeo

======================================
Resources for relevant semantic web papers
* Journals we will examine
** Journal of Web Semantics: http://www.elsevier.com/wps/find/journaldescription.cws_home/671322/desc...
** Journal of AI Research: http://www.jair.org/
** International Journal On Semantic Web and Information Systems: http://www.ijswis.org/
** International Journal of Semantic Computing (IJSC): http://www.worldscinet.com/ijsc/
** International Journal of Metadata, Semantics and Ontologies (IJMSO): http://www.inderscience.com/browse/index.php?journalCODE=ijmso
** Journal of Data Semantics: http://lbdwww.epfl.ch/e/Springer/
** Journal of Semantics: http://jos.oxfordjournals.org/

* Nature of articles - see the following conferences for additional topic/ subject areas
** http://iswc2012.semanticweb.org/
** http://www.eswc2012.org/
** http://www.aaai.org/Conferences/
** http://semtechbizsf2012.semanticweb.com/

==================
References from some past projects from this class in previous years.
Some projects:
* Mushroom Identification: http://bit.ly/wJ2SGk
* Water Quality Portal: http://tinyurl.com/6oxop3q
* Air Quality Portal: http://tw.rpi.edu/web/Courses/AdvancedSemanticTechnologies/2012 (Look for Linyun)
* Water Quality Monitoring: http://www.kirkjalbert.com/about-me/, http://bit.ly/z3Kt61, http://www.watershed-mapping.rpi.edu
* Linked Sensor Data and Data Quality: http://bit.ly/zQ8Ile, http://logd.tw.rpi.edu/demo/trends_in_smoking_prevalence_tobacco_policy_...

Supplemental Readings:
----------------------------
Ontologies and the Semantic Web (Ian Horrocks)
http://www.cs.ox.ac.uk/people/ian.horrocks/Publications/download/2008/Ho...

A Description Logic Primer (Krötzsch et al.)
http://arxiv.org/pdf/1201.4089

RDF Primer
http://www.w3.org/TR/2004/REC-rdf-primer-20040210/

RDF Primer - Turtle Version
http://www.w3.org/2007/02/turtle/primer/

Web Ontology Language (OWL) Guide
http://www.w3.org/TR/owl-guide/

Turtle - Terse RDF Triple Language
http://www.w3.org/TeamSubmission/turtle/

Peter Suber's Logic "Translation Tips"
http://legacy.earlham.edu/~peters/courses/log/transtip.htm

CSV2RDF4LOD (for creating RDF data from tabular formats):
https://github.com/timrdf/csv2rdf4lod-automation/wiki
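
For readers new to the idea of creating RDF from tabular data, the sketch below shows the general approach using only the Python standard library. It is not CSV2RDF4LOD and does not reflect that tool's interface or conventions. The base IRI, the sample table, and the function names are hypothetical, and the code assumes that row identifiers and column names are already safe to embed in IRIs.

```python
import csv
import io

BASE = "http://example.org/data/"  # hypothetical base IRI


def escape_literal(value):
    """Escape backslashes, quotes, and line breaks for an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def csv_to_ntriples(text, id_column):
    """Turn each non-empty cell into one N-Triples line.

    Each row becomes a subject; each remaining column becomes a predicate.
    """
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"<{BASE}row/{row[id_column]}>"
        for column, value in row.items():
            if column == id_column or not value:
                continue
            predicate = f"<{BASE}vocab/{column}>"
            lines.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return lines


# Made-up sample table with one measurement row.
sample = "site,chemical,value\nS1,nitrate,4.2\n"
for line in csv_to_ntriples(sample, "site"):
    print(line)
```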