MIDTERM
From Web Science 2009 class
Midterm Coverage/Review sheet:
Preparing for the midterm
Believe it or not, we've actually covered a lot and, I hope, you've learned a lot about the infrastructure of the World Wide Web. Here's what I believe you should know based on the material we've gone over in class (and if you are comfortable with these topics, you should do well on the midterm):
1 - URIs and HTTP headers in detail
The material covered in the slides at http://www.cs.rpi.edu/academics/courses/spring08/websci/URI-HTTP-details.pdf is all fair game, especially:
URIs vs. URLs
The parts of a URL
HTTP response types (by class, not by number)
The HTTP request methods
HTTP headers
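As a study aid for the URL and response-class items above, here is a minimal Python sketch. It uses the standard library's urllib.parse to split a URL into its parts and maps a status code to its class by its first digit. The function names, the class labels, and the example URL are illustrative assumptions, not taken from the slides.

```python
from urllib.parse import urlsplit

# HTTP response classes, keyed by the first digit of the status code.
RESPONSE_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def url_parts(url):
    """Split a URL into its named parts."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


def response_class(status_code):
    """Return the class of an HTTP status code (by class, not by number)."""
    return RESPONSE_CLASSES.get(status_code // 100, "unknown")


if __name__ == "__main__":
    # Hypothetical URL used only for illustration.
    print(url_parts("http://www.example.org:8080/a/b.html?x=1#top"))
    print(response_class(404))
```

Grouping codes by their first digit mirrors the "by class, not by number" emphasis: what matters is which family a response falls in, not each individual code.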
2 - Web architecture in practice
What a three-tiered web app (dynamic content server) is and how it is accessed.
3 - Crawling
Polite crawlers
Robots.txt
Loop/duplicate elimination (definition, simple explanations)
Crawler traps
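To make two of the crawling ideas concrete, the sketch below filters a list of candidate URLs the way a polite crawler might: it skips URLs it has already seen (duplicate elimination) and URLs that robots.txt disallows. It uses Python's standard urllib.robotparser. The robots.txt content, the user-agent name, and the URLs are made up for illustration, and a real crawler would also normalize URLs and rate-limit its requests.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed_urls(robots_txt, user_agent, urls):
    """Return the URLs a polite crawler may fetch, without duplicates."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    seen = set()      # URLs already considered (duplicate elimination)
    result = []
    for url in urls:
        if url in seen:
            continue  # already visited: avoids loops and repeated fetches
        seen.add(url)
        if parser.can_fetch(user_agent, url):
            result.append(url)
    return result


if __name__ == "__main__":
    candidates = [
        "http://example.org/index.html",
        "http://example.org/private/a.html",
        "http://example.org/index.html",
    ]
    print(allowed_urls(ROBOTS_TXT, "ExampleBot", candidates))
```

A visited set stops simple loops, but it does not by itself defeat a crawler trap that generates endless distinct URLs; that usually needs extra limits such as a maximum depth or a per-site page budget.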
4 - PageRank and search
The PageRank formula (and being able to apply it)
Relationship between crawler and search engine
Reverse index and what it comprises (pages, words, word locations, special features)
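For practice applying the formula, here is a minimal Python sketch. This sheet does not restate the formula, so the code assumes the commonly cited normalized form PR(p) = (1 - d)/N + d * sum(PR(q)/L(q)), summed over the pages q that link to p, where L(q) is the number of outgoing links of q and d is a damping factor (0.85 here). The class slides may use a variant, for example one without the division by N. The sketch also assumes every page has at least one outgoing link and every link target is a key in the graph. A second function builds a toy reverse index that maps each word to the pages and word positions where it occurs. All names and data are illustrative.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively compute PageRank for a graph {page: [pages it links to]}.

    Assumes every page has at least one outgoing link (no dangling pages).
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform guess
    for _ in range(iterations):
        new_rank = {}
        for page in pages:
            # Each page q linking here passes on an equal share of its rank.
            incoming = sum(
                rank[q] / len(links[q]) for q in pages if page in links[q]
            )
            new_rank[page] = (1 - damping) / n + damping * incoming
        rank = new_rank
    return rank


def build_reverse_index(docs):
    """Map each word to {page: [positions of the word in that page]}."""
    index = {}
    for page, text in docs.items():
        for position, word in enumerate(text.lower().split()):
            index.setdefault(word, {}).setdefault(page, []).append(position)
    return index


if __name__ == "__main__":
    # Hypothetical three-page web: A links to B and C, B to C, C to A.
    graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
    print(pagerank(graph))
    print(build_reverse_index({"p1": "web science web", "p2": "science"}))
```

In the example graph, C receives rank from both A and B while B receives it only from A, so C ends up ranked above B. The crawler supplies both inputs: the link graph for ranking and the page text for the index.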
5 - Recommender systems
The basic idea behind recommenders
User- vs. item-based recommendations
Computing a recommendation score (no need to memorize the formula, but understand what it represents)
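To illustrate what a recommendation score represents, the sketch below computes a user-based score as a similarity-weighted average: each other user's rating of an item counts in proportion to how similar that user is to the target user. This is one common formulation and may differ in detail from the formula presented in class. The function name, the user names, and the numbers are assumptions made for illustration.

```python
def recommendation_score(similarities, ratings):
    """Similarity-weighted average rating for one item.

    similarities: {other_user: similarity to the target user}
    ratings:      {other_user: that user's rating of the item}
    Returns None when no sufficiently similar user has rated the item.
    """
    weighted_total = 0.0
    similarity_total = 0.0
    for user, rating in ratings.items():
        sim = similarities.get(user, 0.0)
        if sim <= 0:
            continue  # ignore users with no (or negative) similarity
        weighted_total += sim * rating
        similarity_total += sim
    if similarity_total == 0:
        return None
    return weighted_total / similarity_total


if __name__ == "__main__":
    # Hypothetical data: ann is very similar to the target user, bob less so.
    sims = {"ann": 1.0, "bob": 0.5}
    item_ratings = {"ann": 4.0, "bob": 2.0}
    print(recommendation_score(sims, item_ratings))
```

Dividing by the total similarity keeps the score on the same scale as the ratings, so an item rated by many loosely similar users does not automatically outscore one rated by a few very similar users. An item-based recommender would apply the same weighting idea with similarities between items rather than between users.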
I can't promise that nothing outside this list will appear, but I repeat: if you're comfortable with this material, you should do fine.
