Class Listing: ITWS 4600/ITWS 6600/ MATP 4450/ CSCI 4600/ MGMT 4962/ MGMT 6600/ BCBP 4960
Course Numbers: 77982, 78861, 79589, 78862, 79427, 80323, 80328, 80380
Instructor: Thilanka Munasinghe - munast at rpi dot edu
TA: Ananya Upadhyay - upadha3 at rpi dot edu
Meeting times:
Section 1: In-person, Tue/Fri, 10:00 am ET – 11:50 am ET; Location: Lally 104
Section 2: In-person, Tue/Fri, 2:00 pm ET – 3:50 pm ET; Location: Lally 104
Instructor Office Hours: Tuesdays/Fridays 12:30 pm ET – 1:30 pm ET or by appointment via email
Instructor Office Location: Lally 315
TA Office Hours: See LMS for TA office hours information
TA Office Hours Location: See LMS for TA office location
Syllabus/ Calendar
Refer to the Reading/ Assignment/ Reference list for each week (see below).
Reference material (available through RPI library - RCS login required):
- Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die (online) (RECOMMENDED)
- Big Data Analytics: Turning Big Data into Big Money
- Big Data Analytics: Turning Big Data into Big Money (online)
- Big Data Analytics: From Strategic Planning to Enterprise Integration with Tools, Techniques, NoSQL, and Graph (online)
- Big Data Analytics with R and Hadoop (online)
- R for Everyone: Advanced Analytics and Graphics (online)
- Introduction to Statistical Learning with R - 7th Edition
General Outline of the Course Calendar:
Group 1 - Intro/ Setup
- Week 1 (Jan. 09/ Jan. 12): Introduction to Course, Case Studies, and Preview of Course Material/ Refresher on basic statistics.
- Assignment 1
- Week 2 (Jan. 16/ Jan. 19): Introduction/ refresher on basic statistics (continued), Starting with Data and Information Resources, Role of Hypothesis, Synthesis and Model Choices, R/ RStudio introduction and Intro to Labs.
- Week 3 (Jan. 23/ Jan. 26): Introduction to Analytic Methods, Types of Data Mining for Analytics, Data filtering, hypothesis exploration, visual analysis, model consideration and assessment (lab)
- (Lab) Assignment 2
Group 2 - Patterns, relations, descriptive analytics
- Week 4 (Jan. 30 / Feb. 02): Weighted kNN, Clustering, early decision trees, Exercises for linear regression, kNN and K-means (lab), trees, plotting
- Assignment 3
- Week 5 (Feb. 06/ Feb. 09): Interpreting: Regression, Weighted kNN, Clustering, and Bayesian Inference, Exercises for clustering, plotting, Bayesian inference (lab)
- Assignment 4
- Assignment 5
- Week 6 (Feb. 13/ Feb. 16): Assignment 5 presentations (Tuesday and Friday)
- Assignment 6
- Week 7 (Feb. 20: No classes, follows Monday's schedule/ Feb. 23): Lab: weighted kNN, decision trees, random forest
Group 3 - Predictive Analytics
- Week 8 (Feb. 27/ Mar. 01): Cross-Validation Trees, Dimension Reduction and Multi-Dimensional Scaling
- Week 9 (Mar. 04/ Mar. 07): Spring Break, No Classes
- Week 10 (Mar. 12/ Mar. 15): Support Vector Machines, Lab for Trees, DR, MDS, SVM
- Week 11 (Mar. 19/ Mar. 22): Factor Analysis, Factor Analysis lab
- Week 12 (Mar. 26/ Mar. 29): Interpreting PCA, MDS, DR, and FA; Boosting, Bootstrapping, Bagging; Boosting, Bootstrapping, Bagging (lab)
- Assignment 7
Group 4 - Evaluating and validating, prescriptive analytics
- Week 13 (Apr. 02/ Apr. 05): Cross-validation, Revisiting Regression - local methods, Lab - Cross-validation, Regression - local methods and continue project and assignment work.
- Week 14 (Apr. 09/ Apr. 12): Local Regression (continued), Mixed Models, Optimizing, Iterating (Fisher Linear Discriminant), Prior Lab Review, Hierarchical Linear and Mixed Models, Assignment 7 due.
- Week 15 (Apr. 16/ Apr. 19): Latent Class Mixed Models, Lab, Final Project Updates
- Week 16 (Apr. 23): Last Day of Data Analytics Classes: Final Project and Poster Due
Reading/ Assignment/ Reference List (see above)
Class 1: Reading Assignment:
- Sports Analytics – Moneyball (http://www.imdb.com/title/tt1210166/)
- Nate Silver (http://en.wikipedia.org/wiki/Nate_Silver)
- http://www.slideshare.net/lsakoda/case-studies-utilizing-real-time-data-...
- http://www.marketquotient.com/case-studies.html
- http://www.ibm.com/analytics/us/en/case-studies/
Class 2 Reading Assignment: prior to Friday's class
Class 3 Reading Assignment: prior to Tuesday's class
- http://en.wikipedia.org/wiki/Degrees_of_freedom_(statistics)
- http://en.wikipedia.org/wiki/Regression_analysis
- http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm
- http://varianceexplained.org/r/kmeans-free-lunch/
- http://en.wikipedia.org/wiki/K-means_clustering
Classes 4-5 Reading Assignment: none
Class 6 Reading Assignment:
- http://stat-www.berkeley.edu/users/breiman/RandomForests/ Random Forests
Class 7 Reading Assignment:
- http://aquarius.tw.rpi.edu/html/DA/v15i09.pdf Karatzoglou et al. 2006
- http://aquarius.tw.rpi.edu/html/DA/svmbasic_notes.pdf Vert SVM basic
- http://aquarius.tw.rpi.edu/html/DA/svmdoc.pdf SVM documentation
- http://202.141.160.110/CRAN/web/packages/e1071/vignettes/svmdoc.pdf SVM documentation (updated 2017)
- http://www.stjuderesearch.org/site/data/ALL1/ ALL dataset
- http://www.stanford.edu/group/wonglab/RSVMpage/R-SVM.html RSVM
- http://data-informed.com/focus-predictive-analytics/
Class 8-9 Reading Assignment: None
Classes 10-13 Reading Assignment: None
Course goals:
- Introduce students to relevant analytic methods, and to recognizing and applying quantitative algorithms and techniques and interpreting their results.
- Develop students' strategic thinking skills, combined with a solid technical foundation in data and model-driven decision-making.
- Develop the ability to apply critical and analytical methods to formulate and solve science, engineering, medical, and business problems.
- Examine real-world examples using modern cyberinfrastructure to place statistical and data-mining techniques in context, to develop data-analytic thinking, and to illustrate that proper application is as much an art as it is a science.
- Enable students, by the end of the course, to communicate analytic findings effectively to non-specialists.
Course Learning Outcomes:
- Students will demonstrate knowledge of relevant analytic methods, recognize and apply quantitative algorithms and techniques, and interpret results.
- Students will demonstrate strategic thinking skills, combined with a solid technical foundation in data and model-driven decision-making.
- Students will develop the ability to apply critical and analytical methods to formulate and solve science, engineering, medical, and business problems.
- Students will examine real-world examples to place data-mining techniques in context, to develop data-analytic thinking, and to illustrate that proper application is as much an art as it is a science.
- Students must effectively communicate analytic findings to non-specialists.
- [6000 Level]: Students must develop and demonstrate an ability to apply appropriate analytic techniques under conditions of uncertainty, and be able to build optimization models that incorporate random parameters: static stochastic optimization, two-stage optimization with recourse, chance-constrained optimization, and sequential decision making.
Academic Integrity:
Student-teacher relationships are built on trust. For example, students must trust that teachers have made appropriate decisions about the structure and content of the courses they teach, and teachers must trust that the assignments that students turn in are their own. Acts that violate this trust undermine the educational process.
The Rensselaer Handbook of Student Rights and Responsibilities and the Graduate Student Supplement (for 6000-level and above courses) define various forms of Academic Dishonesty, and you should make yourself familiar with these. In this class, all assignments that are turned in for a grade must represent the student’s own work. In cases where help was received, or teamwork was allowed, a notation on the assignment should indicate your collaboration. Submission of any assignment that is in violation of this policy will result in (1) an academic (grade) penalty and (2) reporting to the Associate Dean of Academic Affairs and either the Dean of Students (for undergraduates) or the Dean of Graduate Education (for graduate students).
In this course, the academic penalty for a first offense is a grade of zero for the relevant portion of the grade. A second offense will result in failure of the course.
If you have any questions concerning this policy before submitting an assignment, please ask for clarification.
Academic Accommodations:
Rensselaer Polytechnic Institute strives to make all learning experiences as accessible as possible. If you anticipate or experience academic barriers based on a disability, please let me know immediately so that we can discuss your options.
To establish reasonable accommodations, please register with The Office of Disability Services for Students (dss@rpi.edu; 518-276-8197; 4226 Academy Hall). After registration, make arrangements with me as soon as possible to discuss your accommodations so that they may be implemented in a timely fashion.
COVID-19 code of conduct:
This code will apply to any class that meets fully or partially in an on-campus physical classroom for in-person instruction.
Violations: Refusal to comply with the COVID-19 code of conduct will be treated just as any classroom disruption: the student will receive a request for immediate compliance and, failing that, will be asked to leave the classroom. Any further noncompliance will result in the dismissal of the entire class. All COVID-19-related violations will be reported by the instructor to the Compliance Officer at the Lally School and to the Dean of Students. A student who is found to be in violation of the code, or who requires repeated reminders for compliance, will be asked to participate in all classes remotely. This is to protect their health and safety as well as the health and safety of their classmates, instructor, and the university community.
Masks: All students must wear a mask in classrooms and all public places including anywhere inside the building. Masks will be provided to the student by the Institute.
Traffic Flow and Social Distancing: Students and faculty will respect the need for social distancing. They are required to follow the traffic flow arrows posted in all rooms and buildings, including bathrooms and common areas.
In-Class Seating: Students should sit in the appropriate designated seating in the classroom. Students are not allowed to move furniture or sit in seats not designated by the Institute.
Cleaning of Spaces: Students are encouraged to clean the surfaces of the chairs/tables/desks they occupy before they sit down and as they prepare to leave. Cleaning and sanitizing solutions will be provided in the classroom.
Students who are ill, under quarantine for COVID-19, or suspect they are ill should not come to class. All faculty will make every reasonable effort to accommodate the student’s absence and will communicate that accommodation directly to the student. Students who need to report an illness should contact the Student Health Center via email or call 518-276-6287. For students seen off campus, a student may request an excused absence via www.bit.ly/rpiabsence with an uploaded doctor’s note that excuses them.
Course: Data Analytics