Data Analytics Spring 2023

Class Listing: ITWS 4600/ ITWS 6600/ MATP 4450/ CSCI 4600/ MGMT 4962/ MGMT 6600/ BCBP 4960

Course Numbers: 77982, 78861, 79589, 78862, 79427, 80323, 80328, 80380

Instructor: Thilanka Munasinghe - munast at rpi dot edu 

TA: Megan Yu - yuy12 at rpi dot edu 

Meeting times:

Section 1: In-person, Tue/Fri, 10:00 am ET – 11:50 am ET; Location: Lally 104

Section 2: In-person, Tue/Fri, 2:00 pm ET – 3:50 pm ET; Location: CII 3206

Instructor Office Hours: Tuesdays/Fridays 12:30 pm ET – 1:30 pm ET or by appointment via email 

Instructor Office Location: Lally 315

TA Office Hours: See LMS for TA office hours information 

TA Office Hours Location: See LMS for TA office location

Syllabus/ Calendar

Refer to the Reading/ Assignment/ Reference list for each week (see below).

Reference material (available through RPI library - RCS login required):

Group 1 - Intro/ Setup

  • Week 1 (Jan. 10/ Jan. 13): Introduction to Course, Case Studies, and Preview of Course Material/Refresher on basic statistics.
  • Assignment 1
  • Week 2 (Jan. 17/ Jan. 20): Introduction/ refresher on basic statistics, continued; Starting with Data and Information Resources, Role of Hypothesis, Synthesis and Model Choices, R/ RStudio introduction and Intro to Labs.
  • Week 3 (Jan. 24/ Jan. 27): Introduction to Analytic Methods, Types of Data Mining for Analytics, Data filtering, hypothesis exploration, visual analysis, model consideration and assessment (lab)
  • (Lab) Assignment 2

Group 2 - Patterns, relations, descriptive analytics

  • Week 4 (Jan. 31 / Feb. 03): Weighted kNN, Clustering, early decision trees, Exercises for linear regression, kNN and K-means (lab), trees, plotting
  • Assignment 3
  • Week 5 (Feb. 07/ Feb. 10): Interpreting: Regression, Weighted kNN, Clustering, and Bayesian Inference, Exercises for clustering, plotting, Bayesian inference (lab)
  • Assignment 4
  • Assignment 5
  • Week 6 (Feb. 14/ Feb. 17): Assignment 5 presentations (Tuesday and Friday)
  • Assignment 6
  • Week 7 (Feb. 21: No classes, follows Monday's schedule/ Feb. 24): Lab: weighted kNN, decision trees, random forest

Group 3 - Predictive Analytics

  • Week 8 (Feb. 28/ Mar. 03): Cross-Validation Trees, Dimension Reduction, and Multi-Dimensional Scaling
  • Week 9 (Mar. 07/ Mar. 10): Spring Break: No Classes
  • Week 10 (Mar. 14/ Mar. 17): Support Vector Machines, Lab for Trees, DR, MDS, SVM
  • Week 11 (Mar. 21/ Mar. 24): Factor Analysis, Factor Analysis lab
  • Week 12 (Mar. 28/ Mar. 31): Interpreting PCA, MDS, DR, and FA; Boosting, Bootstrapping, Bagging; Boosting, Bootstrapping, Bagging (lab)
  • Assignment 7

Group 4 - Evaluating and validating, prescriptive analytics

  • Week 13 (Apr. 04/ Apr. 07): Cross-validation, Revisiting Regression - local methods, Lab - Cross-validation, Regression - local methods and continue project and assignment work
  • Week 14 (Apr. 11/ Apr. 14): Local Regression continued, Mixed Models, Optimizing, Iterating (Fisher Linear Discriminant)
  • Week 15 (Apr. 18/ Apr. 21): Prior Lab Review, Hierarchical Linear and Mixed Models, Latent Class Mixed Models, Lab, Assignment 7 due
  • Week 16 (Apr. 25: Last Day of Data Analytics Classes): Final Project and Poster Due

Reading/ Assignment/ Reference List (see above)

Class 1 Reading Assignment:

Class 2 Reading Assignment: prior to Friday's class

Class 3 Reading Assignment: prior to Tuesday's class

Classes 4-5 Reading Assignment: None

Class 6 Reading Assignment:

Class 7 Reading Assignment:

Classes 8-9 Reading Assignment: None

Classes 10-13 Reading Assignment: None

Course goals:

  • Introduce students to relevant analytic methods so that they can recognize and apply quantitative algorithms and techniques and interpret the results.
  • Develop students' strategic thinking skills, combined with a solid technical foundation in data and model-driven decision-making.
  • Develop the ability to apply critical and analytical methods to formulate and solve science, engineering, medical, and business problems.
  • Students will examine real-world examples using modern cyberinfrastructure to place statistical and data-mining techniques in context, to develop data-analytic thinking, and to illustrate that proper application is as much an art as it is a science.
  • By the end of the course, students should be able to communicate analytic findings effectively to non-specialists.

Course Learning Outcomes:

  • Students will demonstrate knowledge of relevant analytic methods, recognize and apply quantitative algorithms and techniques, and interpret results.
  • Students will demonstrate strategic thinking skills, combined with a solid technical foundation in data and model-driven decision-making.
  • Students will develop the ability to apply critical and analytical methods to formulate and solve science, engineering, medical, and business problems.
  • Students will examine real-world examples to place data-mining techniques in context, to develop data-analytic thinking, and to illustrate that proper application is as much an art as it is a science.
  • Students must effectively communicate analytic findings to non-specialists.
  • [6000 Levels]: Students must develop and demonstrate an ability to apply appropriate analytic techniques under conditions of uncertainty and be able to build optimization models that incorporate random parameters: static stochastic optimization, two-stage optimization with recourse, chance-constrained optimization, and sequential decision making.

Academic Integrity:

Student-teacher relationships are built on trust. For example, students must trust that teachers have made appropriate decisions about the structure and content of the courses they teach, and teachers must trust that the assignments that students turn in are their own. Acts that violate this trust undermine the educational process. 

The Rensselaer Handbook of Student Rights and Responsibilities and the Graduate Student Supplement (for 6000-level and above courses) define various forms of Academic Dishonesty, and you should make yourself familiar with these. In this class, all assignments that are turned in for a grade must represent the student’s own work. In cases where help was received, or teamwork was allowed, a notation on the assignment should indicate your collaboration. Submission of any assignment that is in violation of this policy will result in (1) an academic (grade) penalty and (2) reporting to the Associate Dean of Academic Affairs and either the Dean of Students (for undergraduates) or the Dean of Graduate Education (for graduate students).

In this course, the academic penalty for a first offense is a grade of zero for the relevant portion of the grade. A second offense will result in failure of the course.

If you have any questions concerning this policy before submitting an assignment, please ask for clarification.

Academic Accommodations:

Rensselaer Polytechnic Institute strives to make all learning experiences as accessible as possible. If you anticipate or experience academic barriers based on a disability, please let me know immediately so that we can discuss your options. 

To establish reasonable accommodations, please register with The Office of Disability Services for Students (dss@rpi.edu; 518-276-8197; 4226 Academy Hall). After registration, make arrangements with me as soon as possible to discuss your accommodations so that they may be implemented in a timely fashion.

COVID-19 Code of Conduct:

This code will apply to any class that meets fully or partially in an on-campus physical classroom for in-person instruction. 

Violations: Refusal to comply with the COVID-19 code of conduct will be treated just as any classroom disruption: the student will receive a request for immediate compliance, failing which the student will be asked to leave the classroom. Any further noncompliance will result in the dismissal of the entire class. All COVID-19-related violations will be reported by the instructor to the Compliance Officer at Lally School and the Dean of Students. A student who is found to be in violation of the code, or who requires repeated reminders for compliance, will be asked to participate in all classes remotely. This is to protect their health and safety as well as the health and safety of their classmates, instructor, and the university community.

Masks: All students must wear a mask in classrooms and all public places including anywhere inside the building. Masks will be provided to the student by the Institute.

Traffic Flow and Social Distancing: Students and faculty will respect the need for social distancing. They are required to follow the traffic flow arrows posted in all rooms and buildings, including bathrooms and common areas.

In-Class Seating: Students should sit in the appropriate designated seating in the classroom. Students are not allowed to move furniture or sit in seats not designated by the Institute.

Cleaning of Spaces: Students are encouraged to clean the surfaces of the chairs/tables/desks they occupy before they sit down and as they prepare to leave. Cleaning and sanitizing solutions will be provided in the classroom.

Students who are ill, under quarantine for COVID-19, or suspect they are ill should not come to class. All faculty will make every reasonable effort to accommodate the student’s absence and will communicate that accommodation directly to the student. Students who need to report an illness should contact the Student Health Center by email or by calling 518-276-6287. Students seen off campus may request an excused absence via www.bit.ly/rpiabsence by uploading a doctor’s note that excuses them.

