Data Analytics Spring 2021

Class Listing: ITWS 4600/ ITWS 6600/ MATP 4450/ CSCI 4960/ MGMT 4962/ MGMT 6962/ BCBP 4960
Course Numbers: 43282, 44865, 43216, 44036, 42596, 43217, 43863

Instructor: Thilanka Munasinghe - munast at rpi dot edu
TA: Mukesh Mohanty - mohanm3 at rpi dot edu
Meeting times:

Section 1: Location: Virtual/In-person - Time: 8:00 am ET - 9:50 am ET, with WebEx for Remote (Online) Participation. WebEx login information is available on the Learning Management System (LMS). In-person location TBA.
Section 2: Location: Virtual/In-person - Time: 2:30 pm ET - 3:50 pm ET, with WebEx for Remote (Online) Participation. WebEx login information is available on the Learning Management System (LMS). In-person location TBA.

Instructor Office Hours: Mon/Thurs 10:30am ET – 11:30am ET or by appointment via email
Instructor Office Location: Amos Eaton 133
TA Office Hours: Tuesdays 12 pm ET - 2 pm ET or by email appointment
TA Office Hours Location: Virtual via WebEx: https://rensselaer.webex.com/meet/mohanm3

Syllabus/ Calendar

Refer to Reading/ Assignment/ Reference list for each week (see below).

Reference material (available through RPI library - RCS login required):

 

Group 1 - Intro/ Setup

  • Week 1 (Jan. 25/ 28): Introduction to Course, Case Studies, and Preview of Course Material/Refresher on basic statistics.
  • Assignment 1
  • Week 2 (Feb. 01/ Feb. 04): Introduction/ refresher on basic statistics (continued)/ Starting with Data and Information Resources, Role of Hypothesis, Synthesis and Model Choices, R/ RStudio introduction and Intro to Labs.
  • Week 3 (Feb. 08/ Feb. 11): Introduction to Analytic Methods, Types of Data Mining for Analytics, Data filtering, hypothesis exploration, visual analysis, model consideration and assessment (lab)
  • (Lab) Assignment 2

Group 2 - Patterns, relations, descriptive analytics

  • Week 4 (Feb. 15 [No Classes: Presidents' Day]/ Feb. 18): Weighted kNN, Clustering, early decision trees, Exercises for linear regression, kNN and K-means (lab), trees, plotting
  • Assignment 3
  • Week 5 (Feb. 22/ Feb. 25): Interpreting: Regression, Weighted kNN, Clustering, and Bayesian Inference, Exercises for clustering, plotting, Bayesian inference (lab)
  • Assignment 4
  • Assignment 5
  • Week 6 (Mar. 01/ Mar. 04): Assignment 5 presentations (Monday and Thursday)
  • Assignment 6
  • Week 7 (Mar. 08 / Mar. 11): Lab weighted kNN, decision trees, random forest

Group 3 - Predictive Analytics

  • Week 8 (Mar. 15/ Mar. 18): Cross-Validation Trees, Dimension Reduction and Multi-Dimensional Scaling
  • Week 9 (Mar. 22/ Mar. 25): Support Vector Machines, Lab for Trees, DR, MDS, SVM
  • Week 10 (Mar. 29/ Apr. 01): Factor Analysis, Factor Analysis lab
  • Week 11 (Apr. 05/ Apr. 08): Interpreting PCA, MDS, DR, and FA; Boosting, Bootstrapping, Bagging; Boosting, Bootstrapping, Bagging (lab)
  • Assignment 7

Group 4 - Evaluating and validating, prescriptive analytics

  • Week 12 (Apr. 12/ Apr. 15): Cross-validation, Revisiting Regression - local methods, Lab - Cross-validation, Regression - local methods and continue project and assignment work
  • Week 13 (Apr. 19/ Apr. 22): Local Regression continued, Mixed Models, Optimizing, Iterating (Fisher Linear Discriminant)
  • Week 14 (Apr. 26/ Apr. 29): Prior Lab Review, Hierarchical Linear and Mixed Models, Latent Class Mixed Models, Lab, Assignment 7 due
  • Week 15 (TBA): Final Project Presentations

Reading/ Assignment/ Reference List (see above)

Class 1 Reading Assignment:

 

Class 2 Reading Assignment: prior to Thursday class

Class 3 Reading Assignment: prior to Monday class

 

Classes 4-5 Reading Assignment: none

Class 6 Reading Assignment:

Class 7 Reading Assignment:

 

Classes 8-9 Reading Assignment: none

Classes 10-13 Reading Assignment: none

Course goals:

  • Introduce students to relevant methods for recognizing and applying quantitative algorithms and techniques and for interpreting results
  • Develop students' strategic thinking skills, combined with a solid technical foundation in data and model-driven decision-making
  • Develop the ability to apply critical and analytical methods to formulate and solve science, engineering, medical, and business problems
  • Examine real-world examples using modern cyberinfrastructure to place statistical and data-mining techniques in context, to develop data-analytic thinking, and to illustrate that proper application is as much an art as it is a science
  • Enable students, by the end of the course, to communicate analytic findings effectively to non-specialists

Course Learning Objectives:

  • Students will demonstrate knowledge of relevant analytic methods, recognize and apply quantitative algorithms and techniques, and interpret results.
  • Students will demonstrate strategic thinking skills, combined with a solid technical foundation in data and model-driven decision-making.
  • Students will develop the ability to apply critical and analytical methods to formulate and solve science, engineering, medical, and business problems.
  • Students will examine real-world examples to place data-mining techniques in context, to develop data-analytic thinking, and to illustrate that proper application is as much an art as it is a science.
  • Students must effectively communicate analytic findings to non-specialists.
  • [Graduate level] Students must develop and demonstrate a working knowledge of decision making under uncertainty and be able to build optimization models that incorporate random parameters: static stochastic optimization, two-stage optimization with recourse, chance-constrained optimization, and sequential decision making.

COVID-19 code of conduct:
This code will apply to any class that meets fully or partially in an on-campus physical classroom for in-person instruction.
Violations: Refusal to comply with the COVID-19 code of conduct will be treated like any other classroom disruption: the student will be asked to comply immediately and, failing that, will be asked to leave the classroom. Any further noncompliance will result in the dismissal of the entire class. All COVID-19 related violations will be reported by the instructor to the Compliance Officer at Lally School and to the Dean of Students. A student who is found to be in violation of the code, or who requires repeated reminders to comply, will be asked to participate in all classes remotely. This is to protect their health and safety as well as the health and safety of their classmates, instructor, and the university community.

Masks: All students must wear a mask in classrooms and all public places including anywhere inside the building. Masks will be provided to the student by the Institute.

Traffic Flow and Social Distancing: Students and faculty will respect the need for social distancing. They are required to follow the traffic flow arrows posted in all rooms and buildings, including bathrooms and common areas.

In-Class Seating: Students should sit in the appropriate designated seating in the classroom. Students are not allowed to move furniture or sit in seats not designated by the Institute.

Cleaning of Spaces: Students are encouraged to clean the surfaces of the chairs/tables/desks they occupy before they sit down and as they prepare to leave. Cleaning and sanitizing solutions will be provided in the classroom.

Students who are ill, are under quarantine for COVID-19, or suspect they are ill should not come to class. All faculty will make every reasonable effort to accommodate the student’s absence and will communicate that accommodation directly to the student. Students who need to report an illness should contact the Student Health Center via email or by calling 518-276-6287. Students seen off campus may request an excused absence via www.bit.ly/rpiabsence with an uploaded doctor’s note that excuses them.

