Statistics for Data Science

Statistics for Data Science – MATH 413

 

Instructor: Prof. Victor Panaretos

Assistants: Laya Ghodrati, Tomas Masak, Matthieu Simeoni, Kartik Waghmare

 

Announcements

  • The exam takes place in rooms PO 01 (Polydome) and CE4 on January 31, 8:15-11:15. Students whose last names begin with A to L are assigned to PO 01; students whose last names begin with M to Z are assigned to CE4. Please arrive at 8:00 and wait outside your assigned room. You may bring 3 standard A4 sheets (i.e. 6 sides) of hand-written notes.

Description

Statistics lies at the foundation of data science, providing a unifying theoretical and methodological backbone for the diverse tasks encountered in this emerging field. This course rigorously develops the key notions and methods of statistics, with an emphasis on concepts rather than techniques.

Topics include:

  • Probability background.
  • Entropy and Exponential Families.
  • Sampling Theory: information and stochastic convergence.
  • Bias and Variance, and the Cramér-Rao bound.
  • Likelihood theory.
  • Testing and Confidence Regions.
  • Nonparametric Estimation and Smoothing.
  • Gaussian Linear Regression.
  • Generalised Linear Models.
  • Nonparametric Regression.

Required prior knowledge

Introductory courses in probability and statistics; basic analysis and linear algebra.

Recommended Texts

Davison, A.C. (2003). Statistical Models. Cambridge University Press.
Panaretos, V.M. (2016). Statistics for Mathematicians. Birkhäuser.
Wasserman, L. (2004). All of Statistics. Springer.
Friedman, J., Hastie, T. and Tibshirani, R. (2010). The Elements of Statistical Learning. Springer.

Exam Information

There will be a written midterm exam (November 19) and a written final exam during the examination period.

Fall 2018 Schedule

Lectures: Mondays, 12:15-14:00, room CE 1 3
Tuesdays, 14:15-16:00, room CM 1 5
Exercises: Wednesdays, 13:15-15:00, rooms CE 1 100 + CE 1 101