DA 202-O Introduction to Data Science 3:1 (August 2021)

Course Instructors: Sashikumaar Ganesan, CDS, and Deepak Subramani, CDS

Course description: This course provides an introduction to Data Science. This four-credit course will be offered every year in the August-December term as a core course of the MTech (online) programme. It is intended as an introductory graduate-level (200-series) course.

Syllabus

Data Science Fundamentals: Identifying and framing a data science problem in different fields; Data - Types, Pre-processing; Different types of Analytics; Introduction to Machine Learning, Artificial Intelligence.

Basic Programming: Data structures, if-else, loops; Visualization; Handling structured data.

Probability: Probability Axioms, Conditional Probability, Bayes' Theorem, Independence, Counting Problems, Discrete and Continuous Random Variables, Expectation, Iterated Expectation, Law of Total Probability, Covariance, Correlation, Entropy, Mutual Information.

Computational Methods - Calculus for Data Science: Functions, Derivatives, Partial Derivatives, Gradients of vector-valued functions and of matrices, Automatic Differentiation, Second Derivatives and the Hessian matrix.

Linear Algebra: Scalars, Vectors, Tensors, Basis, Linear Dependence and Independence, Inner Products, Outer Products, Norms, Orthogonal and Orthonormal Vectors, Orthogonalization and Normalization.

Matrices: Linear Transformations, Frobenius Norm, Matrix Multiplication, Solutions of systems of algebraic equations; Matrix Decomposition: QR Factorization, Singular Value Decomposition, Cholesky Decomposition, Eigenvalue Decomposition.

Textbooks / References

  1. Shah, Chirag. A Hands-On Introduction to Data Science. Cambridge University Press, 2020.
  2. Bertsekas, Dimitri P., and John N. Tsitsiklis. Introduction to Probability. Vol. 1. Belmont, MA: Athena Scientific, 2002.
  3. Shaw, Zed A. Learn Python 3 the Hard Way: A Very Simple Introduction to the Terrifyingly Beautiful World of Computers and Code. Addison-Wesley Professional, 2017.
  4. Deisenroth, Marc Peter, A. Aldo Faisal, and Cheng Soon Ong. Mathematics for Machine Learning. Cambridge University Press, 2020. (https://mml-book.github.io)
  5. Strang, Gilbert. Linear Algebra and Learning from Data. Wellesley-Cambridge Press, 2019.
  6. Strang, Gilbert. Linear Algebra for Everyone. Wellesley-Cambridge Press, 2020.

Prerequisites: Basic knowledge of mathematics

Grading:

  • Homework assignments (four) 40%
  • Mid-term exam 30%
  • Final exam 30%