NPTEL: NOC: Scalable Data Science (Computer Science and Engineering)

Co-ordinators : Prof. Sourangshu Bhattacharya, Prof. Anirban Dasgupta


Lecture 1 - Background: Introduction

Lecture 2 - Probability: Concentration inequalities

Lecture 3 - Linear algebra: PCA, SVD

Lecture 4 - Optimization: Basics, Convex, GD

Lecture 5 - Machine Learning: Supervised, generalization, feature learning, clustering

Lecture 6 - Memory-efficient data structures: Hash functions, universal / perfect hash families

Lecture 7 - Bloom filters

Lecture 8 - Sketches for distinct count

Lecture 9 - Sketches for distinct count (Continued...)

Lecture 10 - Misra-Gries sketch

Lecture 11 - Frequent Element: Space Saving and Count Min

Lecture 12 - Frequent Element: Count Sketch

Lecture 13 - Near Neighbors

Lecture 14 - Locality Sensitive Hashing

Lecture 15 - Building LSH Tables

Lecture 16 - Approximate near neighbors search: Extensions, e.g. multi-probe, b-bit hashing, data-dependent variants

Lecture 17 - Approximate near neighbors search: Extensions, e.g. multi-probe, b-bit hashing, data-dependent variants (Continued...)

Lecture 18 - Approximate near neighbors search: Extensions, e.g. multi-probe, b-bit hashing, data-dependent variants (Continued...)

Lecture 19 - Randomized Numerical Linear Algebra: Random projection

Lecture 20 - Randomized Numerical Linear Algebra: Random projection (Continued...)

Lecture 21 - Randomized Numerical Linear Algebra: a) Matrix multiplication + QB decomposition

Lecture 22 - Randomized Numerical Linear Algebra: b) CUR + CX

Lecture 23 - Randomized Numerical Linear Algebra: a) L2 regression using RP

Lecture 24 - Randomized Numerical Linear Algebra: b) Leverage scores

Lecture 25 - Randomized Numerical Linear Algebra: c) Hash Kernels + Kitchen Sink

Lecture 26 - Map-reduce and Hadoop

Lecture 27 - Hadoop System

Lecture 28 - Hadoop System (Continued...)

Lecture 29 - Hadoop System (Continued...)

Lecture 30 - Spark

Lecture 31 - Spark (Continued...)

Lecture 32 - Spark (Continued...)

Lecture 33 - Distributed Machine Learning and Optimization: Introduction

Lecture 34 - SGD+Proof

Lecture 35 - SGD+Proof (Continued...)

Lecture 36 - Distributed Machine Learning and Optimization: ADMM + applications

Lecture 37 - Distributed Machine Learning and Optimization: ADMM + applications (Continued...)

Lecture 38 - Clustering

Lecture 39 - Clustering (Continued...)

Lecture 40 - Conclusion