NPTEL: NOC: Deep Learning (Computer Science and Engineering)

Coordinator: Prof. Mitesh Khapra


Lecture 1 - Biological Neuron

Lecture 2 - From Spring to Winter of AI

Lecture 3 - The Deep Revival

Lecture 4 - From Cats to Convolutional Neural Networks

Lecture 5 - Faster, Higher, Stronger

Lecture 6 - The Curious Case of Sequences

Lecture 7 - Beating humans at their own games (literally)

Lecture 8 - The Madness (2013)

Lecture 9 - (Need for) Sanity

Lecture 10 - Motivation from Biological Neurons

Lecture 11 - McCulloch-Pitts Neuron, Thresholding Logic

Lecture 12 - Perceptrons

Lecture 13 - Error and Error Surfaces

Lecture 14 - Perceptron Learning Algorithm

Lecture 15 - Proof of Convergence of Perceptron Learning Algorithm

Lecture 16 - Linearly Separable Boolean Functions

Lecture 17 - Representation Power of a Network of Perceptrons

Lecture 18 - Sigmoid Neuron

Lecture 19 - A Typical Supervised Machine Learning Setup

Lecture 20 - Learning Parameters: (Infeasible) Guess Work

Lecture 21 - Learning Parameters: Gradient Descent

Lecture 22 - Representation Power of a Multilayer Network of Sigmoid Neurons

Lecture 23 - Feedforward Neural Networks (a.k.a. a multilayered network of neurons)

Lecture 24 - Learning Parameters of Feedforward Neural Networks (Intuition)

Lecture 25 - Output functions and Loss functions

Lecture 26 - Backpropagation (Intuition)

Lecture 27 - Backpropagation: Computing Gradients w.r.t. the Output Units

Lecture 28 - Backpropagation: Computing Gradients w.r.t. Hidden Units

Lecture 29 - Backpropagation: Computing Gradients w.r.t. Parameters

Lecture 30 - Backpropagation: Pseudo code

Lecture 31 - Derivative of the activation function

Lecture 32 - Information content, Entropy and cross entropy

Lecture 33 - Recap: Learning Parameters: Guess Work, Gradient Descent

Lecture 34 - Contour Maps

Lecture 35 - Momentum based Gradient Descent

Lecture 36 - Nesterov Accelerated Gradient Descent

Lecture 37 - Stochastic and Mini-Batch Gradient Descent

Lecture 38 - Tips for Adjusting Learning Rate and Momentum

Lecture 39 - Line Search

Lecture 40 - Gradient Descent with Adaptive Learning Rate

Lecture 41 - Bias Correction in Adam

Lecture 42 - Eigenvalues and Eigenvectors

Lecture 43 - Linear Algebra : Basic Definitions

Lecture 44 - Eigenvalue Decomposition

Lecture 45 - Principal Component Analysis and its Interpretations

Lecture 46 - PCA: Interpretation 2

Lecture 47 - PCA: Interpretation 3

Lecture 48 - PCA: Interpretation 3 (Continued...)

Lecture 49 - PCA: Practical Example

Lecture 50 - Singular Value Decomposition

Lecture 51 - Introduction to Autoencoders

Lecture 52 - Link between PCA and Autoencoders

Lecture 53 - Regularization in autoencoders (Motivation)

Lecture 54 - Denoising Autoencoders

Lecture 55 - Sparse Autoencoders

Lecture 56 - Contractive Autoencoders

Lecture 57 - Bias and Variance

Lecture 58 - Train error vs Test error

Lecture 59 - Train error vs Test error (Recap)

Lecture 60 - True error and Model complexity

Lecture 61 - L2 regularization

Lecture 62 - Dataset augmentation

Lecture 63 - Parameter sharing and tying

Lecture 64 - Adding Noise to the inputs

Lecture 65 - Adding Noise to the outputs

Lecture 66 - Early stopping

Lecture 67 - Ensemble Methods

Lecture 68 - Dropout

Lecture 69 - A quick recap of training deep neural networks

Lecture 70 - Unsupervised pre-training

Lecture 71 - Better activation functions

Lecture 72 - Better initialization strategies

Lecture 73 - Batch Normalization

Lecture 74 - One-hot representations of words

Lecture 75 - Distributed Representations of words

Lecture 76 - SVD for learning word representations

Lecture 77 - SVD for learning word representations (Continued...)

Lecture 78 - Continuous bag of words model

Lecture 79 - Skip-gram model

Lecture 80 - Skip-gram model (Continued...)

Lecture 81 - Contrastive estimation

Lecture 82 - Hierarchical softmax

Lecture 83 - GloVe representations

Lecture 84 - Evaluating word representations

Lecture 85 - Relation between SVD and Word2Vec

Lecture 86 - The convolution operation

Lecture 87 - Relation between input size, output size and filter size

Lecture 88 - Convolutional Neural Networks

Lecture 89 - Convolutional Neural Networks (Continued...)

Lecture 90 - CNNs (success stories on ImageNet)

Lecture 91 - CNNs (success stories on ImageNet) (Continued...)

Lecture 92 - Image Classification continued (GoogLeNet and ResNet)

Lecture 93 - Visualizing patches which maximally activate a neuron

Lecture 94 - Visualizing filters of a CNN

Lecture 95 - Occlusion experiments

Lecture 96 - Finding influence of input pixels using backpropagation

Lecture 97 - Guided Backpropagation

Lecture 98 - Optimization over images

Lecture 99 - Create images from embeddings

Lecture 100 - Deep Dream

Lecture 101 - Deep Art

Lecture 102 - Fooling Deep Convolutional Neural Networks

Lecture 103 - Sequence Learning Problems

Lecture 104 - Recurrent Neural Networks

Lecture 105 - Backpropagation through time

Lecture 106 - The problem of Exploding and Vanishing Gradients

Lecture 107 - Some Gory Details

Lecture 108 - Selective Read, Selective Write, Selective Forget - The Whiteboard Analogy

Lecture 109 - Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRUs)

Lecture 110 - How LSTMs avoid the problem of vanishing gradients

Lecture 111 - How LSTMs avoid the problem of vanishing gradients (Continued...)

Lecture 112 - Introduction to Encoder Decoder Models

Lecture 113 - Applications of Encoder Decoder models

Lecture 114 - Attention Mechanism

Lecture 115 - Attention Mechanism (Continued...)

Lecture 116 - Attention over images

Lecture 117 - Hierarchical Attention