Introduction to Deep Learning & Applications
This course covers the fundamentals of deep learning and the basics of deep neural networks: major network architectures (e.g., ConvNets, RNNs), optimization algorithms for training them, and applications to computer vision, robotics, and sequence modeling.
- Introduction
- Lecture 1: Image Classification Methods
  - Nearest neighbor
  - Linear classifiers
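As a concrete reference for the first lecture's topics, here is a minimal NumPy sketch (an illustration, not course code) of a nearest-neighbor classifier: each test point takes the label of its closest training point under the L2 distance.

```python
import numpy as np

def nearest_neighbor_predict(X_train, y_train, X_test):
    """Label each test point with the label of its nearest
    training point under the L2 (Euclidean) distance."""
    preds = []
    for x in X_test:
        dists = np.linalg.norm(X_train - x, axis=1)  # distance to every training point
        preds.append(y_train[np.argmin(dists)])      # label of the closest one
    return np.array(preds)
```

The same idea extends to k-nearest neighbors by taking a majority vote over the k smallest distances.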
- Tutorial: Jupyter Notebooks
- Lecture 2: Image Classification: Linear Classifiers and Optimization
  - More linear classifiers
  - Optimization
  - Regularization
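The optimization and regularization topics in this lecture come together in a single weight update. A minimal sketch (for illustration only) of one SGD step with L2 regularization, i.e. weight decay:

```python
import numpy as np

def sgd_step(W, grad, lr=1e-2, weight_decay=1e-4):
    """One SGD update with L2 regularization:
    W <- W - lr * (dL/dW + weight_decay * W)."""
    return W - lr * (grad + weight_decay * W)
```

Training a linear classifier is just this update applied repeatedly, with `grad` computed from the classification loss on a minibatch.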
- Lecture 3: Multi-Layer Perceptrons and Back-Propagation
  - Multi-layer neural networks
  - Training neural networks with back-propagation
  - Softmax
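For the softmax topic, a minimal NumPy sketch (illustrative, not course code): subtracting the maximum score before exponentiating leaves the result unchanged but keeps `exp()` from overflowing on large scores.

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax: shift scores by their max
    before exponentiating, then normalize to probabilities."""
    shifted = scores - np.max(scores, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)
```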
- Lecture 4: Convolutional Neural Networks (CNNs)
- Lecture 5: Different Elements in Training CNNs (1/2)
  - Optimization methods
  - Learning rate
  - Review of activation functions
  - Gradients from softmax loss
- Lecture 6: Different Elements in Training CNNs (2/2)
  - Data augmentation and pre-processing
  - Weight initialization
  - Batch normalization
  - Regularization in training deep networks
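Of the training elements listed above, batch normalization is the easiest to pin down in a few lines. A minimal sketch of the training-mode forward pass (for illustration; real layers also track running statistics for inference):

```python
import numpy as np

def batchnorm_forward(x, gamma, beta, eps=1e-5):
    """Batch normalization, training mode: normalize each feature
    over the batch, then scale and shift with learned gamma/beta."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)  # zero mean, unit variance per feature
    return gamma * x_hat + beta
```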
- Tutorial: PyTorch
- Lecture 7: Convolutional Neural Network Architectures
  - Fine-tuning CNNs
  - Developments and insights in CNN architectures
- Lecture 8: Semantic Segmentation
  - The naïve fully convolutional network (FCN) model for image segmentation
  - Transpose convolution
  - Advanced segmentation techniques
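Transpose convolution, the upsampling operation used in segmentation decoders, can be pictured as each input value scattering a scaled copy of the kernel into the output. A 1-D sketch (for intuition; frameworks implement the 2-D, batched version):

```python
import numpy as np

def transpose_conv1d(x, w, stride=2):
    """1-D transpose convolution: each input value scatters a
    scaled copy of the kernel into the (upsampled) output."""
    k = len(w)
    out = np.zeros(stride * (len(x) - 1) + k)
    for i, v in enumerate(x):
        out[i * stride : i * stride + k] += v * w  # overlapping copies add up
    return out
```

Note the overlapping additions in the middle of the output; these overlaps are what produce the checkerboard artifacts often discussed alongside this operation.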
- Lecture 9: Visualizing Deep Networks
  - Saliency maps
  - Maximizing activation
  - Quantifying unit interpretability
- Lecture 10: Object Detection 1: Box
  - Background and classical object detection
  - 2-stage object detection
  - Feature pyramid networks
- Lecture 11: Object Detection 2: Mask and Pose
- Lecture 12: Recurrent Neural Networks (RNNs)
  - The basic RNN
  - Long short-term memory (LSTM)
  - Applications to language and vision tasks
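The basic RNN covered here reduces to one recurrence. A minimal sketch of a single vanilla RNN step (illustrative; the weight names are not from the course materials):

```python
import numpy as np

def rnn_step(x, h_prev, Wxh, Whh, b):
    """One vanilla RNN step: the new hidden state mixes the current
    input with the previous hidden state through a tanh."""
    return np.tanh(x @ Wxh + h_prev @ Whh + b)
```

Unrolling this step over a sequence, with the same weights at every step, gives the full RNN; LSTMs replace the single tanh with gated updates to ease gradient flow over long sequences.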
- Lecture 13: Video Understanding
  - 2-stream networks for action recognition
  - Temporal and 3D convolution
- Lecture 14: Video Prediction
  - Interaction networks for physical prediction
  - Prediction in space and time
- Lecture 15: Self-Attention and Transformers
  - Non-local neural networks for videos
  - Self-attention and transformers for natural language processing (NLP)
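The self-attention at the heart of transformers is compact enough to sketch directly. A single-head, unbatched version (for illustration; real implementations add masking, multiple heads, and batching): each position attends to all positions with weights softmax(QKᵀ/√d_k).

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention for one sequence X
    of shape (seq_len, d_model), single head, no masking."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # pairwise attention logits
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # rows sum to 1
    return weights @ V
```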
- Lecture 16: Conditional Generative Adversarial Networks
  - Image-to-image translation: pix2pix
  - Unpaired image-to-image translation: CycleGAN
  - Other applications of adversarial learning