Class XY

Recent developments in neural networks (aka “deep learning”) have drastically advanced the performance of machine perception systems in a variety of areas including drones, self-driving cars and intelligent UIs. This course is a deep dive into the details of deep learning algorithms and architectures for a variety of perceptual tasks.


eDoz Course Nr.
O. Hilliges, S. Tang
A. Spurr, S. Christen, Z. Fan, M. Kaufmann, M. Mihajlovic, S. Wang, Y. Zhang
Wed 13 – 14, live streamed via Zoom. Zoom Link Lecture
Thu 12 – 14, live streamed via Zoom.
Thu 14 – 16, live streamed via Zoom. Zoom Link Exercise
Fri 14 – 16, live streamed via Zoom.
The lecture recordings will be made available.
Please post all questions regarding content, organisation, etc. on Piazza. Please sign up to the forum using this link. We monitor the forum closely. For organisational reasons, we will not be able to respond to direct e-mails.


The first lecture will take place on Thursday, 25th of February. The Wednesday lecture starts in the second week of the semester.
Both the lecture and the exercise will be held virtually via Zoom for the entirety of the semester. The sessions will be recorded and made available for offline viewing.
Please sign up to Piazza.

Learning Objectives

Students will learn about fundamental aspects of modern deep learning approaches for perception. Students will learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in learning-based computer vision, robotics and HCI. The final project assignment will involve training a complex neural network architecture and applying it to a real-world dataset.

The core competency acquired through this course is a solid foundation in deep-learning algorithms to process and interpret human input into computing systems. In particular, students should be able to develop systems that, among other tasks, recognize people in images, detect and describe body parts, infer their spatial configuration, and perform action/gesture recognition from still images or image sequences, including multi-modal data.

Wk Date Content Material Exercise
1 25.02 Deep Learning Introduction
Class content & admin, Feedforward Networks, Representation Learning
Material Exercise