Eléments de la théorie de l’information

(Information Theory for Data Science and Machine Learning)


Course ID number: 12x004

Lecturer: 

Slava Voloshynovskiy, Department of Computer Science

Teaching Assistants:

Behrooz Razeghi (email: Behrooz.Razeghi@unige.ch) (Office Hours: Wednesday 2:30-4:00 PM)

Shideh Rezaeifar (email: Shideh.Rezaeifar@unige.ch) (Office Hours: Friday 2:30-4:00 PM)

Language: French/English

Timetable

Lectures: Friday 10:00-12:00, Bat A/404-407, starting from February 21, 2020
Labwork (TP): Wednesday 8:00-10:00, Bat D/Amphi, starting from February 26, 2020

Summary

Information theory is one of the foundations of modern data science and machine learning. It characterises different sources of information and provides performance bounds for information transmission, compression, processing and generation. It also serves as a basis for analysing and understanding modern deep learning methods, including Variational Autoencoders, Generative Adversarial Networks and flow-based models.

Content

This class presents the basic concepts of information theory as applied to data science and machine learning. The course aims to cover the following topics:

  1. Basic statistical data models
  2. Information theoretic measures
  3. Typicality and concentration measures
  4. Information theory in applications
  5. Data compression
  6. Data transmission
  7. Data processing (classification, regression, information bottleneck)
  8. Data generation

Keywords:

Information Theory, Data Science, Machine Learning, Data Compression, Data Transmission, Data Processing, Data Generation.

Learning Outcomes

By the end of the course, the student will be able to:

  • Formulate and use the fundamental concepts of information theory such as entropy, cross-entropy, Kullback-Leibler divergence (KLD) and mutual information
  • Apply fundamental bounds and concepts from information theory in practice
  • Work with transformations of random vectors under various linear and non-linear mappings and characterise the effects of these transformations using information-theoretic measures
  • Interpret various problems of information processing and analysis using information theoretic tools.
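
As an illustration of the measures named in the first outcome, here is a minimal sketch for discrete distributions, with all quantities in bits. The function names and the example distributions are illustrative assumptions and are not taken from the course materials.

```python
import math

def entropy(p):
    """Shannon entropy H(p) in bits of a discrete distribution p."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """Cross-entropy H(p, q) in bits; assumes q[i] > 0 wherever p[i] > 0."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) = H(p, q) - H(p), in bits."""
    return cross_entropy(p, q) - entropy(p)

def mutual_information(joint):
    """Mutual information I(X; Y) in bits from a joint table joint[x][y]."""
    px = [sum(row) for row in joint]          # marginal of X (row sums)
    py = [sum(col) for col in zip(*joint)]    # marginal of Y (column sums)
    return sum(
        pxy * math.log2(pxy / (px[x] * py[y]))
        for x, row in enumerate(joint)
        for y, pxy in enumerate(row)
        if pxy > 0
    )

# Hypothetical example distributions.
p = [0.5, 0.5]                      # fair coin
q = [0.25, 0.75]                    # biased model of the same coin
joint = [[0.5, 0.0], [0.0, 0.5]]    # X and Y perfectly correlated

print(entropy(p))                   # entropy of the fair coin
print(cross_entropy(p, q))          # cost of coding p with a code built for q
print(kl_divergence(p, q))          # extra bits paid for the mismatch
print(mutual_information(joint))    # information shared by X and Y
```

The sketch writes the KLD as the difference between cross-entropy and entropy, which makes the relation between the three quantities explicit. Mutual information is computed as the KLD between the joint distribution and the product of its marginals.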

Learning Pre-requisites

Required courses
  • Probability and Statistics
  • Linear Algebra
Important concepts to start the course

Students should be familiar with linear algebra and probability.

Textbook

Required:

Recommended:

Grading

The final grade is composed of:
  • Oral exam or written exams (in two parts during the semester): 2/3
  • Labworks: 1/3

Important Dates

CC1: Wednesday 8 April 2020.

CC2: Wednesday 20 May 2020.

Problem Sets

Problem sets will be due at the beginning of class on the due date stated below.

February 26th (Wednesday): Problem Set 1 out.

March 4th (Wednesday): Problem Set 1 due; Problem Set 2 out.

March 11th (Wednesday): Problem Set 2 due; Problem Set 3 out.

March 18th (Wednesday): Problem Set 3 due; Problem Set 4 out.

March 25th (Wednesday): Problem Set 4 due; Problem Set 5 out.

April 1st (Wednesday): Problem Set 5 due; Problem Set 6 out.

April 8th (Wednesday): CC1

April 15th (Wednesday): Holidays (Easter break)

April 22nd (Wednesday): Problem Set 6 due; Problem Set 7 out.

April 29th (Wednesday): Problem Set 7 due; Problem Set 8 out.

May 6th (Wednesday): Problem Set 8 due; Problem Set 9 out.

May 13th (Wednesday): Problem Set 9 due.

May 20th (Wednesday): CC2


Assignments

  • There will be 9 graded Problem Sets. Each is due one week after its release.
  • You will be working in Jupyter Notebooks or MATLAB (for programming problems) and in PDF or Jupyter Notebooks (for analytical problems). 
  • In weeks with new assignments, they will be released by Wednesday 3 PM.
  • Homework assignments should be submitted via moodle.ch, where they will also be evaluated. Please do not submit solutions via email.
  • Solutions to programming homework assignments will not be shared.
  • Because of the large amount of material and the limited time, only selected problems will be solved during the TP session. If you have further questions, you may consult the Teaching Assistants during their Office Hours.
  • Homework is due on Tuesdays. There are no late days. Late submissions will not be accepted.

TP Video Lectures

Videos of the extra explanations provided by Fokko Beekhof during TP Sessions (2011) are available online.