**Eléments de la théorie de l’information**

(Information Theory for Data Science and Machine Learning)

**Course ID number:** 12x004

**Lecturer:**

Slava Voloshynovskiy, Department of Computer Science

**Teaching Assistants:**

Behrooz Razeghi (email: Behrooz.Razeghi@unige.ch) (Office Hours: Wednesday 2:30-4:00 PM)

Shideh Rezaeifar (email: Shideh.Rezaeifar@unige.ch) (Office Hours: Friday 2:30-4:00 PM)

**Language:** French/English

**Timetable**

**Lectures:** Friday 10:00-12:00, Bat A/404-407, starting from February 21, 2020

**Labwork (TP):** Wednesday 8:00-10:00, Bat D/Amphi, starting from February 26, 2020

**Summary**

Information theory is one of the foundations of modern Data Science and Machine Learning. It characterises different sources of information and provides optimal performance bounds for information transmission, compression, processing and generation. At the same time, it serves as a basis for analysing and understanding modern deep learning methods, including Variational Autoencoders, Generative Adversarial Networks and flow-based generative models.

**Content**

This class presents the basic concepts of Information Theory as applied to Data Science and Machine Learning. In this course, we aim to cover the following topics:

- Basic statistical data models
- Information theoretic measures
- Typicality and concentration measures
- Information theory in applications
- Data compression
- Data transmission
- Data processing (classification, regression, information bottleneck)
- Data generation

**Keywords:**

Information Theory, Data Science, Machine Learning, Data Compression, Data Transmission, Data Processing, Data Generation.

**Learning Outcomes**

By the end of the course, the student will be able to:

- Formulate and use the fundamental concepts of information theory, such as entropy, cross-entropy, Kullback-Leibler divergence (KLD) and mutual information
- Apply fundamental bounds and concepts from information theory in practice
- Work with transformations of random vectors under various linear and non-linear mappings, and characterise the effects of these transformations using information-theoretic measures
- Interpret various problems of information processing and analysis using information-theoretic tools
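
As an informal illustration of the measures named in the first outcome, here is a minimal Python sketch for discrete distributions. It is not taken from the course materials: the function names and the list-based representation of distributions are assumptions made for this example, and all quantities are measured in bits.

```python
import math

def entropy(p):
    """Shannon entropy H(p) in bits of a discrete distribution p."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """Cross-entropy H(p, q) in bits; assumes q[i] > 0 wherever p[i] > 0."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) = H(p, q) - H(p)."""
    return cross_entropy(p, q) - entropy(p)

def mutual_information(joint):
    """Mutual information I(X; Y) in bits from a joint table joint[x][y].

    Computed as the KLD between the joint distribution and the product
    of its marginals.
    """
    px = [sum(row) for row in joint]          # marginal of X
    py = [sum(col) for col in zip(*joint)]    # marginal of Y
    flat = [pxy for row in joint for pxy in row]
    product = [pxi * pyj for pxi in px for pyj in py]
    return kl_divergence(flat, product)

# A fair coin carries one bit of entropy.
print(entropy([0.5, 0.5]))
# Independent variables share no information; identical ones share H(X).
print(mutual_information([[0.25, 0.25], [0.25, 0.25]]))
print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))
```

Expressing mutual information through the KLD keeps the example short and mirrors the identity I(X; Y) = D(p(x, y) || p(x) p(y)).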

**Learning Pre-requisites**

**Required courses**

- Probability and Statistics
- Linear Algebra

**Important concepts to start the course**

Students should be familiar with linear algebra and probability.

**Textbook**

**Required:**

- T. Cover and J. Thomas, "**Elements of Information Theory**", John Wiley & Sons, 2006.

**Recommended:**

- D. MacKay, "**Information Theory, Inference, and Learning Algorithms**", Cambridge University Press, 2003.
- R. W. Yeung, "**First Course in Information Theory**", Kluwer Academic Publishers, 2001.
- R. D. Yates, D. J. Goodman, "**Probability and Stochastic Processes**", John Wiley & Sons, 2014.
- I. Csiszar, J. Koerner, "**Information Theory: Coding Theorems For Discrete Memoryless Systems**", Academic Press, 1981.
- R. W. Yeung, "**Information Theory and Network Coding**", Springer, 2008.

**Grading**

- Oral exam or written exams (consisting of two parts during the semester): 2/3
- Labwork: 1/3

**Important Dates**

**CC1:** Wednesday 8 April 2020.

**CC2:** Wednesday 20 May 2020.

**Problem Sets**

Problem sets will be due at the beginning of class on the due date stated below.

**February 26th (Wednesday):** Problem Set 1 out.

**March 4th (Wednesday):** Problem Set 1 due; Problem Set 2 out.

**March 11th (Wednesday):** Problem Set 2 due; Problem Set 3 out.

**March 18th (Wednesday):** Problem Set 3 due; Problem Set 4 out.

**March 25th (Wednesday):** Problem Set 4 due; Problem Set 5 out.

**April 1st (Wednesday):** Problem Set 5 due; Problem Set 6 out.

**April 8th (Wednesday):** CC1

**April 15th (Wednesday):** Holidays (Easter break)

**April 22nd (Wednesday):** Problem Set 6 due; Problem Set 7 out.

**April 29th (Wednesday):** Problem Set 7 due; Problem Set 8 out.

**May 6th (Wednesday):** Problem Set 8 due; Problem Set 9 out.

**May 13th (Wednesday):** Problem Set 9 due.

**May 20th (Wednesday):** CC2

**Assignments**

- There will be 9 graded Problem Sets. Each is due one week after its release.
- You will be working in Jupyter Notebooks or MATLAB (for programming problems) and in PDF or Jupyter Notebooks (for analytical problems).
- On weeks with new assignments, they will be released by Wednesday 3 PM.
- Homework assignments should be submitted via **moodle.ch**, and they will also be evaluated on that platform. Please do not submit solutions via email.
- Solutions to programming homework assignments will not be shared.
- Because of the large amount of material and the limited time, only selected problems will be solved during the TP session. If you have further questions, you may consult the Teaching Assistants during their office hours.
- Homework is due on Tuesdays. There are no late days. Late submissions **will not be accepted**.

**TP Video Lectures**

Videos of the extra explanations provided by Fokko Beekhof during the TP sessions (2011) are available online.