As part of PSL's special “transverse program”, we are organizing, with
the help of PSL, a one-week course on “Machine Learning in Genomics”.
Priority is given to Master 2 students of PSL. The course is also open
to other students (Master and PhD) and researchers, subject to
availability.
Important: Master students should check with their master’s
administration whether this course can be used to validate one of
their master’s courses.
Dates and location
Dates: 4th - 8th April 2022
Location: Parisanté Campus, 10 Rue d’Oradour-sur-Glane, 92130 Issy-les-Moulineaux
Pre-register for the Course
Pre-registration is free but mandatory.
PSL students have priority if they pre-register before January 31st.
Course Content
This week-long course will be split into three types of classes:
theory, practice and project. During the project sessions, students
will work in small groups on a case study of practical importance.
The final examination of the course will be a short presentation of
the projects.
Schedule
Day 1: 4/04, Morning:
- Biologists: 9h-12h Andrei Zinovyev, introduction to machine learning
- Math/Machine Learning students: 9h-12h Hugues Roest Crollius, introduction to genomics
Day 1: 4/04, Afternoon: Flora Jay, Jean Cury (population genetics)
- 14h-15h Introduction to population genetics problems and datasets, with a focus on selection or demographic inference; presentation of inference methods based on summary statistics versus SNP data (ABC, NN).
- 15h-17h hands-on session
- 17h-18h Presentation of the project for final evaluation and creation of groups
Day 2: 5/04, Morning: Camille and Franklin (disease variant identification)
- 9h-10h: Introduction to genomic medicine and population sequencing; challenges of functional variant identification for human health; introduction to decision trees, random forests and neural networks, and their application to identifying genetic disease variants; validation methods
- 10h-12h: hands-on session using random forests and NNs to train classifiers and identify candidate genetic variants involved in human diseases
Day 2: 5/04, Afternoon: Laura and Anais (multi-omics integration: dimensionality reduction)
- 14h-15h Introduction to multi-omics integration in biology, multi-omics dimensionality reduction (special focus on matrix factorisation, brief overview of autoencoders) and the main existing tools.
- 15h-17h hands-on session using MOFA to integrate multi-omics data
- 17h-18h Groups working on project for final evaluation
Day 3: 6/04, Morning: Laura and Anais (multi-omics integration: Networks)
- 9h-10h Introduction to network science, main networks in biology (inferred and measured networks), classical measures and algorithms, RWR, MOGAMUN
- 10h-12h hands-on session: multiplex topology, active modules, MOGAMUN and visualisation with Cytoscape.
Day 3: 6/04, Afternoon:
- 14h-18h Q&A on the projects and general questions
Day 4: 7/04, Morning: Chloé (feature selection, GWAS)
- 9h-10h Genome-Wide Association Studies, multiple hypothesis testing, lasso
- 10h-12h Hands-on session part 1: GWAS of two Arabidopsis thaliana phenotypes
Day 4: 7/04, Afternoon: Chloé
- 14h-15h Other regularizers (elastic net, graph-based regularization, multitask approaches)
- 15h-17h Hands-on session part 2: GWAS of two Arabidopsis thaliana phenotypes
- 17h-18h Groups working on project for final evaluation
Day 5: 8/04, Morning: Paul (image analysis, combining images with other omics)
- 9h-10h: Introduction to imaging for the study of gene expression: microscopes, in situ hybridization, gene reporters… Methods for extracting information from large image datasets, coupling single cell RNASeq with microscopy. Data integration as 1) a semi-supervised learning problem, 2) an optimal transport problem, 3) domain translation with autoencoders
- 10h-12h data integration from multiple heterogeneous datasets or inference of spatio-temporal dynamics from single cell RNASeq
Day 5: 8/04, Afternoon: Paul, with a talk by Thomas Walter
- 14h-15h Talk by Thomas Walter
- 15h-17h data integration from multiple heterogeneous datasets or inference of spatio-temporal dynamics from single cell RNASeq
- 17h-18h Groups working on project for final evaluation
Teachers