Week 1: Statistical learning

February 1 – 5, 2016
This week will be devoted to statistical learning from both theoretical and applied perspectives. Statistical learning theory was developed in the 1970s and brought about a major revival in statistics. On the one hand, advances in computer science and computing tools made possible massive data collection and the implementation of powerful algorithms, which are often demanding in memory and computation time. On the other hand, the classical asymptotic theory used to prove the efficiency of estimation methods in modeling and prediction was limited by dimensionality problems.
         The approaches developed in statistical learning have helped to address new challenges such as the curse of dimensionality, small sample sizes, and now massive datasets. Data mining, feature selection, and more recently big data have emerged as specialized approaches to modeling and analyzing massive datasets.
         In addition to theoretical developments (non-asymptotic theory), many algorithms have emerged in statistics and computer science to meet these new needs, which arose in many areas such as bioinformatics, social sciences, medicine, and telecommunications.
         This week aims to bring together specialists in statistical learning who work on advanced techniques and come from different fields, mainly statistics, but also computer science, social science, and bioinformatics.
The main topics of interest include:

  • Supervised learning
  • Unsupervised learning: clustering and density estimation
  • Model selection
  • Big data

Scientific Committee

Gérard Biau (Université Pierre-et-Marie-Curie)
Pascal Massart (Université Paris-Sud)
Vincent Rivoirard (Université Paris-Dauphine)

Organizing Committee

Badih Ghattas (Aix-Marseille Université)
Liva Ralaivola (Aix-Marseille Université)


Large-scale machine learning and convex optimization

Learning and massive data

Is adaptive early stopping possible in statistical inverse problems?

Entropy, geometry, and a CLT for Wishart matrices

The power of heterogeneous large-scale data for high-dimensional causal inference

Mixed integer programming for sparse and non convex machine learning

A Lagrangian viewpoint on Robust PCA

Block-diagonal covariance selection for high-dimensional Gaussian graphical models

Statistical learning with Hawkes processes and new matrix concentration inequalities

Random forests variable importances: towards a better understanding and large-scale feature selection

About the Goldenshluger-Lepski methodology for bandwidth selection

Sub-Gaussian mean estimators

Simplicial reconstruction of manifolds via tangent plane estimation

Quantization, Learning and Games with OPAC

Understanding (or not) Deep Neural Networks

Eigenvalue-free risk bounds for PCA projectors

Estimation of local independence graphs via Hawkes processes and link with the functional neuronal connectivity

A novel multi resolution framework for the statistical analysis of ranking data

Robust sequential learning with applications to the forecasting of electricity consumption and of exchange rates

Oracle inequalities for network models and sparse graphon estimation