Scientific Committee & Organizing Committee
Olga Klopp (ESSEC, CREST)
Mohamed Ndaoud (ESSEC)
Christophe Pouet (École Centrale de Marseille)
Alexander Rakhlin (MIT)
IMPORTANT WARNING: Scam / Phishing / SMiShing! Please note that ill-intentioned people may try to contact some participants by email or phone to obtain money and personal details by pretending to be members of the staff of our conference center (CIRM). CIRM and the organizers will NEVER contact you by phone on this matter and will NEVER ask you to pay in advance for accommodation, board, or any registration fee. Any payment due will be taken on site at CIRM during your stay.
We plan to dedicate the 2023–2025 series of conferences to challenges and emerging topics in mathematical statistics driven by the rise of artificial intelligence. Tremendous progress has been made in building powerful machine learning algorithms such as random forests, gradient boosting, and neural networks. These models are exceptionally complex and difficult to interpret, but they offer enormous opportunities in many areas of application, ranging from science and public policy to business. Such sophisticated algorithms are often called “black boxes” because they are very hard to analyze. The widespread use of these predictive algorithms raises extremely important questions of replicability, reliability, robustness, and privacy protection. The proposed series of conferences is dedicated to new statistical methods built around these black-box algorithms that leverage their power while at the same time guaranteeing their replicability and reliability.
The second conference of the program will highlight recent theoretical advances in inference for high-dimensional statistical models, drawing on the interplay of techniques from mathematical statistics, machine learning, and theoretical computer science.
The importance of high-dimensional statistics stems from the increasing dimensionality and complexity of the models needed to process and understand modern data. Meaningful inference about such models is possible under the assumption of a suitable lower-dimensional underlying structure, or of low-dimensional approximations whose error can be reasonably controlled. Examples of such structures include sparse high-dimensional regression, low-rank matrix models, dictionary learning, network models, latent variable models, and topic models, among others.
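As a small illustration of one of the structures listed above, sparse high-dimensional regression, the following sketch recovers a 3-sparse coefficient vector with the Lasso, solved by iterative soft-thresholding (ISTA), on synthetic data. All data and parameter choices here are hypothetical, for illustration only:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic high-dimensional regression: n < p, but only 3 nonzero coefficients.
n, p = 100, 300
X = rng.normal(size=(n, p)) / np.sqrt(n)   # columns roughly unit-norm
beta_true = np.zeros(p)
beta_true[[0, 1, 2]] = [5.0, -4.0, 3.0]
y = X @ beta_true + 0.1 * rng.normal(size=n)

# Lasso via ISTA: a gradient step on the least-squares loss,
# followed by soft-thresholding, which produces sparse iterates.
lam = 0.3
step = 1.0 / np.linalg.norm(X, 2) ** 2     # 1 / Lipschitz constant of the gradient
beta = np.zeros(p)
for _ in range(2000):
    grad = X.T @ (X @ beta - y)
    z = beta - step * grad
    beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)

# Estimated support: coordinates with non-negligible coefficients.
support = np.flatnonzero(np.abs(beta) > 0.5)
```

Despite having more parameters than observations, the sparsity assumption makes the true support recoverable, which is the kind of low-dimensional structure the paragraph above refers to.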
TUTORIALS
Johannes Schmidt-Hieber (University of Twente) : Statistical theory for biologically inspired learning
« Compared to artificial neural networks (ANNs), the brain seems to learn faster, generalize better to new situations and consume much less energy. ANNs are motivated by the functioning of the brain but differ in several crucial aspects. While ANNs are deterministic, biological neural networks (BNNs) are stochastic. Moreover, it is biologically implausible that the learning of the brain is based on gradient descent. In recent years, statistical theory for artificial neural networks has been developed. The idea now is to extend this to biological neural networks, as the future of AI is likely to draw even more inspiration from biology. In this lecture series we will survey the challenges and present some first statistical risk bounds for different biologically inspired learning rules. »
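As a concrete illustration of a biologically inspired local learning rule of the kind the tutorial contrasts with gradient descent, here is a minimal sketch of Oja's rule for a single linear neuron. This is a standard textbook example, not necessarily one of the rules covered in the lectures, and the data and parameters are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic inputs whose first coordinate has the largest variance.
X = rng.normal(size=(500, 3)) @ np.diag([3.0, 1.0, 0.5])

# A single linear neuron trained with Oja's rule: each weight update
# uses only the local input x and output y, not a global gradient.
w = rng.normal(size=3)
eta = 0.005
for _ in range(5):                       # a few passes over the data
    for x in X:
        y = w @ x
        w += eta * y * (x - y * w)       # Hebbian term y*x plus a decay that normalizes w

# Oja's rule drives w toward the top principal direction, with unit norm.
```

The update is local and online, which is the sense in which such rules are more biologically plausible than backpropagation of a global gradient.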
Rina Foygel Barber (University of Chicago) : An introduction to conformal prediction and distribution-free inference
« This two-part tutorial will introduce the framework of conformal prediction, and will provide an overview of both theoretical foundations and practical methodologies in this field. In the first part of the tutorial, we will cover methods including holdout set methods, full conformal prediction, cross-validation based methods, calibration procedures, and more, with emphasis on how these methods can be adapted to a range of settings to achieve robust uncertainty quantification without compromising on accuracy. In the second part, we will cover some recent extensions that allow the methodology to be applied in broader settings, such as weighted conformal prediction, localized methods, online conformal prediction, and outlier detection. »
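As a minimal illustration of the holdout-set (split conformal) method mentioned in the first part of the tutorial, here is a sketch on synthetic data. The predictor, data, and coverage level are hypothetical choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = 2x + noise.
n = 200
x = rng.uniform(0, 1, n)
y = 2 * x + rng.normal(0, 0.1, n)

# Split into a training half and a calibration half.
x_tr, y_tr = x[:100], y[:100]
x_cal, y_cal = x[100:], y[100:]

# Fit any predictor on the training half; here, simple least squares.
slope, intercept = np.polyfit(x_tr, y_tr, 1)
predict = lambda t: slope * t + intercept

# Conformity scores on the calibration half: absolute residuals.
scores = np.abs(y_cal - predict(x_cal))

# For coverage level 1 - alpha, take the ceil((n_cal + 1)(1 - alpha))-th
# smallest score as the interval half-width.
alpha = 0.1
n_cal = len(scores)
k = int(np.ceil((n_cal + 1) * (1 - alpha)))
qhat = np.sort(scores)[k - 1]

# Distribution-free prediction interval for a new point.
x_new = 0.5
interval = (predict(x_new) - qhat, predict(x_new) + qhat)
```

Under exchangeability of the calibration and test points, this interval covers the true response with probability at least 1 - alpha, regardless of how well the fitted model matches the data.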
SPEAKERS
Fairness :
Christophe Giraud (Université Paris-Saclay)
Solenne Gaucher (ENSAE – CREST)
Jean-Michel Loubes (Université de Toulouse III)
Nicolas Schreuder (CNRS, Université Gustave Eiffel)
Generative modeling :
Arnak Dalalyan (ENSAE – CREST)
Andrej Risteski (Carnegie Mellon University)
High dimensional covariance estimation :
Zhao Ren (University of Pittsburgh)
Angelika Rohde (University of Freiburg)
High dimensional testing :
Subhodh Kotekal (University of Chicago)
Yuhao Wang (Tsinghua University, Shanghai Qi Zhi Institute)
Learning theory :
Jaouad Mourtada (ENSAE-CREST)
Daniil Tiapkin (École Polytechnique)
Privacy :
Tom Berrett (University of Warwick)
Vianney Perchet (ENSAE-CREST)
Robustness :
Marco Avella Medina (Columbia University)
Chao Gao (University of Chicago)
Alexey Kroshnin (Weierstrass Institute)
Guillaume Lecué (ENSAE)
Arshak Minasyan (ENSAE-CREST)
Stanislav Minsker (University of Southern California)
Round table on fairness and robustness in AI :
Boris Noumedem Djieuzem (BNP PARIBAS)
Omar Alonso Doria Arrieta (BNP PARIBAS)
Jean-Michel Loubes (Université de Toulouse III)
SDE diffusions :
Marc Hoffmann (Université Paris-Dauphine)
Mark Podolskij (University of Luxembourg)
Markus Reiß (Humboldt University of Berlin)
Uncertainty quantification :
Eduard Belitser (Free University of Amsterdam)
Pierre Bellec (Rutgers University)
Maxim Panov (Mohamed bin Zayed University of Artificial Intelligence)