3676 - conferences.cirm-math.fr

MULTIYEAR PROGRAM
CONFERENCE

Meeting in Mathematical Statistics
Rencontre de Statistiques Mathématiques

Mathematical statistics in the age of generative AI

14 – 18 December 2026

Scientific Committee

Alexandra Carpentier (University of Potsdam)
Vladimir Koltchinskii (Georgia Tech)
Alexander Goldenshluger (University of Haifa)
Yury Polyanskiy (Massachusetts Institute of Technology)
Markus Reiß (Humboldt University of Berlin)

Organizing Committee

Pierre Bellec (Rutgers University)
Natalia Bochkina (University of Edinburgh)
Victor-Emmanuel Brunel (ENSAE CREST)
Christophe Pouet (Ecole Centrale Méditerranée)

IMPORTANT WARNING: Scam / Phishing / SMiShing! Ill-intentioned people may try to contact participants by email or phone to obtain money and personal details by pretending to be staff of our conference center (CIRM). CIRM and the organizers will NEVER contact you by phone on this issue and will NEVER ask you to pay in advance for accommodation, board, or a possible registration fee. Any payment due will be taken on site at CIRM during your stay.

We plan to dedicate the 2026–2028 series of conferences to challenges and emerging topics in mathematical statistics driven by the advent of artificial intelligence.

The first conference of the cycle is dedicated to the statistics of generative artificial intelligence. Generative modeling, the automatic generation of examples such as texts, images, music, and molecules that are similar to those in a given dataset, is a central task in artificial intelligence, especially with the recent rise of large language models. Mathematically, this task is framed as the problem of sampling from an unknown distribution that is accessible only through a limited set of examples drawn from it. The size and quality of this set of available samples can vary greatly depending on the application. The algorithms that have propelled generative modeling to fame are known for requiring vast amounts of data and computational resources to achieve state-of-the-art performance.

The goal of this first conference is to investigate the mathematical properties of generative modeling algorithms in order to better understand their strengths and weaknesses, enhance their efficiency, and design new methods. The mathematical challenge in generative modeling lies in successfully integrating techniques from various areas of mathematical statistics and probability theory, including dimension reduction, nonparametric estimation, manifold learning, sampling, optimal transport, and stochastic calculus. Investigating the mathematical properties of this pipeline requires a deep analysis of these methods and of how they interact to solve the overarching problem. Such analysis is key to exploring multiple facets of generative modeling algorithms, including precision, robustness, creativity, and computational tractability.

Our meeting intends to provide a platform for exchanges between the computer science and statistics communities on the topic of generative AI.
