**MULTIYEAR PROGRAM**

**VIRTUAL CONFERENCE**

**Meeting in Mathematical Statistics**

*Rencontres de Statistique Mathématique*

**Robustness and computational efficiency of algorithms in statistical learning**

**14 - 18 December 2020**

**Scientific Committee & Organizing Committee**

Cristina Butucea (CREST, ENSAE, Université Paris-Est Marne-la-Vallée)
Stanislav Minsker (University of Southern California)
Christophe Pouet (Ecole Centrale de Marseille)
Vladimir Spokoiny (Humboldt University of Berlin)

**Description**

Noisy and corrupted measurements, commonly encountered as a result of unsupervised, automated data collection processes, require new algorithms that produce reliable outcomes in this challenging setting. Discussing the influence of outliers on statistical procedures, P. Huber observed that "...the naturally occurring deviations from the idealized model are large enough to render meaningless the traditional asymptotic optimality theory." Heuristic outlier-removal procedures are known to introduce bias and to lack rigorous justification (or to require strong assumptions), as it is sometimes impossible to determine whether an extreme observation is due to an error or is a feature of the data-generating mechanism. This motivates the study of robust estimators in the context of statistical learning theory.

Our goal is to encourage collaboration and knowledge sharing between theoretical computer scientists and mathematical statisticians by bringing them together at Luminy. Both communities possess unique visions, skills, and expertise; we hope that merging these strengths will advance the field and bring us closer to solving some of its key challenges.

**Lectures**

abstract_saravandegeer.pdf

*Statistical Optimality and Algorithms for Top-K and Total Ranking*

Abstract: In this short course, we will discuss the problem of ranking with partially observed pairwise comparisons in the setting of the Bradley-Terry-Luce (BTL) model. There are two fundamental problems: 1) top-K ranking, which is to select the set of K players with top performances; 2) total ranking, which is to rank the entire set of players. Both ranking problems find important applications in web search, recommender systems, and sports competition.
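Under the BTL model, player i beats player j with probability w_i / (w_i + w_j), where the w's are latent skill parameters, and only a subset of pairs is observed. A minimal simulation sketch, assuming illustrative skill values, an observation probability, and a per-pair comparison budget (none of these are taken from the course):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5                                     # number of players (illustrative)
w = np.array([5.0, 3.0, 2.0, 1.5, 1.0])  # latent BTL skills (assumed values)

def simulate_game(i, j):
    # Under BTL, player i beats player j with probability w[i] / (w[i] + w[j]).
    return rng.random() < w[i] / (w[i] + w[j])

# Each pair is observed with probability p; observed pairs play L games.
p, L = 0.8, 50
wins = np.zeros((n, n))       # wins[i, j] = number of times i beat j
for i in range(n):
    for j in range(i + 1, n):
        if rng.random() < p:  # partial observation: pair (i, j) is compared
            for _ in range(L):
                if simulate_game(i, j):
                    wins[i, j] += 1
                else:
                    wins[j, i] += 1
```

Any ranking procedure (MLE, spectral, or otherwise) then works only with the `wins` matrix, never with the latent `w`.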

In the first presentation, we will consider the top-K ranking problem. The statistical properties of two popular algorithms, the MLE and rank centrality (spectral ranking), will be precisely characterized. In terms of both partial and exact recovery, the MLE achieves optimality with matching lower bounds. The spectral method is shown to be generally sub-optimal, though it has the same order of sample complexity as the MLE. Our theory also reveals essentially the only situation in which the spectral method is optimal: the most favorable choice of skill parameters given the separation of the two groups.
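The rank centrality method scores players via the stationary distribution of a Markov chain whose transitions move toward players with better empirical win fractions. A minimal sketch on a toy win-fraction matrix (the matrix entries are illustrative assumptions, not data from the lecture):

```python
import numpy as np

# Toy win-fraction matrix on 4 players: frac[i, j] = fraction of the games
# between i and j that j won (illustrative values; frac[i,j] + frac[j,i] = 1).
frac = np.array([
    [0.0, 0.3, 0.2, 0.1],
    [0.7, 0.0, 0.4, 0.2],
    [0.8, 0.6, 0.0, 0.3],
    [0.9, 0.8, 0.7, 0.0],
])
n = frac.shape[0]
d_max = n - 1  # here every player is compared with every other player

# Rank-centrality transition matrix: from player i, move to player j with
# probability proportional to the fraction of games j won against i.
P = frac / d_max
np.fill_diagonal(P, 1.0 - P.sum(axis=1))

# The stationary distribution (left eigenvector of P for eigenvalue 1)
# serves as the vector of skill-score estimates, up to normalization.
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmax(np.real(vals))])
pi = pi / pi.sum()

ranking = np.argsort(-pi)  # estimated ranking, best player first
```

With these toy win fractions, player 0 wins most of its games, so the chain spends the most time at state 0 and the stationary weights decrease along the true ordering.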

The second presentation will focus on total ranking. The problem is to find a permutation vector that ranks the entire set of players. We will show that the minimax rate of the problem with respect to the Kendall's tau loss exhibits a transition between an exponential rate and a polynomial rate, depending on the signal-to-noise ratio of the problem. The optimal algorithm consists of two stages. In the first stage, games with very high or low scores are used to partition the entire set of players into different leagues. In the second stage, games that are very close are used to rank the players within each league. We will give intuition and some analysis to show why the algorithm works optimally.
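The Kendall's tau loss mentioned above counts the pairs of players that two permutations order differently. A minimal sketch of the (unnormalized) distance, with hypothetical function and variable names:

```python
def kendall_tau_distance(perm_a, perm_b):
    """Number of pairs ordered differently by the two permutations."""
    # position of each item under each permutation
    pos_a = {item: r for r, item in enumerate(perm_a)}
    pos_b = {item: r for r, item in enumerate(perm_b)}
    items = list(perm_a)
    count = 0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            x, y = items[i], items[j]
            # discordant pair: (x, y) is ordered oppositely by the two rankings
            if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0:
                count += 1
    return count

# e.g. kendall_tau_distance([1, 2, 3], [3, 2, 1]) returns 3 (all pairs flipped)
```

Dividing by the number of pairs n(n-1)/2 gives the normalized version; which normalization the minimax results use is not specified in the abstract.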

**Talks**

Felix Abramovich (Tel Aviv University)

*High-dimensional classification by sparse logistic regression*

Pierre Bellec (Rutgers University)

*Out-of-sample error estimate for robust M-estimators with convex penalty*

Clément Berenfeld (Université Paris-Dauphine)

*Density estimation on manifolds*

Thomas Berrett (University of Warwick)

*Locally private non-asymptotic testing of distributions is faster using interactive mechanisms*

Natalia Bochkina (University of Edinburgh)

*Bernstein-von Mises theorem for the scale hyperparameter in inverse problems with a Gaussian prior*

Alexandra Carpentier (OvGU Magdeburg)

*Several structured thresholding bandit problems*

Julien Chhor (CREST-ENSAE)

*Goodness-of-fit testing for multinomials and densities: sharp local minimax rates*

Fabienne Comte (Université Paris Descartes)

*Nonparametric estimation for i.i.d. Gaussian continuous time moving average models*

Alexis Derumigny (University of Twente)

*On lower bounds for the bias-variance trade-off*

Motonobu Kanagawa (EURECOM)

*On the connections and equivalences between Gaussian processes and kernel methods in nonparametric regression*

Avetik Karagulyan (ENSAE/CREST)

*Penalized Langevin dynamics with vanishing penalty for smooth and log-concave targets*

Clément Marteau (Université Lyon 1)

*SuperMix: Sparse regularization for mixtures*

Mohamed Simo Ndaoud (University of Southern California)

*Robust and efficient mean estimation: approach based on the properties of self-normalized sums*

Tuan-Binh Nguyen (Université Paris-Sud, Paris-Saclay)

*Aggregation of Multiple Knockoffs*

Marianna Pensky (University of Central Florida)

*Statistical Inference in Popularity Adjusted Stochastic Block Model*

Vianney Perchet (ENSAE & Criteo AI Lab)

*Robustness of Community Detection to Random Geometric Perturbations*

Philippe Rigollet (MIT)

*Minimax Coreset Density Estimation*

Vincent Rivoirard (Université Paris-Dauphine)

*Nonparametric Bayesian inference for Hawkes processes*

Angelika Rohde (Albert-Ludwigs-Universität Freiburg)

*Interactive versus non-interactive locally differentially private estimation: Two elbows for the quadratic functional*

Richard Samworth (University of Cambridge)

*Adaptive transfer learning*

Johannes Schmidt-Hieber (University of Twente)

*The Kolmogorov-Arnold theorem revisited*

Suzanne Sigalla (CREST-ENSAE)

*Improved clustering algorithms for the Bipartite Stochastic Block Model*

Zoltan Szabo (Ecole Polytechnique)

*Kernel Machines with Hard Shape Constraints*

Nicolas Verzelen (INRAE Montpellier)

*Optimal Change-Point Detection and Localization*

Nikita Zhivotovsky (Google Research)

*Robust k-means clustering for distributions with two bounded moments*

**SPONSORS**