Scientific Committee & Organizing Committee
Olga Klopp (ESSEC and CREST)
Christophe Pouet (École Centrale de Marseille)
Alexander Rakhlin (MIT)
The first conference of the cycle is dedicated to the potential of the statistical approach in accompanying the development of our data-driven society. Our goal is to encourage collaboration and knowledge sharing between theoretical computer scientists and mathematical statisticians by bringing them together in Luminy. Both communities possess unique visions, skills and expertise. We hope that merging these strengths will advance the field and bring us closer to solving some of the key challenges such as (i) robustness, (ii) fairness and (iii) privacy of decision-making algorithms.
(i) Robustness: As machine learning algorithms are increasingly used to support human decisions, it is important to make them robust. It is not sufficient to achieve reasonable performance on a hold-out dataset; predictive power should also be retained when circumstances are subject to reasonable changes. Statistical reasoning and tools (for example, can we have “good enough” performance 99% of the time; can we be confident in our predictions, and how confident?) are important in addressing this challenge. While mathematical statisticians have mainly focused on theoretically optimal robust algorithms, the theoretical computer science community has contributed to our understanding of the computational tractability of robust procedures, although many of the results obtained rely significantly on specific model assumptions. Our meeting intends to give these two communities an excellent opportunity to exchange ideas about recent advances on the topic of robustness and to learn from each other.
(ii) Fairness: Decision makers may be subject to many forms of prejudice and bias. Can we hope that machines would be able to make more equitable decisions? Unfortunately, there are concerns that some of these data-driven systems treat members of different classes differently, perpetuating biases, providing different degrees of utility, and inducing disparities. While an “algorithm” may be automatic, following prescribed rules and applying an identical recipe to everyone, there is a growing understanding that biased data collection yields biased results: when more data is available from a particular social group, algorithms are likely to do better for this group, which can in turn lead to a vicious cycle of minority group abandonment, yielding ever more bias. The statistical community has only just begun to take up these challenges. Researchers in machine learning, for their part, have begun to formulate properties that algorithms should satisfy to guarantee “equitable treatment.” While fairness is context-dependent and may mean different things to different people, a sequence of recent works has given rise to a useful vocabulary for discussing fairness in automated systems.
(iii) Privacy: Numerous high-profile failures of privacy highlight the challenges of large-scale data analyses. Nowadays, massive amounts of data, such as medical records, smartphone user behavior or social media activity, are routinely collected and stored. Running counter to this trend is a growing reluctance among individuals to share this sometimes sensitive information with companies or state officials. Over the last few decades, the problem of constructing privacy-preserving data release mechanisms has produced a vast literature, predominantly in computer science. Our meeting intends to provide a platform for exchanges between the computer science and statistics communities on the topic of privacy protection.
to be announced