Secure, safe and fair machine learning for healthcare
Coordinating partners: Jamal Atif and Aurélien Bellet
Coordinating institution: Paris Sciences et Lettres University (PSL)
Keywords: machine learning, massive health data, trust, cybersecurity, federated learning, confidentiality, robustness, fairness
The healthcare sector (public and private) generates a vast amount of data from various sources, including electronic health records, advanced imaging techniques, high-throughput sequencing, wearable devices, and population health data. Massive datasets, or “big data,” analyzed with sophisticated machine learning algorithms, have the potential to inform the development of more effective and personalized treatments, interventions, and policies, and to improve healthcare delivery and outcomes.
However, the sensitive nature of personal health data, cybersecurity risks, biases in the data, and the lack of robustness of machine learning algorithms all currently limit the widespread use and exploitation of this data. These limitations reduce the benefits that individuals and society can obtain from massive health data analysis.
Health data usage is governed by a complex and extensive set of ethical and legal requirements. Ensuring the security of data, regardless of its nature and how it is transmitted, processed, and transformed, is essential. At the same time, the methods used to analyze and utilize this data must also be secure, fair, and robust. This is particularly important given the growing number of cyber-attacks on the healthcare sector, which are driven mostly by the personal, economic, and innovation value of medical data and its processing.
The goal of this project is to overcome the challenges that prevent the effective use of personalized health data. To achieve this, we will develop new machine learning algorithms that are designed to handle the unique characteristics of multi-scale and heterogeneous individual health data, while providing formal privacy guarantees, robustness against adversarial attacks and changes in data dynamics, and fairness for under-represented populations. By addressing these barriers, we hope to unlock the full potential of personalized health data for a wide range of applications.
More precisely, the project will address the following scientific challenges:
- privacy-preserving learning: new results in differential privacy and homomorphic encryption;
- federated vs. centralized learning: new methods and accuracy/privacy trade-offs;
- robustness: data bias, non-stationarity, model drift, data shift, domain adaptation, new attacks, new defenses;
- machine unlearning, in support of the right to be forgotten.
The project brings together a consortium of established researchers with expertise in machine learning, statistics, privacy, robustness, and biomedical applications, and a clear commitment to unlocking the full potential of machine learning for healthcare applications. Its innovative character lies in its ability to mobilize such a unique community of researchers. Moreover, the project is positioned between two PEPRs (Cybersecurity and Digital Health), which makes it well placed to spread knowledge and practices between research communities that have had little opportunity to interact so far.
Laboratory or department, team | Supervisors |
LAMSADE – UMR 7243 | CNRS, Paris-Dauphine-PSL University |
LaTIM – U 1101 – Cyber Health team | Inserm, Bretagne Occidentale University, IMT Atlantique, CHU Brest (partner) |
Inria Sophia Antipolis, PreMeDICaL team | Inria, University of Montpellier |
CEA LIST – DIN and DSCIN departments | CEA, Paris-Saclay University |
DI-ENS – UMR 8548 | CNRS, ENS, PSL, Inria |
Inria Sophia Antipolis, EPIONE team | Inria, Côte d’Azur University |
CITI Lab – Inria Privatics team | INSA, Inria Grenoble, Inria Lyon, Lyon University |