3 Jul 2025
HKU CDS Distinguished Lecture Series: Navigate the Crossroad of Statistics, Generative AI and Genomic Health
The Distinguished Lecture Series, hosted by the School of Computing and Data Science (CDS), brings leading scholars from around the world to share their cutting-edge research and insights in the fields of computer science, data science, artificial intelligence, and statistics.

We are pleased to announce that an upcoming lecture in the series, titled “Navigate the Crossroad of Statistics, Generative AI and Genomic Health”, will be presented by Professor Xihong Lin, Department of Biostatistics and Department of Statistics, Harvard University.
Speaker:
Professor Xihong Lin, Department of Biostatistics and Department of Statistics, Harvard University
Date:
9 July 2025 (Wednesday)
Time:
10:00 am – 11:00 am
Venue:
KK202, 2/F, K.K. Leung Building, Main Campus, The University of Hong Kong
Abstract:
Scalable and robust statistical methods empowered by generative AI offer unprecedented potential for trustworthy science, as they quantify uncertainty, enhance interpretability, and accelerate scientific discovery. In this talk, I will discuss the challenges and opportunities as we navigate the crossroad of statistics, generative AI, and genomic health science. I will discuss robust and powerful statistical analysis that leverages synthetic data generated by generative AI models, such as diffusion models and transformers, while ensuring valid statistical inference when the generative AI models are misspecified. I will illustrate key points using the analysis of large-scale biobanks, whole genome sequencing data, and electronic health records, and demonstrate the power of scientific discovery by integrating statistics and generative AI using synthetic data. I will also discuss how to conduct scalable and interpretable analysis of large-scale whole genome sequencing (WGS) data, and illustrate the WGS analysis ecosystem using the 200,000 TOPMed WGS samples, the UK Biobank data of 500,000 subjects on the cloud platform RAP, and the All of Us data of 400,000 subjects on the NIH cloud platform AnVIL.
Biography:
Xihong Lin is Professor and former Chair of Biostatistics, and Coordinating Director of the Program in Quantitative Genomics at Harvard School of Public Health, and Professor and Chair of Statistics at Harvard University. Dr. Lin works on the development and application of statistical and machine learning methods for the analysis of massive and complex genomic and health data. Dr. Lin is an elected member of the US National Academy of Sciences and the US National Academy of Medicine. She received the Presidents’ Award from the Committee of Presidents of Statistical Societies (COPSS), the Mortimer Spiegelman Award from the American Public Health Association, and the Outstanding Investigator Award from the National Cancer Institute. She is an elected fellow of the American Statistical Association, the Institute of Mathematical Statistics, and the International Statistical Institute.
All are welcome to attend.