Prof. Yi Ma Featured on Machine Learning Street Talk


14 DEC 2025


Prof. Yi Ma, Director of the HKU School of Computing and Data Science (HKU-CDS), was recently invited to the acclaimed podcast Machine Learning Street Talk (MLST) for an in-depth conversation. In the episode, Prof. Ma explored a big question—the nature of intelligence—while also unpacking the central ideas of his new book, Learning Deep Representations of Data Distributions.



The book tackles a foundational challenge behind modern AI: although real-world observations are often high-dimensional, the predictable part of the world typically lives in low-dimensional structures. From this perspective, intelligence—at least the kind shared by animals and humans—is essentially about building a memory / world model: extracting what is predictable from the environment and using it for prediction and decision-making. In the interview, Prof. Ma summarized this goal with two key principles: parsimony (“as simple as possible, but not simpler”) and self-consistency—the idea that memory must be coherent enough to reconstruct or simulate the world in order to support reliable prediction.


With that goal in mind, the book proposes a unifying lens: many classical and modern approaches—ranging from PCA and dictionary learning to CNNs and Transformers—are not fundamentally disconnected. Under different constraints, they are all working toward the same objective: learning low-dimensional structure as compactly as possible and turning it into useful representations (which can be viewed as “knowledge” or “memory”). The book uses the language of compression to place dimensionality reduction, denoising, and representation learning within one coherent framework, aiming to bridge analytical models and today’s data-driven architectures.
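To make the compression view concrete, here is a minimal sketch (my own illustration with made-up toy data, not an example from the book) of the pattern described above: observations that are nominally high-dimensional but actually lie near a low-dimensional subspace can be compressed to a few coordinates via PCA and reconstructed almost perfectly, which is self-consistency in miniature.

```python
import numpy as np

# Toy data (assumed for illustration): 1000 points in 50 dimensions
# that actually lie near a 3-dimensional subspace, plus small noise.
rng = np.random.default_rng(0)
basis = rng.standard_normal((50, 3))       # true low-dimensional directions
codes = rng.standard_normal((1000, 3))     # low-dimensional "causes"
X = codes @ basis.T + 0.01 * rng.standard_normal((1000, 50))

# PCA as compression: keep only the top-k principal directions.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 3
Z = Xc @ Vt[:k].T                          # compressed 3-d representation ("memory")
X_hat = Z @ Vt[:k] + X.mean(axis=0)        # reconstruction from the compressed code

# Almost nothing is lost, because the predictable structure was 3-dimensional.
print(np.linalg.norm(X - X_hat) / np.linalg.norm(X))  # relative error ~0.01
```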


More importantly, the book argues for a reversal in research mindset (or, in a sense, a return to the spirit of the 1940s): ideas often discussed vaguely as inductive bias are reformulated as testable first principles, and the book explores how one might derive network-layer operators from those principles. Prof. Ma gave an example in the interview: when a task must respect symmetries such as translation, convolution is not an arbitrary engineering choice; it can emerge naturally from the principle of “compression while respecting symmetry.” Along similar lines, his team has been exploring principle-driven ways to understand (and potentially derive) why architectures like Transformers, ResNets, and mixture-of-experts (MoE) have “survived” and continued to scale.
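A toy check of the symmetry argument (again my own sketch, not a derivation from the book): in 1-D, applying one shared filter at every position gives a circulant matrix, i.e. a circular convolution, and such an operator commutes with translation by construction. The sketch verifies that easy direction numerically; the converse, that every translation-equivariant linear map must be a convolution, is the classical result behind the claim that convolution “emerges” from the symmetry.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
h = rng.standard_normal(n)                         # an arbitrary filter
C = np.stack([np.roll(h, i) for i in range(n)])    # circulant matrix = circular convolution

x = rng.standard_normal(n)
shift = lambda v: np.roll(v, 1)                    # translation operator

# Equivariance: convolving a shifted input equals shifting the convolved output.
print(np.allclose(C @ shift(x), shift(C @ x)))     # True

# A generic linear map has no such structure and breaks the symmetry.
A = rng.standard_normal((n, n))
print(np.allclose(A @ shift(x), shift(A @ x)))     # False (generically)
```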


In Prof. Ma’s view, what truly generalizes in intelligence should not be mistaken for simply “accumulating enough knowledge.” The core is a mechanism that can continuously correct memory: a closed loop that compares prediction with observation, uses error signals to self-correct, and supports continual learning—even lifelong learning. Crucially, the existence of low-dimensional structure in the external world is what makes this kind of closed-loop learning possible.
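As a cartoon of that closed loop (an assumed illustration, far simpler than anything in the book): keep a one-number “memory” that serves as the prediction, and let the prediction error continuously correct it while the environment drifts.

```python
import numpy as np

# Cartoon closed loop (assumed illustration, not the book's model):
# the memory m *is* the prediction; the prediction error feeds back
# and corrects m, so learning continues as the world changes.
rng = np.random.default_rng(2)
m = 0.0     # internal memory / world model (here just one number)
lr = 0.1    # how strongly each error corrects the memory

for t in range(500):
    world = np.sin(0.01 * t)                      # slowly changing environment
    obs = world + 0.05 * rng.standard_normal()    # noisy observation
    error = obs - m         # compare prediction with observation
    m += lr * error         # use the error signal to self-correct

print(f"final tracking error: {abs(m - world):.3f}")  # small: memory follows the world
```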


This book’s publication model is also worth highlighting. Instead of a one-time, fixed print release, it is published in an open format and updated continuously online. The source and version history are maintained on GitHub, so readers can not only access it freely but also join the discussion and co-create through the community—allowing the book to evolve through open collaboration.


Overall, the MLST conversation offers a clear glimpse of what Prof. Ma’s book is aiming for: an explanatory framework for deep learning that feels closer to theoretical physics than to “alchemy.” By using parsimony and self-consistency to connect representation, memory, prediction, and learning into a closed loop, and by grounding architecture design in derivable principles, the book seeks to provide a stronger academic foundation for the next generation of AI systems: more interpretable, more verifiable, and capable of sustained evolution.


To learn more, you can watch the full interview and explore the book via the links below!

Machine Learning Street Talk: https://www.youtube.com/watch?v=QWidx8cYVRs

Learning Deep Representations of Data Distributions: https://ma-lab-berkeley.github.io/deep-representation-learning-book/
