Artificial neural networks (ANNs) and machine learning (ML) have received enormous attention among physicists, triggered by the 2024 Nobel Prize in Physics awarded to John Hopfield and Geoffrey Hinton. Historically, ANN-based ML was developed from ideas in neuroscience and statistical physics. However, the recent successes of deep learning neural networks (DLNNs) are driven mainly by large amounts of data and exponentially growing computing power. As a result, despite its successes in many disciplines, DLNN remains largely a black box: it is unclear how it learns and whether what it learns is generalizable. Over the past several years, we have been developing a theoretical framework based on statistical physics and stochastic dynamical systems theory to study DLNNs. In this talk, we will first give a broad introduction to ANN-based ML, emphasizing the role of physics in its development, and then describe some of our recent progress in understanding DLNNs in two related areas: learning dynamics and generalization.
Yuhai Tu graduated from the School of the Gifted Young at the University of Science and Technology of China in 1987. He came to the US under the CUSPEA program and received his PhD in Theoretical Physics from UCSD in 1991. After three years as a Division
Prize Fellow at Caltech, he joined the IBM Watson Research Center in 1994 and served as head of its theory group from 2002 to 2015. In 2025, he joined the Flatiron Institute as a senior research scientist.