Title: Mathematical Theory of Deep Convolutional Neural Networks
Time: Thursday, November 28, 2019, 16:00-17:00
Venue: Room 101, Teaching Building No. 7
Speaker: Prof. Ding-Xuan Zhou (City University of Hong Kong)
Speaker Biography:
Ding-Xuan Zhou is a Chair Professor in the School of Data Science and the Department of Mathematics at City University of Hong Kong, where he also serves as Associate Dean of the School of Data Science and Director of the Liu Bie Ju Centre for Mathematical Sciences. He received his BSc and PhD degrees in mathematics from Zhejiang University, Hangzhou, China, in 1988 and 1991, respectively. His research interests include deep learning, learning theory, data science, wavelet analysis, and approximation theory. He has published over 100 journal papers and serves on the editorial boards of more than ten international journals, including Applied and Computational Harmonic Analysis, Journal of Approximation Theory, Journal of Complexity, Econometrics and Statistics, Communications on Pure and Applied Analysis, and Frontiers in Mathematics of Computation and Data Science. He is Editor-in-Chief of the journals "Analysis and Applications" and "Mathematical Foundations of Computing", and of the book series "Progress in Data Science". He was named a Highly Cited Researcher by Thomson Reuters/Clarivate Analytics from 2014 to 2017.
Abstract:
Deep learning has been widely applied and has brought breakthroughs in speech recognition, computer vision, and many other domains. The deep neural network architectures involved and the related computational issues have been well studied in machine learning. But a theoretical foundation is still lacking for understanding the modelling, approximation, or generalization ability of deep learning models built on network architectures such as deep convolutional neural networks (CNNs). The convolutional architecture makes deep CNNs essentially different from fully-connected deep neural networks, and the classical theory developed for fully-connected networks around 30 years ago does not apply. This talk describes a mathematical theory of deep CNNs associated with the rectified linear unit (ReLU) activation function. In particular, we give the first proof of the universality of deep CNNs, meaning that a deep CNN can approximate any continuous function to arbitrary accuracy when the depth of the network is large enough. We also give explicit rates of approximation and show that the approximation ability of deep CNNs is at least as good as that of fully-connected multi-layer neural networks. Our quantitative estimate, stated tightly in terms of the number of free parameters to be computed, verifies the efficiency of deep CNNs in dealing with high-dimensional data.
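For readers less familiar with the terminology, the universality property mentioned in the abstract can be written informally as follows (illustrative notation only; the precise statement, the rates of approximation, and the CNN construction are given in the talk):
\[
  \sigma(u) = \max\{u, 0\} \quad (\text{ReLU activation}),
\]
\[
  \text{for every } f \in C(\Omega),\ \Omega \subset \mathbb{R}^d \text{ compact, and every } \varepsilon > 0,\
  \text{there exist a depth } J \text{ and a deep ReLU CNN } f_J \text{ with } \ \|f - f_J\|_{C(\Omega)} \le \varepsilon.
\]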