As an interdisciplinary field, data science integrates statistics, computer science, and domain knowledge to extract valuable insights and patterns from large datasets. Its core methodology encompasses data collection, cleaning, transformation, analysis, and visualization. By applying advanced techniques such as machine learning and deep learning, data science builds predictive models that support decision-making. The rigor of the discipline is reflected in its adherence to data ethics, its careful evaluation of algorithmic models, and its emphasis on the interpretability and reproducibility of results, ensuring that data-driven conclusions are both reliable and fair.

The undergraduate research seminar, supervised by Professor Zhao Xiaofei, focuses on the mathematical foundations of data science and the frontiers of operator learning. It aims to provide undergraduates with a platform to deepen their understanding of core theories and advanced techniques in data science, fostering their research interests and capacity for innovation.
In the first seminar, Wen Zhenye, a master’s student in Computational Mathematics (Class of 2023), delivered a presentation titled “Core Framework and Methodology of Statistical Learning.” Wen systematically explained the fundamental concepts of statistical learning, gave a detailed account of its two main categories, supervised and unsupervised learning, and explored in depth the relationships among prediction functions, loss functions, overfitting, and the trade-off between model complexity and estimation accuracy. He also introduced the Bayesian learning framework, regularization methods, and model selection criteria (AIC, BIC), and elaborated on advanced topics such as multivariate normal models, Monte Carlo methods, and the comparison and selection of Bayesian models. The report covered risk minimization and model evaluation strategies, helping participants understand both the theoretical framework and practical applications of statistical learning.
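To make the role of AIC and BIC concrete, the following is a minimal illustrative sketch of information-criterion-based model selection; it is not code from the seminar, and the toy data, polynomial model family, and parameter counts are assumptions chosen purely for illustration. Both criteria penalize the Gaussian log-likelihood by the number of parameters, with BIC penalizing complexity more heavily as the sample size grows.

```python
import numpy as np

# Toy data: a noisy cubic signal (illustrative, not from the seminar).
rng = np.random.default_rng(0)
x = np.linspace(-2, 2, 60)
y = x**3 - x + rng.normal(scale=0.5, size=x.size)

def aic_bic(y, y_hat, k):
    """AIC and BIC under a Gaussian error model with k free parameters."""
    n = y.size
    rss = np.sum((y - y_hat) ** 2)
    log_lik = -0.5 * n * (np.log(2 * np.pi * rss / n) + 1)  # Gaussian MLE
    return 2 * k - 2 * log_lik, k * np.log(n) - 2 * log_lik

# Compare polynomial fits of increasing complexity: the criteria reward fit
# but penalize extra coefficients, balancing accuracy against overfitting.
for degree in range(1, 7):
    coeffs = np.polyfit(x, y, degree)
    y_hat = np.polyval(coeffs, x)
    aic, bic = aic_bic(y, y_hat, k=degree + 1)
    print(f"degree={degree}  AIC={aic:7.2f}  BIC={bic:7.2f}")
```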
The second seminar was presented by Chang Zhipeng, a doctoral student in Computational Mathematics (Class of 2022), under the theme “Fundamental Principles and Methodological Framework of Operator Learning.” Chang began by outlining the research background of operator learning, emphasizing its critical role in handling high-dimensional, complex physical systems. He explained in depth how operator learning treats both inputs and outputs as functions or fields, distinguishing it from traditional pointwise (vector-to-vector) learning. He then introduced representative models such as DeepONet and the Fourier Neural Operator (FNO), discussing approaches to data acquisition, mechanisms for encoding input functions, and how generalization ability is accounted for in loss function design. Furthermore, he demonstrated applications of operator learning to multiphysics coupling problems and the data-driven solution of PDEs, highlighting its advantages in both speed and accuracy.
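As a rough sense of how an operator, rather than a point-to-point map, can be parameterized, the sketch below shows the branch/trunk structure used in DeepONet-style models: a branch network encodes the input function from its values at fixed sensor locations, a trunk network encodes the query coordinate, and the output field is their inner product. The class name, layer widths, and random data are illustrative assumptions, not material presented in the seminar.

```python
import torch
import torch.nn as nn

class TinyDeepONet(nn.Module):
    """Minimal DeepONet-style operator network (illustrative sketch)."""
    def __init__(self, n_sensors=32, width=64, p=32):
        super().__init__()
        # Branch net: encodes the input function sampled at fixed sensors.
        self.branch = nn.Sequential(nn.Linear(n_sensors, width), nn.Tanh(),
                                    nn.Linear(width, p))
        # Trunk net: encodes the coordinate at which the output is queried.
        self.trunk = nn.Sequential(nn.Linear(1, width), nn.Tanh(),
                                   nn.Linear(width, p))

    def forward(self, u_sensors, y):
        # u_sensors: (batch, n_sensors) values of the input function
        # y:         (batch, n_query, 1) query coordinates
        b = self.branch(u_sensors)             # (batch, p)
        t = self.trunk(y)                      # (batch, n_query, p)
        return torch.einsum("bp,bqp->bq", b, t)  # output field at each query

# Forward pass on random data, just to show the function-to-function shapes.
model = TinyDeepONet()
u = torch.randn(8, 32)       # 8 input functions sampled at 32 sensors
y = torch.rand(8, 100, 1)    # 100 query points per input function
print(model(u, y).shape)     # torch.Size([8, 100])
```

In practice such a model is trained on pairs of input functions and output fields (for example, PDE coefficients and the corresponding solutions), so that a single trained network can evaluate the solution operator far faster than a conventional solver.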

Across the two seminars, the presentations were substantial and logically coherent, reflecting a strong theoretical foundation and deep insight into research frontiers. After each presentation, Professor Zhao Xiaofei provided comments and supplementary discussion, engaging with the speakers and participants on topics such as practical applications, model interpretability, and scalability.
This seminar series not only helped undergraduates systematically study the core concepts, methodologies, and cutting-edge techniques of data science and operator learning but also ignited their enthusiasm for research in related fields. Through interaction and exchange with senior graduate students and the supervising professor, participants further clarified their research directions, strengthened their research capabilities, and laid a solid foundation for future exploration in data science.