Title: Ensemble Projection Pursuit for General Nonparametric Regression
Speaker: Professor Yingcun Xia (夏应存), National University of Singapore
Time: July 14, 2023, 11:00 a.m. to 12:00 noon
Venue: College Conference Room 80602, js金沙3983总站
Abstract: Projection pursuit regression (PPR) has played an important role in the development of statistics and machine learning. However, compared with established methods such as random forests (RF) and support vector machines (SVM), PPR has yet to show a similar level of accuracy as a statistical learning technique. In this paper, we revisit the estimation of PPR and propose an optimal greedy algorithm and an ensemble approach via "feature bagging", hereafter referred to as ePPR, aiming to improve its efficacy. Compared with RF, ePPR has two main advantages. First, its theoretical consistency can be proved for more general regression functions, as long as they are L2-integrable, and higher consistency rates can be achieved. Second, ePPR does not split the samples, so each term of PPR is estimated using the whole data set, which makes the minimization more efficient and guarantees the smoothness of the estimator. Extensive comparisons based on real data sets show that ePPR is more efficient in regression and classification than RF and other competitors. The efficacy of ePPR, as a variant of artificial neural networks (ANN), demonstrates that with suitable statistical tuning, ANN can equal or even exceed RF on small to medium-sized data sets. This finding challenges the widespread belief that ANN's superiority over RF is limited to processing big data.
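As a rough illustration of the two ideas named in the abstract (greedy term-by-term fitting on the whole sample, and an ensemble built by "feature bagging"), here is a minimal Python sketch. It is not the speaker's algorithm: the direction search (a normalised least-squares slope), the cubic polynomial ridge functions, and all function names and parameters (`fit_ppr`, `fit_feature_bagged_ppr`, `n_terms`, `n_features`, and so on) are assumptions chosen only to keep the example short.

```python
import numpy as np

def fit_ppr(X, y, n_terms=3, degree=3):
    """Greedy fit of y ~ mean + sum_k g_k(w_k . x), using every sample for each term."""
    intercept = y.mean()
    resid = y - intercept
    terms = []
    for _ in range(n_terms):
        # Assumption: the direction is the normalised least-squares slope on the residual.
        w, *_ = np.linalg.lstsq(X, resid, rcond=None)
        norm = np.linalg.norm(w)
        if norm < 1e-12:
            break
        w = w / norm
        z = X @ w
        # Assumption: each ridge function g_k is a cubic polynomial of the projection.
        coefs = np.polyfit(z, resid, degree)
        resid = resid - np.polyval(coefs, z)
        terms.append((w, coefs))
    return intercept, terms

def predict_ppr(model, X):
    intercept, terms = model
    out = np.full(X.shape[0], intercept)
    for w, coefs in terms:
        out += np.polyval(coefs, X @ w)
    return out

def fit_feature_bagged_ppr(X, y, n_members=20, n_features=4, seed=0):
    """Fit one PPR model per random feature subset; rows are never subsampled."""
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_members):
        cols = rng.choice(X.shape[1], size=n_features, replace=False)
        members.append((cols, fit_ppr(X[:, cols], y)))
    return members

def predict_feature_bagged_ppr(members, X):
    # The ensemble prediction is the plain average over members.
    return np.mean([predict_ppr(m, X[:, cols]) for cols, m in members], axis=0)

# Synthetic single-index data, used only to exercise the sketch.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5))
y = np.exp(0.3 * (X[:, 0] + 2 * X[:, 1])) + 0.1 * rng.normal(size=500)

members = fit_feature_bagged_ppr(X, y)
pred = predict_feature_bagged_ppr(members, X)
mse = np.mean((y - pred) ** 2)
print(f"in-sample MSE: {mse:.4f}  (variance of y: {np.var(y):.4f})")
```

Each member sees all 500 rows but only a random subset of the columns, which is the sense in which the samples are not split; how the talk's method actually chooses directions and ridge functions is not specified in the abstract.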
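About the speaker: Yingcun Xia is a professor in the Department of Statistics and Data Science at the National University of Singapore. Research interests include nonparametric regression, high-dimensional data analysis, and statistical modeling of disease transmission. Professor Xia's work has been published in journals including AOS, JASA, JRSSB, Biometrika, JOE, and PNAS. Nature News and several other academic media outlets have run features on the cross-regional disease transmission model Professor Xia proposed, and JRSSB, Statistical Science, and Statistica Sinica have published open discussions of Professor Xia's papers. Professor Xia worked at Jinan University for many years and received the "Outstanding Teacher" title awarded by the Overseas Chinese Affairs Office of the State Council.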