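Fangyuan Xu (许方园), Associate Professor

Master's supervisor

Affiliation: School of Automation

Education: Doctorate

Contact: Email: datuan12345@hotmail.com

Employment status: Currently employed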

   

A robust correlation analysis framework for imbalanced and dichotomous data with uncertainty (SCI Zone 2; 2021 impact factor: 5.910)


Impact factor: 5.91

DOI: 10.1016/j.ins.2018.08.017

Journal: INFORMATION SCIENCES

Keywords: Pearson product-moment correlation; Imbalanced data; Clearness index; Dichotomous variable

Abstract: Correlation analysis is one of the fundamental mathematical tools for identifying dependence between classes. However, the accuracy of the analysis could be jeopardized due to variance error in the data set. This paper provides a mathematical analysis of the impact of imbalanced data concerning Pearson Product Moment Correlation (PPMC) analysis. To alleviate this issue, the novel framework Robust Correlation Analysis Framework (RCAF) is proposed to improve the correlation analysis accuracy. A review of the issues due to imbalanced data and data uncertainty in machine learning is given. The proposed framework is tested with in-depth analysis of real-life solar irradiance and weather condition data from Johannesburg, South Africa. Additionally, comparisons of correlation analysis with prominent sampling techniques, i.e., Synthetic Minority Over-Sampling Technique (SMOTE) and Adaptive Synthetic (ADASYN) sampling techniques are conducted. Finally, K-Means and Ward's agglomerative hierarchical clustering are performed to study the correlation results. Compared to the traditional PPMC, RCAF can reduce the standard deviation of the correlation coefficient under imbalanced data in the range of 32.5%-93.02%. © 2018 Elsevier Inc. All rights reserved.
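As a rough illustration of why class imbalance matters for PPMC with a dichotomous variable, the sketch below computes the Pearson coefficient between a binary label and a continuous value on two synthetic samples that differ only in class proportions. It is not the paper's RCAF and does not reproduce its analysis; the helper names `pearson` and `make_sample`, the synthetic value pattern, and the group sizes are all assumptions chosen for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def make_sample(n_majority, n_minority):
    """Hypothetical data: class 1 values are class 0 values shifted by +1.

    Sizes should be multiples of 3 so both classes have the same spread.
    """
    pattern = [0.0, 1.0, 2.0]
    labels = [0] * n_majority + [1] * n_minority
    values = [pattern[i % 3] for i in range(n_majority)]
    values += [pattern[i % 3] + 1.0 for i in range(n_minority)]
    return labels, values


# Same class means and within-class spread; only the class ratio changes.
r_balanced = pearson(*make_sample(30, 30))    # 50% / 50%
r_imbalanced = pearson(*make_sample(57, 3))   # 95% / 5%

print(f"balanced:   r = {r_balanced:.4f}")
print(f"imbalanced: r = {r_imbalanced:.4f}")
```

With the mean difference and within-class spread held fixed, the coefficient shrinks as the minority class becomes rarer, because the binary variable's variance p(1 - p) falls. This is one simple way in which imbalance alone can distort a PPMC estimate.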

Remarks: SCI Zone 2; 2021 impact factor: 5.910

Co-authors: Yingshan Tao, Youwei Jia, Haoliang Yuan, Chao Huang, Zhao Xu, Giorgio Locatelli

First author: Chun Sing Lai

Paper type: Journal article

Corresponding authors: Fangyuan Xu, Wing W. Y. Ng, Loi Lei Lai

Document type: J

Volume: 470

Pages: 58-77

ISSN: 0020-0255


Publication date: 2019-01-01

Indexed in: SCI

Journal link: https://www.sciencedirect.com/science/article/pii/S0020025518306224?via%3Dihub
