DOI: 10.1016/j.atmosenv.2015.02.042
Scopus record ID: 2-s2.0-84923186000
Title: A clustering algorithm for sample data based on environmental pollution characteristics
Authors: Chen M; Wang P; Chen Q; Wu J; Chen X
Journal: Atmospheric Environment
ISSN: 0168-2563
EISSN: 1573-515X
Publication year: 2015
Volume: 107
Start page: 194
End page: 203
Language: English
English keywords: Clustering algorithm; Environmental pollution; High-dimensional sample data; Pollution characteristics
Scopus keywords: Algorithms; Classification (of information); Iterative methods; Pollution; Clustering process; Environmental pollutions; Pollution models; Pollution sources; Sample data; Similarity functions; Source apportionment; Synthetic datasets; Clustering algorithms; accuracy assessment; algorithm; cluster analysis; concentration (composition); data set; environmental assessment; model validation; outlier; pollutant source; pollution monitoring; Article; classification algorithm; concentration (parameters); epc algorithm; information processing; k nearest neighbor; measurement accuracy; pollution; priority journal; sample size; sampling; validity
Scopus subject classification: Environmental Science: Water Science and Technology; Earth and Planetary Sciences: Earth-Surface Processes; Environmental Science: Environmental Chemistry
Abstract: Environmental pollution has become an issue of serious international concern in recent years. Among the receptor-oriented pollution models, CMB, PMF, UNMIX, and PCA are widely used as source apportionment models. To improve the accuracy of source apportionment and classify the sample data for these models, this study proposes an easy-to-use, high-dimensional EPC algorithm that not only organizes all of the sample data into different groups according to the similarities in pollution characteristics such as pollution sources and concentrations but also simultaneously detects outliers. The main clustering process consists of selecting the first unlabelled point as the cluster centre, then assigning each data point in the sample dataset to its most similar cluster centre according to both the user-defined threshold and the value of similarity function in each iteration, and finally modifying the clusters using a method similar to k-Means. The validity and accuracy of the algorithm are tested using both real and synthetic datasets, which makes the EPC algorithm practical and effective for appropriately classifying sample data for source apportionment models and helpful for better understanding and interpreting the sources of pollution. © 2015 Elsevier Ltd.
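The clustering process outlined in the abstract can be illustrated with a minimal Python sketch. This is not the authors' EPC implementation: the record does not give the similarity function, the threshold semantics, or the outlier rule, so everything specific here is an assumption. The function name `epc_like_cluster`, the similarity `1 / (1 + Euclidean distance)`, and the `min_size` outlier rule (clusters smaller than `min_size` are treated as outliers) are all hypothetical choices made only to show the seed, assign, and refine steps.

```python
import numpy as np

def epc_like_cluster(X, threshold, min_size=2, n_refine=10):
    """Threshold-based clustering sketch (hypothetical, not the published EPC code)."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    labels = np.full(n, -1)          # -1 means "unlabelled" (later: outlier)
    centres = []

    # Step 1: the first unlabelled point becomes a new cluster centre, and
    # every still-unlabelled point that is similar enough joins that cluster.
    for i in range(n):
        if labels[i] != -1:
            continue
        centres.append(X[i])
        k = len(centres) - 1
        dist = np.linalg.norm(X - X[i], axis=1)
        sim = 1.0 / (1.0 + dist)     # assumed similarity function
        labels[(labels == -1) & (sim >= threshold)] = k

    centres = np.array(centres)

    # Step 2: k-Means-like refinement. Each point goes to its most similar
    # centre if the similarity reaches the threshold; centres become means.
    for _ in range(n_refine):
        dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        sims = 1.0 / (1.0 + dists)
        new_labels = np.where(sims.max(axis=1) >= threshold,
                              sims.argmax(axis=1), -1)
        for k in range(len(centres)):
            members = X[new_labels == k]
            if len(members) > 0:
                centres[k] = members.mean(axis=0)
        converged = np.array_equal(new_labels, labels)
        labels = new_labels
        if converged:
            break

    # Step 3 (assumed outlier rule): clusters with too few members are
    # relabelled as outliers; their centres are kept in the returned array.
    for k in range(len(centres)):
        if (labels == k).sum() < min_size:
            labels[labels == k] = -1
    return labels, centres

# Two tight groups and one isolated sample (made-up illustrative data).
samples = [[0, 0], [0.1, 0], [0, 0.1], [5, 5], [5.1, 5], [5, 5.1], [20, 20]]
labels, centres = epc_like_cluster(samples, threshold=0.5)
```

With this assumed similarity, a threshold of 0.5 corresponds to a Euclidean distance of at most 1 from a centre, so the threshold directly controls how tight a cluster must be; the number of clusters is not fixed in advance, unlike plain k-Means.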
Resource type: Journal article
Identifier: http://119.78.100.158/handle/2HF3EXSE/81876
Appears in Collections: Climate Change Facts and Impacts
Author affiliations: School of Information Science and Engineering, Lanzhou University, Lanzhou, China; School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou, China; College of Atmospheric Sciences, Lanzhou University, Lanzhou, China
Recommended Citation:
Chen M, Wang P, Chen Q, et al. A clustering algorithm for sample data based on environmental pollution characteristics[J]. Atmospheric Environment, 2015, 107: 194-203.