DC Field | Value | Language |
dc.contributor | Advisor: 陳宏 | - |
dc.contributor | National Taiwan University: Graduate Institute of Mathematics | zh_TW |
dc.contributor.author | 牛柏堯 | zh_TW |
dc.contributor.author | Niu, Po-Yao | en |
dc.creator | 牛柏堯 | zh_TW |
dc.creator | Niu, Po-Yao | en |
dc.date | 2014 | - |
dc.date.accessioned | 2014-11-30T06:21:17Z | - |
dc.date.accessioned | 2018-06-28T09:19:03Z | - |
dc.date.available | 2014-11-30T06:21:17Z | - |
dc.date.available | 2018-06-28T09:19:03Z | - |
dc.date.issued | 2014 | - |
dc.identifier.uri | http://ntur.lib.ntu.edu.tw//handle/246246/264033 | - |
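dc.description.abstract | With advances in technology, data collection has entered a new stage, and large, complex data sets pose greater challenges for analysts. When facing big data, dimension reduction becomes a critical step in statistical inference.
Principal component analysis (PCA) is probably the best-known dimension reduction method for vector data. It projects high-dimensional data onto a lower-dimensional space in which the features are mutually uncorrelated. In practice, however, PCA becomes unstable and less efficient when the sample size is small and the feature space is large. Multilinear principal component analysis (MPCA) is often used to reduce the dimension of matrix-valued data, or more generally of data with tensor structure. It treats the data space as a Kronecker product of several sets of vectors so that a small number of parameters is used effectively, but its reduced features are not guaranteed to be uncorrelated.
In this thesis, we propose a two-stage dimension reduction method for tensor-structured data, called structure principal component analysis (SPCA), which aims to combine the respective advantages of PCA and MPCA. In the first stage, SPCA uses MPCA to reduce the dimension of the original data; in the second stage, it vectorizes the scores of the core tensors and applies PCA again. We compare the asymptotic efficiency of SPCA and PCA and prove that, under certain conditions, SPCA has better asymptotic efficiency. We also use simulated and real data to examine how SPCA, MPCA, and PCA perform in practice on tensor-structured data, and the results show that SPCA is indeed a promising method. | zh_TW |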
dc.description.abstract | Advances in technology have created a new era for data collection in which the size and complexity of data have become very challenging for data analysts. Dimension reduction is a key step in statistical inference when facing huge data sets.
Principal component analysis (PCA) may be the most popular dimension reduction method for vector data. PCA projects the data onto a lower-dimensional space in which the features become uncorrelated, but in practice it can be inefficient when the sample size is small and the feature dimension is large. Multilinear principal component analysis (MPCA) has been proposed to reduce the dimension of tensor-structured data, including matrix data. MPCA models the space as Kronecker products of vectors to use the parameters more efficiently, but its scores may be correlated.
In this thesis, we propose a two-stage dimension reduction method, called structure PCA (SPCA), that aims to combine the advantages of PCA and MPCA. SPCA applies MPCA to the original data in the first step and then applies PCA to the vectorized core scores in the second step. We compare the statistical efficiency of PCA and SPCA and prove that SPCA has better asymptotic efficiency under some conditions. The performance of PCA, MPCA, and SPCA is checked on both simulated and real data, and SPCA is shown to be a promising method for huge tensor-structured data. | en |
dc.description.tableofcontents | Oral Defense Committee Certification
Acknowledgements
Chinese Abstract
Abstract
Contents
List of Figures
List of Tables
1 Introduction
2 Method
2.1 Original PCA
2.2 Structured PCA
2.3 Estimating A, B
3 Asymptotic
3.1 SPCA
3.2 Asymptotic efficiency comparison
3.2.1 r-spca-mpca vs. r-spca-hosvd
3.2.2 r-pca vs. r-spca
4 Simulation
4.1 Eigen-space Capturing, Structured
4.2 Eigen-space Capturing, No Structure
4.3 Reconstruction
5 Real Data Implementation
5.1 Ribosome
References
Appendices
A Proofs and Derivations
A.1 The Derivation of Derivatives of r w.r.t. vec(Sy)
A.2 The Form of The Differential of P multiplies r
A.3 The explicit form of SigmaN
A.4 The asymptotic variance
A.4.1 Another Way to Apply Delta Method
A.4.2 Asymptotic Variance of r-pca
A.4.3 Asymptotic Variance of r-spca
A.4.4 Asymptotic Variance of r-spca-mpca
A.4.5 Uncorrelated Assumption on U
B Simulation Settings
B.1 Simulation Settings and Results in Section 4.1
B.2 Simulation Settings and Results in Section 4.2
B.3 Simulation Settings and Results in Section 4.3 | zh_TW |
dc.format.extent | 2107790 bytes | - |
dc.format.mimetype | application/pdf | - |
dc.language | en_US | - |
dc.rights | Thesis release date: 2014/08/01 | - |
dc.rights | Thesis usage rights: paid licensing authorized (royalties returned to the author) | - |
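dc.subject | principal component analysis | zh_TW |
dc.subject | multilinear principal component analysis | zh_TW |
dc.subject | structure principal component analysis | zh_TW |
dc.subject | tensor | zh_TW |
dc.subject | asymptotic | zh_TW |
dc.subject | efficiency | zh_TW |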
dc.title | 張量結構數據的降維 | zh_TW |
dc.title | Dimension Reduction for Tensor Structure Data | en |
dc.type | thesis | en |
dc.identifier.uri.fulltext | http://ntur.lib.ntu.edu.tw/bitstream/246246/264033/1/ntu-103-R99221002-1.pdf | - |
item.openairecristype | http://purl.org/coar/resource_type/c_46ec | - |
item.openairetype | thesis | - |
item.grantfulltext | open | - |
item.cerifentitytype | Publications | - |
item.fulltext | with fulltext | - |
Appears in Collections: | Department of Mathematics
|
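
The two-stage procedure described in the abstract (MPCA on the original data, then PCA on the vectorized core scores) can be illustrated for matrix-valued data with a minimal NumPy sketch. This is not the thesis's implementation: the function names `top_eigvecs`, `mpca_matrix`, and `spca`, the alternating eigen-decomposition used here to estimate the projection matrices A and B, the fixed iteration count, and the target dimensions `p0`, `q0`, and `r` are all assumptions made for illustration.

```python
import numpy as np

def top_eigvecs(S, k):
    """Return the k leading eigenvectors of a symmetric matrix S as columns."""
    # np.linalg.eigh returns eigenvalues in ascending order, so reverse the columns.
    _, V = np.linalg.eigh(S)
    return V[:, ::-1][:, :k]

def mpca_matrix(Xc, p0, q0, n_iter=10):
    """Stage 1 (assumed variant): alternating estimation of A (p x p0) and B (q x q0)
    for centered matrix data Xc of shape (n, p, q)."""
    n, p, q = Xc.shape
    B = np.eye(q)[:, :q0]  # simple starting value (an assumption)
    for _ in range(n_iter):
        XB = Xc @ B                                   # (n, p, q0)
        A = top_eigvecs(np.einsum('nij,nkj->ik', XB, XB), p0)
        AX = np.einsum('pi,npq->niq', A, Xc)          # A^T X_i, shape (n, p0, q)
        B = top_eigvecs(np.einsum('nij,nik->jk', AX, AX), q0)
    return A, B

def spca(X, p0, q0, r):
    """Two-stage sketch: MPCA, then PCA on the vectorized core scores."""
    Xc = X - X.mean(axis=0)
    A, B = mpca_matrix(Xc, p0, q0)
    U = np.einsum('pi,npq,qj->nij', A, Xc, B)         # core scores A^T X_i B
    Z = U.reshape(len(X), -1)                         # vectorize each core matrix
    G = top_eigvecs(Z.T @ Z / len(X), r)              # Stage 2: ordinary PCA
    return A, B, G, Z @ G

# Usage on synthetic data (sizes chosen arbitrarily for the example).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6, 5))
A, B, G, scores = spca(X, p0=3, q0=2, r=4)
```

The second stage is what makes the final scores uncorrelated in the sample: `G` diagonalizes the sample covariance of the vectorized core scores, which the MPCA stage alone does not guarantee.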