Please use this identifier to cite or link to this item: https://scholars.lib.ntu.edu.tw/handle/123456789/112363
DC Field | Value | Language
dc.contributor | 陳素雲 | zh-TW
dc.contributor | 臺灣大學:流行病學研究所 | zh-TW
dc.contributor.author | 陳佩君 | zh-TW
dc.contributor.author | Chen, Pei-Chun | en
dc.creator | 陳佩君 | zh-TW
dc.creator | Chen, Pei-Chun | en
dc.date | 2008 | en
dc.date.accessioned | 2010-05-05T10:51:41Z | -
dc.date.accessioned | 2018-06-29T17:52:45Z | -
dc.date.available | 2010-05-05T10:51:41Z | -
dc.date.available | 2018-06-29T17:52:45Z | -
dc.date.issued | 2008 | -
dc.identifier.other | U0001-2701200814473600 | en
dc.identifier.uri | http://ntur.lib.ntu.edu.tw//handle/246246/180663 | -
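dc.description.abstract | This thesis consists of two parts. The first part focuses on using coding to find a low-dimensional linear discriminant feature subspace and examines the equivalence between different codes. Coding converts class labels into multiresponses; these multiresponses are regressed on kernelized data, and the regression coefficients are then used to obtain the low-dimensional linear discriminant subspace. This subspace can be combined with any linear classification method, which keeps the computation simple and fast. This part also proves that the multiresponses generated by any code span the same low-dimensional linear discriminant subspace, so any linear classification method gives the same classification result whichever code is used. Results on real data show that, compared with LIBSVM, the proposed classification method achieves similar accuracy but requires less classification time. In the second part, the thesis proposes a gene selection method based on support vector regression. Current gene selection methods based on microarray data treat every microarray chip identically. However, the chips may come from patients in different disease states, so their association with the disease is not necessarily the same. Chips should therefore be given different weights that represent their association with the disease, and these weights can be estimated by support vector regression. The value obtained by summing the weighted expressions can be used to decide which genes are significant. We analyze leukemia and colon cancer data and compare the accuracy with that of other gene selection methods. The results show that the proposed gene selection method can identify significant genes. | zh-TW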
dc.description.abstract | This thesis contains two major themes. One is multiclass support vector machines and the other is support vector regression for gene selection. In the first part, we propose a regression approach for multiclass support vector classification. We introduce some existing coding schemes into support vector classification by coding the class labels into multivariate responses. Regression of these multivariate responses on kernelized input data is used to extract a low-dimensional feature subspace for discriminant purposes. We unify these coding schemes by showing that they are equivalent in the sense of leading to the same low-dimensional discriminant feature subspace. Classification is then carried out in this low-dimensional subspace using a linear discriminant algorithm, which can be any reasonable choice. The regression approach for extracting the low-dimensional discriminant subspace, combined with a user-specified linear algorithm, forms a simple yet powerful toolkit for multiclass support vector classification. Issues of encoding, decoding and the notions of equivalence of codes are discussed. Experimental results, including prediction ability and CPU time, show that our approach is a competent alternative for the multiclass support vector machine problem. In the second part, we propose a support vector regression approach for gene selection and use the selected genes for disease classification. Current gene selection methods based on microarray data have treated each individual subject with equal weight to the disease of interest. However, tissues collected from different patients can be from different disease stages and may have different strengths of association with the disease. To reflect this circumstance, our proposed method takes subject variation into account by assigning different weights to subjects. The weights are calculated via support vector regression. Then significant genes are selected based on the cumulative sum of weighted expressions. The proposed gene selection procedure is illustrated and evaluated using the acute leukemia and colon cancer data. The results and performance are compared with four other approaches in terms of classification accuracies. | en
dc.description.tableofcontents |
1 Introduction
2 Preliminaries: Support Vector Machines
  2.1 Linearly separable case
  2.2 Linearly non-separable case
  2.3 Nonlinear extension by kernel trick
  2.4 Smooth support vector machine
  2.5 Extension to multiclass classification problem
    2.5.1 One-against-rest and one-against-one
    2.5.2 Single machine approach
  2.6 Support vector regression
  2.7 Software
    2.7.1 LIBSVM
    2.7.2 SSVM toolbox
3 Classification by Coding and Multiresponse Regression
  3.1 Regression framework: linear and kernel generalization
  3.2 Regularized least-squares support vector regression
  3.3 Decoding and classification rules
  3.4 Encoding and equivalence class of codes
    3.4.1 Coding and scoring schemes
    3.4.2 Equivalence class of codes
4 Application to Benchmark Data Sets
  4.1 Benchmark data sets
  4.2 Comparisons of coding schemes
  4.3 Conclusions
5 Gene Selection with Support Vector Regression
  5.1 Introduction
  5.2 Current methods
    5.2.1 SVM-based recursive feature elimination
    5.2.2 Incremental forward feature selection
    5.2.3 Bayesian variable selection
    5.2.4 Bayesian model average
  5.3 Proposed gene selection method
  5.4 Empirical data analysis
  5.5 Discussions
6 Discussions and Future Directions
References
| en
dc.format | application/pdf | en
dc.format.extent | 494866 bytes | -
dc.format.mimetype | application/pdf | -
dc.language | en | en
dc.language.iso | en_US | -
dc.subject | 編碼 | zh-TW
dc.subject | 基因選取 | zh-TW
dc.subject | 核化 | zh-TW
dc.subject | 線性分類子空間 | zh-TW
dc.subject | 微陣列資料支撐向量機制 | zh-TW
dc.subject | 支撐向量迴歸 | zh-TW
dc.subject | coding | en
dc.subject | gene selection | en
dc.subject | kernel | en
dc.subject | linear discriminant subspace | en
dc.subject | machine learning | en
dc.subject | microarray data analysis | en
dc.subject | support vector machine | en
dc.subject | support vector regression | en
dc.title | 支撐向量機制:以編碼處理分類問題並利用迴歸模式進行基因選取 | zh-TW
dc.title | Support Vector Machines: Classification with Coding and Regression for Gene Selection | en
dc.type | thesis | en
dc.identifier.uri.fulltext | http://ntur.lib.ntu.edu.tw/bitstream/246246/180663/1/ntu-97-D93842005-1.pdf | -
item.grantfulltext | open | -
item.openairetype | thesis | -
item.fulltext | with fulltext | -
item.openairecristype | http://purl.org/coar/resource_type/c_46ec | -
item.languageiso639-1 | en_US | -
item.cerifentitytype | Publications | -
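
The classification approach summarized in the abstract (code the class labels as multivariate responses, regress them on kernelized data, then classify in the resulting low-dimensional space) can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the thesis: indicator (one-hot) coding, a Gaussian kernel with `gamma=0.5`, a ridge penalty of `0.1`, nearest-centroid decoding, the toy data, and the function names `fit`, `project` and `predict`.

```python
import math

def rbf(a, b, gamma):
    """Gaussian (RBF) kernel between two points."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]

def project(model, x):
    """Map x into the low-dimensional space: k(x)^T C."""
    k = [rbf(x, xi, model["gamma"]) for xi in model["X"]]
    n_resp = len(model["classes"])
    return [sum(ki * ci[j] for ki, ci in zip(k, model["C"])) for j in range(n_resp)]

def fit(X, labels, gamma=0.5, ridge=0.1):
    """Code labels as indicator responses and regress them on the kernel matrix."""
    classes = sorted(set(labels))
    # Assumed coding scheme: one indicator column per class.
    Y = [[1.0 if lab == c else 0.0 for c in classes] for lab in labels]
    n = len(X)
    # Regularized least squares: (K + ridge * I) C = Y.
    A = [[rbf(X[i], X[j], gamma) + (ridge if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    model = {"X": X, "C": solve(A, Y), "gamma": gamma, "classes": classes}
    # Class centroids of the projected training points, used for decoding.
    scores = [project(model, x) for x in X]
    model["centroids"] = []
    for c in classes:
        members = [s for s, lab in zip(scores, labels) if lab == c]
        model["centroids"].append([sum(col) / len(members) for col in zip(*members)])
    return model

def predict(model, x):
    """Decode with a nearest-centroid rule in the projected space."""
    z = project(model, x)
    dists = [sum((a - b) ** 2 for a, b in zip(z, c)) for c in model["centroids"]]
    return model["classes"][dists.index(min(dists))]

# Toy data: three well-separated clusters in the plane (illustrative only).
X = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
     (3.0, 3.0), (3.1, 2.8), (2.9, 3.2),
     (0.0, 3.0), (0.2, 3.1), (-0.1, 2.9)]
labels = ["a"] * 3 + ["b"] * 3 + ["c"] * 3
model = fit(X, labels)
predictions = [predict(model, x) for x in X]
```

The nearest-centroid rule stands in for the "user-specified linear algorithm" mentioned in the abstract; any other linear classifier could be applied to the output of `project` instead.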
Appears in Collections:流行病學與預防醫學研究所
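
The gene selection step described in the second part of the abstract (weight each subject, then score genes by the sum of weighted expressions) might look roughly like the sketch below. The record does not say how the support vector regression weights are derived or signed, so the `weights` list is a hypothetical stand-in supplied by hand, and ranking genes by the absolute value of the score is likewise an assumption. The function names and the toy expression matrix are illustrative only.

```python
def gene_scores(expr, weights):
    """Sum the weighted expressions over subjects, one score per gene."""
    n_genes = len(expr[0])
    return [sum(w * row[g] for w, row in zip(weights, expr)) for g in range(n_genes)]

def select_genes(expr, weights, names, top_k):
    """Return the top_k genes ranked by absolute score (assumed ranking rule)."""
    scores = gene_scores(expr, weights)
    order = sorted(range(len(names)), key=lambda g: abs(scores[g]), reverse=True)
    return [(names[g], scores[g]) for g in order[:top_k]]

# Rows are subjects, columns are genes (toy values, not real microarray data).
expr = [
    [5.0, 1.0, 2.0],
    [4.0, 1.2, 2.1],
    [1.0, 1.1, 2.0],
    [0.5, 0.9, 4.0],
]
# Hypothetical subject weights standing in for SVR-derived ones:
# positive for one disease group, negative for the other.
weights = [0.9, 0.4, -0.5, -0.8]
names = ["G0", "G1", "G2"]

selected = select_genes(expr, weights, names, top_k=2)
```

A gene whose expression differs strongly between heavily weighted subjects of the two groups receives a score far from zero, which is why the sketch ranks by magnitude.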
Files in This Item:
File | Description | Size | Format
ntu-97-D93842005-1.pdf | | 23.32 kB | Adobe PDF

Page view(s): 3 (checked on Aug 20, 2020)
Download(s): 1 (checked on Aug 20, 2020)



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
