
Comparison of Next Generation Sequencing Simulators

Other Title
次世代定序資料模擬軟體的比較
Journal
Crop, Environment & Bioinformatics (作物、環境與生物資訊)
Journal Volume
10
Journal Issue
1
Pages
16-33
Date Issued
2013
Author(s)
林書弘(Shu-Hung Lin)
胡凱康(Kae-Kang Hwu)  
劉力瑜(Li-Yu Liu)  
URI
http://www.airitilibrary.com/Publication/Index/18117406-201303-201304230004-201304230004-16-33
http://scholars.lib.ntu.edu.tw/handle/123456789/378078
Abstract
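High-throughput data from next-generation sequencing technologies have in recent years been widely used in basic research on genes and transcriptomes, and have accelerated crop genetic domestication and marker-assisted breeding. Because these platforms produce enormous volumes of sequence data, appropriately simulating next-generation sequencing results makes it possible, before any actual sequence analysis, to estimate the required coverage depth, whether sequence assembly is necessary, and the workflow for subsequent molecular marker development, so that a more accurate and efficient development strategy can be drawn up and later validation costs reduced. Many simulators for next-generation sequencing data are now available. This study used the genomic DNA sequence of Escherichia coli and the DNA sequence of rice chromosome 5 as source data, and evaluated five simulators (ART, FlowSim, MetaSim, SimSeq, and wgsim) on simulated Roche/454 and Illumina data through sequence assembly and sequence alignment. MetaSim and wgsim required the shortest computing time for simulating Roche/454 and Illumina data, respectively, and were the most efficient. For smaller genomes such as E. coli, all simulators performed similarly to real data in sequence assembly and alignment; ART's simulation of the longer Roche/454 reads was closer to the real data, and SimSeq's simulated Illumina data were closest to the real data in maximum N50 length and coverage rate. For larger genomes such as rice, most simulators gave assembly results more optimistic than the real ones; among them, the N50, contig length, and total assembled length (k-mer = 37-43) obtained from ART-simulated data were closer to the assembly of the real data. Evaluating by simulation time, sequence assembly, and sequence alignment results allows an objective comparison of simulators for the Roche/454 and Illumina platforms. The results can serve as a reference for crop genome resequencing, genome sequencing of orphan crops, and transcriptome studies, strengthening research and development capacity under limited resources.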
High-throughput next generation sequencing technologies (NGST) have been widely adopted in genomic and transcriptomic research. NGST has also accelerated crop domestication and marker-assisted selection in plant breeding. When the budget is limited, it is often necessary to use simulations to estimate the minimal coverage that yields a sufficient amount of sequencing data and to develop efficient strategies for discovering molecular markers. Five simulators, ART, FlowSim, MetaSim, SimSeq and wgsim, have been proposed to mimic the data generated by the Roche/454 and Illumina NGST platforms. Using E. coli whole genomes and rice (Oryza sativa) chromosome 5 as references, we simulated sequencing results with each of the five simulators. The simulators were compared on running time, genome assembly results, and sequence alignment results. MetaSim and wgsim had the shortest running times when simulating Roche/454 and Illumina data, respectively. For E. coli, all simulators yielded genome assembly and sequence alignment results similar to those from the real sequencing data. Among them, ART and SimSeq performed best in simulating Roche/454 and Illumina data, respectively. When simulating rice sequencing data, most simulators yielded more mappable reads and higher coverage rates than the real data. ART was the most comparable to the real data. In conclusion, this study proposes ways to evaluate simulation results for Roche/454 and Illumina sequencing data, which can inform research on genome resequencing, de novo sequencing, and transcriptomics under a limited budget.
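The abstract evaluates simulators by coverage and by assembly statistics such as N50. As a minimal illustrative sketch (not the authors' pipeline), the Python below shows how these two quantities are commonly computed. The function names and all numeric inputs are hypothetical, and the coverage formula is the standard approximation: coverage = number of reads × read length / genome size.

```python
import math


def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    make up at least half of the total assembled length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return 0  # empty assembly


def mean_coverage(num_reads, read_length, genome_size):
    """Average sequencing depth: total sequenced bases / genome size."""
    return num_reads * read_length / genome_size


def reads_needed(genome_size, read_length, target_coverage):
    """Smallest number of reads that reaches the target mean coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


if __name__ == "__main__":
    # Hypothetical contig lengths, for illustration only.
    print(n50([100, 200, 300, 400, 500]))
    # Illustrative inputs: a 4.6 Mb genome, 100 bp reads, 30x target depth.
    print(reads_needed(4_600_000, 100, 30))
```

A planning step like `reads_needed` is the kind of estimate that simulation is meant to refine, since real assemblies depend on error profiles and repeats that this simple formula ignores.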
Subjects
Next-generation sequencing (次世代定序)
Data simulation software (資料模擬軟體)
Next generation sequence technology
Sequence simulators
Type
journal article
