A framework for detecting noncoding rare-variant associations of large-scale whole-genome sequencing studies
Journal
Nature Methods
Journal Volume
19
Journal Issue
12
Date Issued
2022-12
Author(s)
Li, Zilin
Li, Xihao
Zhou, Hufeng
Gaynor, Sheila M
Selvaraj, Margaret Sunitha
Arapoglou, Theodore
Quick, Corbin
Liu, Yaowu
Chen, Han
Sun, Ryan
Dey, Rounak
Arnett, Donna K
Auer, Paul L
Bielak, Lawrence F
Bis, Joshua C
Blackwell, Thomas W
Blangero, John
Boerwinkle, Eric
Bowden, Donald W
Brody, Jennifer A
Cade, Brian E
Conomos, Matthew P
Correa, Adolfo
Cupples, L Adrienne
Curran, Joanne E
de Vries, Paul S
Duggirala, Ravindranath
Franceschini, Nora
Freedman, Barry I
Göring, Harald H H
Guo, Xiuqing
Kalyani, Rita R
Kooperberg, Charles
Kral, Brian G
Lange, Leslie A
Lin, Bridget M
Manichaikul, Ani
Manning, Alisa K
Martin, Lisa W
Mathias, Rasika A
Meigs, James B
Mitchell, Braxton D
Montasser, May E
Morrison, Alanna C
Naseri, Take
O'Connell, Jeffrey R
Palmer, Nicholette D
Peyser, Patricia A
Psaty, Bruce M
Raffield, Laura M
Redline, Susan
Reiner, Alexander P
Reupena, Muagututi'a Sefuiva
Rice, Kenneth M
Rich, Stephen S
Smith, Jennifer A
Taylor, Kent D
Taub, Margaret A
Vasan, Ramachandran S
Weeks, Daniel E
Wilson, James G
Yanek, Lisa R
Zhao, Wei
Rotter, Jerome I
Willer, Cristen J
Natarajan, Pradeep
Peloso, Gina M
Lin, Xihong
Abstract
Large-scale whole-genome sequencing studies have enabled analysis of noncoding rare-variant (RV) associations with complex human diseases and traits. Variant-set analysis is a powerful approach to study RV association. However, existing methods have limited ability in analyzing the noncoding genome. We propose a computationally efficient and robust noncoding RV association detection framework, STAARpipeline, to automatically annotate a whole-genome sequencing study and perform flexible noncoding RV association analysis, including gene-centric analysis and fixed window-based and dynamic window-based non-gene-centric analysis by incorporating variant functional annotations. In gene-centric analysis, STAARpipeline uses STAAR to group noncoding variants based on functional categories of genes and incorporate multiple functional annotations. In non-gene-centric analysis, STAARpipeline uses SCANG-STAAR to incorporate dynamic window sizes and multiple functional annotations. We apply STAARpipeline to identify noncoding RV sets associated with four lipid traits in 21,015 discovery samples from the Trans-Omics for Precision Medicine (TOPMed) program and replicate several of them in an additional 9,123 TOPMed samples. We also analyze five non-lipid TOPMed traits.
Type
journal article
