A Nucleotide-Position-Based Data Format for Fast Variant Calling and Its Hardware Analyzer Design
Journal
BioCAS 2022 - IEEE Biomedical Circuits and Systems Conference: Intelligent Biomedical Systems for a Better Future, Proceedings
ISBN
9781665469173
Date Issued
2022-01-01
Author(s)
Abstract
In this paper, we propose a file format, vBAM, to improve the performance of variant calling. The vBAM format removes data irrelevant to variant calling and compresses base/quality information by position to reduce the number of bits stored. As a result, variant calling on vBAM files takes less time, and the files are smaller than conventional BAM/pileup files. Our C++ software supports BAM-to-vBAM conversion, vBAM decoding, and variant calling. We also implement a hardware accelerator to shorten the computing time of the decoding and calling stages. The hardware achieves at least a 7.2× speed-up over its software counterpart.
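The idea of storing base/quality information by position can be illustrated with a minimal C++ sketch. This is not the vBAM layout, which the abstract does not specify. The `Read` and `Column` structures, the 2-bit base codes, and the function names are all hypothetical, and the sketch assumes gapless alignments containing only A/C/G/T with Phred+33 qualities.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical alignment record holding only what a variant caller needs.
struct Read {
    std::uint32_t start;  // 0-based reference position of the first base
    std::string bases;    // A/C/G/T only; gapless alignment is assumed
    std::string quals;    // Phred+33 characters, same length as bases
};

// Hypothetical per-position record (not the actual vBAM layout).
struct Column {
    std::vector<std::uint8_t> packedBases;  // four 2-bit base codes per byte
    std::vector<std::uint8_t> quals;        // raw Phred scores
    std::uint32_t depth = 0;                // number of bases stored
};

// Map a base to an assumed 2-bit code.
inline std::uint8_t encodeBase(char b) {
    switch (b) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
    }
    assert(false && "unsupported base");
    return 0;
}

// Recover the i-th base stored in a column.
inline char decodeBase(const Column& c, std::uint32_t i) {
    assert(i < c.depth);
    const std::uint8_t code = (c.packedBases[i / 4] >> (2 * (i % 4))) & 0x3;
    return "ACGT"[code];
}

// Regroup read-ordered data into one record per reference position.
std::map<std::uint32_t, Column> groupByPosition(const std::vector<Read>& reads) {
    std::map<std::uint32_t, Column> cols;
    for (const Read& r : reads) {
        assert(r.bases.size() == r.quals.size());
        for (std::size_t i = 0; i < r.bases.size(); ++i) {
            Column& c = cols[r.start + static_cast<std::uint32_t>(i)];
            if (c.depth % 4 == 0) c.packedBases.push_back(0);  // start a new byte
            c.packedBases.back() |= static_cast<std::uint8_t>(
                encodeBase(r.bases[i]) << (2 * (c.depth % 4)));
            c.quals.push_back(static_cast<std::uint8_t>(r.quals[i] - 33));
            ++c.depth;
        }
    }
    return cols;
}
```

Calling `groupByPosition` on a set of reads yields one `Column` per covered position, so a caller can examine every observation at a site without scanning whole reads. Read names, flags, and other per-read fields are simply not carried over in this sketch.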
Subjects
data compression | DNA sequence | hardware accelerator | next-generation sequencing | variant calling
Type
conference paper
